Predicting Students' Academic Performance Based on Social and Academic Factors Using Data Mining Techniques
Keywords:
data mining, academic performance, classification, logistic regression, class imbalance
Abstract
This study predicts students’ academic performance from social and academic factors using data mining techniques. The dataset contains 6,607 student records with 20 attributes. Three classification algorithms were applied: Random Forest, Logistic Regression, and Support Vector Machine. Logistic Regression achieved the highest accuracy, at 98.18%. Attendance and study hours were the most influential predictors of exam scores. However, class imbalance caused all three models to struggle in predicting the "High" performance category. The findings can support data-driven educational interventions and highlight the need to address imbalanced class distributions in modeling.
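The combination of very high accuracy and poor prediction of the "High" category can be illustrated with a minimal sketch. The label counts below are hypothetical and chosen only for illustration; they are not the study's data, and the class name "Other" and the helper functions `accuracy` and `per_class_recall` are assumptions introduced here. The sketch shows that a classifier which never predicts the minority class can still score well on accuracy, which is why per-class recall is worth reporting alongside it.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Per class: correctly predicted members divided by true members."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical, illustrative counts (NOT the study's data):
# 990 students in a majority class "Other", 10 in the minority class "High".
y_true = ["Other"] * 990 + ["High"] * 10

# A hypothetical model that assigns every student to the majority class.
y_pred = ["Other"] * 1000

acc = accuracy(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)

print(f"accuracy = {acc:.3f}")
for label, value in recall.items():
    print(f"recall[{label}] = {value:.2f}")
```

In this hypothetical setting the accuracy is 0.99 while recall for "High" is 0, so accuracy alone would hide the failure on the minority class.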