Difference between revisions of "At-risk/Dropout/Stopout/Graduation Prediction"

From Penn Center for Learning Analytics Wiki
(Rewrote Queiroga)
Line 61:
  * Regardless of over-sampling method used, dropout performance was slightly better for males.

- Queiroga et al. (2022) [https://doi.org/10.3390/info13090401 pdf]
+ Queiroga et al. (2022) [https://www.mdpi.com/2078-2489/13/9/401 pdf]
- * Models predicting secondary school students at risk of failure or dropping out.
- * Models achieved high performances with an AUROC higher than 0.90 and F1-Macro higher than 0.88.
- * Models achieve better results when new data comes from the secondary education period (e.g., model M2G1-UTU achieved a performance of 95%).
- * First-year primary school zones (rural or urban) and sixth year assessment-based grouping are two of the most important attributes of this model.
+ * Models predicting secondary school students at risk of failure or dropping out
+ * Model was unable to make prediction of student success (F1 score = 0.0) for students not in a social welfare program (higher socioeconomic status)
+ * Model had slightly lower AUC ROC (0.52 instead of 0.56) for students not in a social welfare program (higher socioeconomic status)

Revision as of 11:37, 4 June 2023

Kai et al. (2017) pdf

  • Models predicting student retention in an online college program
  • J48 decision trees achieved much lower Kappa and AUC for Black students than White students
  • J48 decision trees achieved significantly lower Kappa but higher AUC for male students than female students
  • JRip decision rules achieved almost identical Kappa and AUC for Black students and White students
  • JRip decision rules achieved much lower Kappa and AUC for male students than female students
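
Comparisons like the ones above rest on computing the same metric separately for each demographic group. The following is a minimal sketch of a per-group Cohen's Kappa, not the code used by Kai et al.; the function names and the toy labels are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def cohens_kappa(y_true, y_pred):
    """Cohen's kappa for binary 0/1 labels: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    # Observed agreement between labels and predictions.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Agreement expected by chance, from the marginal rates of class 1.
    p_true1 = sum(y_true) / n
    p_pred1 = sum(y_pred) / n
    p_e = p_true1 * p_pred1 + (1 - p_true1) * (1 - p_pred1)
    if p_e == 1:
        # Kappa is undefined here; returning 0.0 is an assumption of this sketch.
        return 0.0
    return (p_o - p_e) / (1 - p_e)


def kappa_by_group(y_true, y_pred, groups):
    """Compute kappa separately for each value in `groups`."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: cohens_kappa(ts, ps) for g, (ts, ps) in buckets.items()}


# Toy data (made up): predictions are perfect for group "A" and at chance for "B".
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(kappa_by_group(y_true, y_pred, groups))  # {'A': 1.0, 'B': 0.0}
```

A large gap between groups in such a table is the kind of difference the bullets above report.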

Hu and Rangwala (2020) pdf

  • Models predicting if a college student will fail in a course
  • The multiple cooperative classifier model (MCCM) was the best at reducing bias, or discrimination, against African-American students, while other models (particularly Logistic Regression and Rawlsian Fairness) performed far worse
  • The level of bias was inconsistent across courses, with MCCM prediction showing the least bias for Psychology and the greatest bias for Computer Science
  • MCCM was also the best at reducing bias, or discrimination, against male students, performing particularly well for the Psychology course.
  • Other models (Logistic Regression and Rawlsian Fairness) performed far worse for male students, particularly in Computer Science and Electrical Engineering.


Anderson et al. (2019) pdf

  • Models predicting six-year college graduation
  • False negative rates were greater for Latino students when Decision Tree and Random Forest were used
  • White students had higher false positive rates across all models (Decision Tree, SVM, Logistic Regression, Random Forest, and SGD)
  • False negative rates were greater for male students than female students when SVM, Logistic Regression, and SGD were used
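
False negative and false positive rates per group can be derived from a separate confusion matrix for each group. The sketch below assumes binary 0/1 labels where 1 is the positive class; the function name and the toy data are hypothetical and are not taken from Anderson et al.

```python
def error_rates_by_group(y_true, y_pred, groups):
    """Return {group: {"fnr": ..., "fpr": ...}} for binary 0/1 labels."""
    counts = {}
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts.setdefault(g, {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
        if t == 1 and p == 1:
            c["tp"] += 1
        elif t == 1 and p == 0:
            c["fn"] += 1
        elif t == 0 and p == 1:
            c["fp"] += 1
        else:
            c["tn"] += 1

    rates = {}
    for g, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["tn"] + c["fp"]
        rates[g] = {
            # FNR = FN / (FN + TP); None when the group has no actual positives.
            "fnr": c["fn"] / positives if positives else None,
            # FPR = FP / (FP + TN); None when the group has no actual negatives.
            "fpr": c["fp"] / negatives if negatives else None,
        }
    return rates


# Toy data (made up): group "x" has missed positives, group "y" has false alarms.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["x", "x", "x", "x", "y", "y", "y", "y"]
print(error_rates_by_group(y_true, y_pred, groups))
# {'x': {'fnr': 0.5, 'fpr': 0.0}, 'y': {'fnr': 0.0, 'fpr': 0.5}}
```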


Christie et al. (2019) pdf

  • Models predicting students' high school dropout
  • The decision trees showed little difference in AUC among White, Black, Hispanic, Asian, American Indian and Alaska Native, and Native Hawaiian and Pacific Islander students.
  • The decision trees showed very minor differences in AUC between female and male students


Gardner, Brooks and Baker (2019) pdf

  • Model predicting MOOC dropout, specifically through slicing analysis
  • Some algorithms performed worse for female students than male students, particularly in courses with 45% or less male presence
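
A slicing analysis evaluates a model separately on subsets ("slices") of the data, for example one slice per course or per gender. The sketch below computes a plain rank-based AUC for each slice; this is one simple way to slice and is not necessarily the metric used by Gardner, Brooks and Baker. The function names and toy scores are hypothetical.

```python
from collections import defaultdict


def auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return None  # AUC is undefined without both classes
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_slice(y_true, scores, slices):
    """Compute AUC separately for each value in `slices`."""
    buckets = defaultdict(lambda: ([], []))
    for t, s, k in zip(y_true, scores, slices):
        buckets[k][0].append(t)
        buckets[k][1].append(s)
    return {k: auc(ts, ss) for k, (ts, ss) in buckets.items()}


# Toy data (made up): the model ranks perfectly in course_1 and at chance in course_2.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.1, 0.8, 0.2, 0.6, 0.7, 0.4, 0.3]
slices = ["course_1"] * 4 + ["course_2"] * 4
print(auc_by_slice(y_true, scores, slices))  # {'course_1': 1.0, 'course_2': 0.5}
```

The pairwise loop is quadratic in slice size, which is fine for a sketch but slow for large datasets.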


Baker et al. (2020) pdf

  • Model predicting student graduation and SAT scores for military-connected students
  • For prediction of graduation, algorithms applied across populations resulted in an AUC of 0.60, degrading from their original performance of 70% or 71% toward chance.
  • For prediction of SAT scores, algorithms applied across populations resulted in a Spearman's ρ of 0.42 and 0.44, degrading about a third of the way from their original performance toward chance.


Kai et al. (2017) pdf

  • Models predicting student retention in an online college program
  • J48 decision trees achieved much higher Kappa and AUC for students whose parents did not attend college than those whose parents did
  • JRip decision rules achieved much higher Kappa and AUC for students whose parents did not attend college than those whose parents did


Yu et al. (2021) pdf

  • Models predicting college dropout for students in residential and fully online programs
  • The model showed better recall for students who are under-represented minorities (URM; not White or Asian), male, first-generation, or have greater financial need
  • Whether or not socio-demographic information was included, the model showed worse accuracy and true negative rates for residential students who are under-represented minorities (URM; not White or Asian), male, first-generation, or have greater financial need
  • Both accuracy and true negative rates were better for students who are first-generation or have greater financial need

Verdugo et al. (2022) pdf

  • Algorithms predicting dropout from university after the first year
  • Several algorithms achieved better AUC and F1 for students who attended public high schools than for students who attended private high schools.
  • Several algorithms achieved better AUC for male students than female students; F1 scores were more balanced.

Sha et al. (2022) [1]

  • Models predicting dropout on the XuetangX platform using a neural network
  • A range of over-sampling methods was tested
  • Regardless of the over-sampling method used, dropout prediction performance was slightly better for males.
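
Over-sampling rebalances a training set by adding copies (or synthetic variants) of under-represented examples. As a rough illustration, the sketch below shows plain random over-sampling with replacement; it is not one of the specific methods tested by Sha et al., and the function name, seed parameter, and toy rows are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_label = {}
    for row, y in zip(rows, labels):
        by_label.setdefault(y, []).append(row)

    target = max(len(items) for items in by_label.values())
    out_rows, out_labels = [], []
    for y, items in by_label.items():
        # Draw extra rows with replacement to reach the majority-class size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for row in items + extra:
            out_rows.append(row)
            out_labels.append(y)
    return out_rows, out_labels


# Toy data (made up): three retained students (0) and one dropout (1).
rows = [[0.2], [0.4], [0.5], [0.9]]
labels = [0, 0, 0, 1]
new_rows, new_labels = random_oversample(rows, labels)
print(Counter(new_labels))  # Counter({0: 3, 1: 3})
```

Over-sampling should be applied to training data only, so that evaluation on each group still reflects the original class balance.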

Queiroga et al. (2022) pdf

  • Models predicting secondary school students at risk of failure or dropping out
  • Model was unable to predict student success (F1 score = 0.0) for students not in a social welfare program (higher socioeconomic status)
  • Model had slightly lower AUC ROC (0.52 instead of 0.56) for students not in a social welfare program (higher socioeconomic status)