Difference between revisions of "At-risk/Dropout/Stopout/Graduation Prediction"

From Penn Center for Learning Analytics Wiki
(added Sha et al 2022)
(5 intermediate revisions by 2 users not shown)
Line 41: Line 41:
Kai et al. (2017) [https://files.eric.ed.gov/fulltext/ED596601.pdf pdf]
* Models predicting student retention in an online college program
* J48 decision trees achieved much higher Kappa and AUC for students whose parents did not attend college than those whose parents did
* JRip decision rules achieved much higher Kappa and AUC for students whose parents did not attend college than those whose parents did


Line 47: Line 47:
Yu et al. (2021) [https://dl.acm.org/doi/pdf/10.1145/3430895.3460139 pdf]
* Models predicting college dropout for students in residential and fully online programs
* Whether the protected attributed were included or not, the models had worse true negative rates and recall for underrepresented minority (URM) students and for male students in residential and online programs.
* The model showed better recall for students who are under-represented minority (URM; not White or Asian), male, first-generation, or with greater financial needs
* The model was less accurate for URM students studying in residential program.
* Whether the socio-demographic information was included or not, the model showed worse accuracy and true negative rates for residential students who are under-represented minority (URM; not White or Asian), male, first-generation, or with greater financial needs
* The model was worse for male students studying in online program in terms of true negative rates, recall and accuracy.
* Both accuracy and true negative rates were better for students who are first-generation, or with greater financial needs


* Models for first-generation residential students showed worse accuracy and true negative rate (i.e., predicting power of sophomore year persistence on college persistence)
Verdugo et al. (2022) [https://dl.acm.org/doi/abs/10.1145/3506860.3506902 pdf]
* Models for first-generation residential students showed significantly better recall (i.e., proportion of correctly identified dropouts) than online peers, whether the attribute were made aware or not
* An algorithm predicting dropout from university after the first year
* Model performed with significantly lower accuracy and true negative rate for residential students with greater financial need than online counterparts, whether the attribute were made aware or not
* Several algorithms achieved better AUC and F1 for students who attended public high schools than for students who attended private high schools.
* Models for first-generation residential students showed significantly better recall (i.e., proportion of correctly identified dropouts) than online peers, whether their status were included or not
* Several algorithms predicted better AUC for male students than female students; F1 scores were more balanced.
 
Sha et al. (2022) [https://ieeexplore.ieee.org/abstract/document/9849852]
* Predicting dropout in XuetangX platform using neural network
* A range of over-sampling methods tested
* Regardless of over-sampling method used, dropout performance was slightly better for males.

Revision as of 16:25, 31 August 2022

Kai et al. (2017) pdf

  • Models predicting student retention in an online college program
  • J48 decision trees achieved much lower Kappa and AUC for Black students than White students
  • J48 decision trees achieved significantly lower Kappa but higher AUC for male students than female students
  • JRip decision rules achieved almost identical Kappa and AUC for Black students and White students
  • JRip decision rules achieved much lower Kappa and AUC for male students than female students
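
The per-group Kappa comparison reported above can be reproduced in outline with a few lines of Python. This is a minimal sketch, not the procedure used by Kai et al.: the function names `cohen_kappa` and `kappa_by_group` and the toy data are hypothetical, and the code only illustrates how Cohen's Kappa is computed separately for each demographic group.

```python
from collections import defaultdict

def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: agreement between labels and predictions, corrected for chance."""
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    expected = sum((y_true.count(l) / n) * (y_pred.count(l) / n) for l in labels)
    if expected == 1.0:
        return None  # Kappa is undefined when only one label ever occurs
    return (observed - expected) / (1 - expected)

def kappa_by_group(y_true, y_pred, groups):
    """Split the data by group membership and compute Kappa within each group."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: cohen_kappa(ts, ps) for g, (ts, ps) in buckets.items()}

# Toy data (hypothetical): 1 = retained, 0 = not retained
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
result = kappa_by_group(y_true, y_pred, groups)
```

A large gap between the per-group values is the kind of disparity the entries on this page describe.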


Hu and Rangwala (2020) pdf

  • Models predicting if a college student will fail in a course
  • The multiple cooperative classifier model (MCCM) was the best at reducing bias, or discrimination, against African-American students, while other models (particularly Logistic Regression and Rawlsian Fairness) performed far worse
  • The level of bias was inconsistent across courses, with MCCM prediction showing the least bias for Psychology and the greatest bias for Computer Science
  • The multiple cooperative classifier model (MCCM) was the best at reducing bias, or discrimination, against male students, performing particularly well for the Psychology course.
  • Other models (Logistic Regression and Rawlsian Fairness) performed far worse for male students, particularly in Computer Science and Electrical Engineering.


Anderson et al. (2019) pdf

  • Models predicting six-year college graduation
  • False negative rates were greater for Latino students when Decision Tree and Random Forest were used
  • White students had higher false positive rates across all models (Decision Tree, SVM, Logistic Regression, Random Forest, and SGD)
  • False negative rates were greater for male students than female students when SVM, Logistic Regression, and SGD were used


Christie et al. (2019) pdf

  • Models predicting students' high school dropout
  • The decision trees showed little difference in AUC among White, Black, Hispanic, Asian, American Indian and Alaska Native, and Native Hawaiian and Pacific Islander students.
  • The decision trees showed very minor differences in AUC between female and male students
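
A comparison of AUC across groups, as summarized above, can be sketched as follows. This is an illustrative example only, not the analysis from Christie et al.: the helper names `auc` and `auc_by_group` and the toy scores are assumptions. AUC is computed here through its pairwise definition, the probability that a randomly chosen positive case is scored above a randomly chosen negative case.

```python
from collections import defaultdict

def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return None  # AUC is undefined without both classes present
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_by_group(y_true, scores, groups):
    """Compute AUC separately within each demographic group."""
    buckets = defaultdict(lambda: ([], []))
    for t, s, g in zip(y_true, scores, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(s)
    return {g: auc(ts, ss) for g, (ts, ss) in buckets.items()}

# Toy data (hypothetical): 1 = dropout, scores are predicted dropout probabilities
y_true = [1, 0, 1, 0, 1, 1]
scores = [0.9, 0.1, 0.4, 0.6, 0.8, 0.7]
groups = ["a", "a", "a", "a", "b", "b"]
result = auc_by_group(y_true, scores, groups)
```

The pairwise form is quadratic in group size, which is fine for illustration; a rank-based implementation would be preferable for large datasets.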


Gardner, Brooks and Baker (2019) [pdf]

  • Model predicting MOOC dropout, specifically through slicing analysis
  • Some algorithms performed worse for female students than male students, particularly in courses with 45% or less male presence


Baker et al. (2020) [pdf]

  • Model predicting student graduation and SAT scores for military-connected students
  • For prediction of graduation, algorithms applied across populations resulted in an AUC of 0.60, degrading from their original performance of 0.70 or 0.71 toward chance.
  • For prediction of SAT scores, algorithms applied across populations resulted in Spearman's ρ of 0.42 and 0.44, degrading a third of the way from their original performance toward chance.


Kai et al. (2017) pdf

  • Models predicting student retention in an online college program
  • J48 decision trees achieved much higher Kappa and AUC for students whose parents did not attend college than those whose parents did
  • JRip decision rules achieved much higher Kappa and AUC for students whose parents did not attend college than those whose parents did


Yu et al. (2021) pdf

  • Models predicting college dropout for students in residential and fully online programs
  • The model showed better recall for students who are under-represented minority (URM; not White or Asian), male, first-generation, or with greater financial needs
  • Whether the socio-demographic information was included or not, the model showed worse accuracy and true negative rates for residential students who are under-represented minority (URM; not White or Asian), male, first-generation, or with greater financial needs
  • Both accuracy and true negative rates were better for students who are first-generation, or with greater financial needs
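
The three metrics compared above (recall, true negative rate, and accuracy) all derive from per-group confusion-matrix counts. The sketch below shows one way to compute them. It is not the evaluation code from Yu et al.: the function name `group_rates` and the toy data are hypothetical, and dropout is assumed to be the positive class (1).

```python
from collections import defaultdict

def group_rates(y_true, y_pred, groups):
    """Return recall, true negative rate, and accuracy for each group (1 = dropout)."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 1:
            key = "tp" if p == 1 else "fn"
        else:
            key = "tn" if p == 0 else "fp"
        counts[g][key] += 1
    rates = {}
    for g, c in counts.items():
        pos = c["tp"] + c["fn"]  # actual dropouts in this group
        neg = c["tn"] + c["fp"]  # actual non-dropouts in this group
        rates[g] = {
            "recall": c["tp"] / pos if pos else None,  # share of dropouts identified
            "tnr": c["tn"] / neg if neg else None,     # share of persisters identified
            "accuracy": (c["tp"] + c["tn"]) / (pos + neg),
        }
    return rates

# Toy data (hypothetical)
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
result = group_rates(y_true, y_pred, groups)
```

Because recall and true negative rate are computed on different subsets of students, a model can show better recall and worse true negative rate for the same group, as the findings above illustrate.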

Verdugo et al. (2022) pdf

  • Models predicting dropout from university after the first year
  • Several algorithms achieved better AUC and F1 for students who attended public high schools than for students who attended private high schools.
  • Several algorithms achieved better AUC for male students than female students; F1 scores were more balanced.

Sha et al. (2022) [https://ieeexplore.ieee.org/abstract/document/9849852]

  • Models predicting dropout on the XuetangX platform using a neural network
  • A range of over-sampling methods was tested
  • Regardless of the over-sampling method used, dropout prediction performance was slightly better for male students.