Difference between revisions of "Latino/Latina/Latinx/Hispanic Learners in North America"

From Penn Center for Learning Analytics Wiki
(Added Jeong et al (2022))
(22 intermediate revisions by 3 users not shown)
* False negative rates were greater for Latino students when Decision Tree and Random Forest were used
* White students had higher false positive rates across all models: Decision Tree, SVM, Logistic Regression, Random Forest, and SGD
Christie et al. (2019) [https://files.eric.ed.gov/fulltext/ED599217.pdf pdf]
* Models predicting students' high school dropout
* The decision trees showed little difference in AUC among Hispanic, White, Black, Asian, American Indian and Alaska Native, and Native Hawaiian and Pacific Islander students.
Lee and Kizilcec (2020) [https://arxiv.org/pdf/2007.00088.pdf pdf]
* Models predicting college success (defined as earning the median grade or above)
* Random forest algorithms performed significantly worse for underrepresented minority students (URM; Hispanic, American Indian, Black, Hawaiian or Pacific Islander, and Multicultural) than non-URM students (White and Asian)
* The fairness of the model, namely demographic parity and equality of opportunity, as well as its accuracy, improved after correcting the threshold values from 0.5 to group-specific values
Yu et al. (2020) [https://files.eric.ed.gov/fulltext/ED608066.pdf pdf]
* Model predicting undergraduate short-term (course grades) and long-term (average GPA) success
* Hispanic students were inaccurately predicted to perform worse on both short-term and long-term outcomes
* The fairness of models improved when either click data or a combination of click and survey data, rather than institutional data, was included in the model
Yu et al. (2021) [https://dl.acm.org/doi/pdf/10.1145/3430895.3460139 pdf]
* Models predicting college dropout for students in residential and fully online programs
* Whether or not socio-demographic information was included, the model showed worse true negative rates for underrepresented minority students (URM; i.e., not White or Asian), and worse accuracy for URM students studying in person
* The model showed better recall for URM students, whether they were in a residential or an online program
Bridgeman et al. (2009) [https://www.researchgate.net/publication/242203403_Considering_Fairness_and_Validity_in_Evaluating_Automated_Scoring page]
* Automated scoring models for evaluating English essays, or e-rater
* E-rater gave significantly higher scores than human raters to 11th grade essays written by Hispanic and Asian-American students
Jiang & Pardos (2021) [https://dl.acm.org/doi/pdf/10.1145/3461702.3462623 pdf]
* Predicting university course grades using LSTM
* Roughly equal accuracy across racial groups
* Slightly better accuracy (~1%) across racial groups when including race in model
Zhang et al. (in press) [https://www.upenn.edu/learninganalytics/ryanbaker/EDM22_paper_35.pdf pdf]
* Detecting student use of self-regulated learning (SRL) in the mathematical problem-solving process
* For each SRL-related detector, relatively small differences in AUC were observed across racial/ethnic groups.
* No racial/ethnic group consistently had the best-performing detectors
Kung & Yu (2020) [https://dl.acm.org/doi/pdf/10.1145/3386527.3406755 pdf]
* Predicting course grades and later GPA at a public U.S. university
* Poorer independence, separation, and sufficiency for Latinx students than for White students across five different classic machine learning algorithms
Jeong et al. (2022) [https://fated2022.github.io/assets/pdf/FATED-2022_paper_Jeong_Racial_Bias_ML_Algs.pdf pdf]
* Predicting 9th grade math score from academic performance, surveys, and demographic information
* Despite comparable accuracy, the model tends to underpredict Hispanic students' performance
* Several fairness correction methods equalize false positive and false negative rates across groups.

Revision as of 16:05, 4 August 2022

Anderson et al. (2019) pdf

  • Models predicting six-year college graduation
  • False negative rates were greater for Latino students when Decision Tree and Random Forest were used
  • White students had higher false positive rates across all models: Decision Tree, SVM, Logistic Regression, Random Forest, and SGD
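
The per-group false positive and false negative rates mentioned in these entries can be computed from a model's predictions with a few lines of code. The sketch below is only an illustration, not the procedure used by Anderson et al.: the function name `group_error_rates`, the group labels "A" and "B", and the toy labels are all made up for this example.

```python
from collections import defaultdict

def group_error_rates(y_true, y_pred, groups):
    """Return {group: (false_positive_rate, false_negative_rate)}.

    Hypothetical helper for illustration; labels are 1 (positive) or 0.
    """
    counts = defaultdict(lambda: {"fp": 0, "fn": 0, "pos": 0, "neg": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        c = counts[group]
        if truth == 1:
            c["pos"] += 1
            if pred == 0:
                c["fn"] += 1  # actual positive predicted negative
        else:
            c["neg"] += 1
            if pred == 1:
                c["fp"] += 1  # actual negative predicted positive
    rates = {}
    for group, c in counts.items():
        fpr = c["fp"] / c["neg"] if c["neg"] else float("nan")
        fnr = c["fn"] / c["pos"] if c["pos"] else float("nan")
        rates[group] = (fpr, fnr)
    return rates

# Toy, made-up data: four students in each of two hypothetical groups.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = group_error_rates(y_true, y_pred, groups)
for group, (fpr, fnr) in sorted(rates.items()):
    print(f"group {group}: FPR={fpr:.2f}, FNR={fnr:.2f}")
```

In this toy data, group A has the higher false negative rate and group B the higher false positive rate, which is the kind of asymmetry the bullets above describe.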


Christie et al. (2019) pdf

  • Models predicting students' high school dropout
  • The decision trees showed little difference in AUC among Hispanic, White, Black, Asian, American Indian and Alaska Native, and Native Hawaiian and Pacific Islander.


Lee and Kizilcec (2020) pdf

  • Models predicting college success (defined as earning the median grade or above)
  • Random forest algorithms performed significantly worse for underrepresented minority students (URM; Hispanic, American Indian, Black, Hawaiian or Pacific Islander, and Multicultural) than non-URM students (White and Asian)
  • The fairness of the model, namely demographic parity and equality of opportunity, as well as its accuracy, improved after correcting the threshold values from 0.5 to group-specific values
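
Replacing a single 0.5 cutoff with group-specific thresholds can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of Lee and Kizilcec: it targets equality of opportunity only, by picking for each group the highest threshold whose true positive rate reaches a chosen target. The helper names, the target value, and the toy scores are hypothetical.

```python
def true_positive_rate(y_true, scores, threshold):
    """Share of actual positives whose score is at or above the threshold."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    if not positives:
        return float("nan")
    return sum(s >= threshold for s in positives) / len(positives)

def pick_group_thresholds(y_true, scores, groups, target_tpr):
    """For each group, choose the highest observed score that, used as a
    threshold, gives a true positive rate of at least target_tpr."""
    thresholds = {}
    for g in sorted(set(groups)):
        idx = [i for i, grp in enumerate(groups) if grp == g]
        g_true = [y_true[i] for i in idx]
        g_scores = [scores[i] for i in idx]
        chosen = min(g_scores)  # fallback: predict positive for everyone
        for cand in sorted(set(g_scores), reverse=True):
            if true_positive_rate(g_true, g_scores, cand) >= target_tpr:
                chosen = cand
                break
        thresholds[g] = chosen
    return thresholds

# Toy, made-up scores for two hypothetical groups.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
scores = [0.9, 0.7, 0.4, 0.2, 0.6, 0.45, 0.3, 0.1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

# With one shared cutoff of 0.5, group B's positives are missed more often.
tpr_b_shared = true_positive_rate(y_true[4:], scores[4:], 0.5)

thresholds = pick_group_thresholds(y_true, scores, groups, target_tpr=1.0)
print(tpr_b_shared, thresholds)
```

Here the shared cutoff gives the two groups different true positive rates, while the per-group thresholds equalize them. A real analysis would choose thresholds on held-out data and check other metrics, such as demographic parity, as well.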


Yu et al. (2020) pdf

  • Model predicting undergraduate short-term (course grades) and long-term (average GPA) success
  • Hispanic students were inaccurately predicted to perform worse on both short-term and long-term outcomes
  • The fairness of models improved when either click data or a combination of click and survey data, rather than institutional data, was included in the model


Yu et al. (2021) pdf

  • Models predicting college dropout for students in residential and fully online programs
  • Whether or not socio-demographic information was included, the model showed worse true negative rates for underrepresented minority students (URM; i.e., not White or Asian), and worse accuracy for URM students studying in person
  • The model showed better recall for URM students, whether they were in a residential or an online program


Bridgeman et al. (2009) page

  • Automated scoring models for evaluating English essays, or e-rater
  • E-rater gave significantly higher scores than human raters to 11th grade essays written by Hispanic and Asian-American students


Jiang & Pardos (2021) pdf

  • Predicting university course grades using LSTM
  • Roughly equal accuracy across racial groups
  • Slightly better accuracy (~1%) across racial groups when including race in model


Zhang et al. (in press) pdf

  • Detecting student use of self-regulated learning (SRL) in the mathematical problem-solving process
  • For each SRL-related detector, relatively small differences in AUC were observed across racial/ethnic groups.
  • No racial/ethnic group consistently had the best-performing detectors


Kung & Yu (2020) pdf

  • Predicting course grades and later GPA at a public U.S. university
  • Poorer independence, separation, and sufficiency for Latinx students than for White students across five different classic machine learning algorithms
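
Independence, separation, and sufficiency are commonly defined for binary predictions as follows: independence compares the rate of positive predictions across groups, separation compares true and false positive rates, and sufficiency compares how often a positive prediction is correct. The sketch below computes these per-group quantities on made-up data. It is an illustration of the standard definitions, not the evaluation code of Kung & Yu, and the function and group names are hypothetical.

```python
def rate(numer, denom):
    """Safe division that returns NaN for an empty denominator."""
    return numer / denom if denom else float("nan")

def fairness_summary(y_true, y_pred, groups):
    """Per-group quantities behind the three criteria (binary labels)."""
    summary = {}
    for g in sorted(set(groups)):
        rows = [(t, p) for t, p, grp in zip(y_true, y_pred, groups) if grp == g]
        tp = sum(1 for t, p in rows if t == 1 and p == 1)
        fp = sum(1 for t, p in rows if t == 0 and p == 1)
        fn = sum(1 for t, p in rows if t == 1 and p == 0)
        tn = sum(1 for t, p in rows if t == 0 and p == 0)
        summary[g] = {
            # Independence: P(pred = 1) should match across groups.
            "positive_rate": rate(tp + fp, len(rows)),
            # Separation: P(pred = 1 | true label) should match across groups.
            "tpr": rate(tp, tp + fn),
            "fpr": rate(fp, fp + tn),
            # Sufficiency: P(true = 1 | pred = 1) should match across groups.
            "ppv": rate(tp, tp + fp),
        }
    return summary

# Toy, made-up data for two hypothetical groups.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

summary = fairness_summary(y_true, y_pred, groups)
for g, metrics in summary.items():
    print(g, {name: round(value, 2) for name, value in metrics.items()})
```

Large gaps between groups on any of these quantities indicate that the corresponding criterion is violated. The three criteria generally cannot all be satisfied at once when base rates differ between groups.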


Jeong et al. (2022) pdf

  • Predicting 9th grade math score from academic performance, surveys, and demographic information
  • Despite comparable accuracy, the model tends to underpredict Hispanic students' performance
  • Several fairness correction methods equalize false positive and false negative rates across groups.