Difference between revisions of "Black/African-American Learners in North America"

Latest revision as of 21:01, 28 June 2023

Kai et al. (2017) pdf

Models predicting student retention in an online college program
J48 decision trees achieved much lower Kappa and AUC for Black students than White students
JRip decision rules achieved almost identical Kappa and AUC for Black students and White students
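
Comparisons like the one above depend on computing the same metrics separately for each demographic group. The following is a minimal sketch of per-group AUC and Cohen's Kappa, not the procedure used by Kai et al.; the function names, the 0.5 cutoff, and the input layout (parallel lists of group labels, 0/1 outcomes, and model scores) are assumptions made for illustration.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cohen_kappa(labels, preds):
    """Agreement between labels and predictions, corrected for chance agreement."""
    n = len(labels)
    observed = sum(y == p for y, p in zip(labels, preds)) / n
    classes = set(labels) | set(preds)
    expected = sum((labels.count(c) / n) * (preds.count(c) / n) for c in classes)
    if expected == 1.0:  # only one class present; Kappa is undefined, report 0
        return 0.0
    return (observed - expected) / (1.0 - expected)


def metrics_by_group(groups, labels, scores, threshold=0.5):
    """Split the data by group, then compute AUC and Kappa within each group."""
    buckets = defaultdict(lambda: ([], []))
    for g, y, s in zip(groups, labels, scores):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {
        g: {
            "auc": auc(ys, ss),
            "kappa": cohen_kappa(ys, [int(s >= threshold) for s in ss]),
        }
        for g, (ys, ss) in buckets.items()
    }
```

A large gap between groups in either metric is the kind of disparity reported for J48, while near-identical values correspond to the JRip result.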

Hu and Rangwala (2020) pdf

Models predicting if a college student will fail in a course
The multiple cooperative classifier model (MCCM) was the best at reducing bias, or discrimination against African-American students, while other models (particularly Logistic Regression and Rawlsian Fairness) performed far worse
The level of bias was inconsistent across courses, with MCCM prediction showing the least bias for Psychology and the greatest bias for Computer Science

Christie et al. (2019) pdf

Models predicting high school dropout
The decision trees showed little difference in AUC among Black, White, Hispanic, Asian, American Indian and Alaska Native, and Native Hawaiian and Pacific Islander.

Lee and Kizilcec (2020) pdf

Models predicting college success (defined as earning the median grade or above)
Random forest algorithms performed significantly worse for underrepresented minority students (URM; Black, American Indian, Hawaiian or Pacific Islander, Hispanic, and Multicultural) than non-URM students (White and Asian)
The fairness of the model, namely demographic parity and equality of opportunity, as well as its accuracy, improved after correcting the threshold values from 0.5 to group-specific values
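
Replacing a single 0.5 cutoff with group-specific cutoffs can be sketched as follows. This is an illustration of the general idea under an equality-of-opportunity target (equal true positive rates), not the procedure from Lee and Kizilcec; `pick_threshold`, `group_thresholds`, and the `target_tpr` parameter are hypothetical names.

```python
def true_positive_rate(labels, scores, threshold):
    """Share of actual positives whose score is at or above the threshold."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    if not positives:
        raise ValueError("no positive examples in this group")
    return sum(s >= threshold for s in positives) / len(positives)


def pick_threshold(labels, scores, target_tpr):
    """Highest observed score that still keeps the group's TPR at or above target_tpr."""
    for t in sorted(set(scores), reverse=True):
        if true_positive_rate(labels, scores, t) >= target_tpr:
            return t
    raise ValueError("target_tpr must be at most 1")


def group_thresholds(groups, labels, scores, target_tpr):
    """One threshold per group, each chosen to reach the same target TPR."""
    thresholds = {}
    for g in set(groups):
        ys = [y for gg, y in zip(groups, labels) if gg == g]
        ss = [s for gg, s in zip(groups, scores) if gg == g]
        thresholds[g] = pick_threshold(ys, ss, target_tpr)
    return thresholds
```

If a model systematically assigns lower scores to one group, a shared 0.5 cutoff gives that group a lower true positive rate; choosing a lower cutoff for that group equalizes the rate across groups.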

Yu et al. (2020) pdf

Model predicting undergraduate short-term (course grades) and long-term (average GPA) success
Black students were inaccurately predicted to perform worse for both short-term and long-term success
The fairness of models improved when either click or a combination of click and survey data, and not institutional data, was included in the model

Yu et al. (2021) pdf

Models predicting college dropout for students in residential and fully online programs
Whether or not socio-demographic information was included, the model showed worse true negative rates for underrepresented minority students (URM; defined as not White or Asian), and worse accuracy for URM students studying in person
The model showed better recall for URM students, whether they were in residential or online programs
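
A model can have better recall and a worse true negative rate for the same group at once, because the two rates are computed over different students (actual positives versus actual negatives). The sketch below shows how the rates are derived from a confusion matrix; the function names and the 0/1 encoding (1 = dropout) are assumptions for illustration.

```python
def confusion_rates(labels, preds):
    """Recall (TPR), true negative rate, and accuracy from 0/1 labels and predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    return {
        # Guard against groups with no positives or no negatives.
        "recall": tp / (tp + fn) if (tp + fn) else float("nan"),
        "tnr": tn / (tn + fp) if (tn + fp) else float("nan"),
        "accuracy": (tp + tn) / len(labels),
    }


def rates_by_group(groups, labels, preds):
    """Compute the same rates separately within each group."""
    out = {}
    for g in set(groups):
        ys = [y for gg, y in zip(groups, labels) if gg == g]
        ps = [p for gg, p in zip(groups, preds) if gg == g]
        out[g] = confusion_rates(ys, ps)
    return out
```

A model that over-predicts dropout for one group catches more of that group's actual dropouts (higher recall) while wrongly flagging more of its persisting students (lower true negative rate).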

Ramineni & Williamson (2018) pdf

Revised automated scoring engine for assessing GRE essays
E-rater gave African American test-takers significantly lower scores than human raters when assessing their written responses to argument prompts
The shorter essays written by African American test-takers were more likely to receive lower scores as showing weakness in content and organization

Bridgeman et al. (2009) pdf

Automated scoring models for evaluating English essays, or e-rater
The score difference between human rater and e-rater was significantly smaller for 11th grade essays written by African American and White students

Bridgeman et al. (2012) pdf

A later version of automated scoring models for evaluating English essays, or e-rater
E-rater gave significantly lower scores than human raters when assessing African-American students’ written responses to issue prompts in the GRE

Jiang & Pardos (2021) pdf

Predicting university course grades using LSTM
Roughly equal accuracy across racial groups
Slightly better accuracy (~1%) across racial groups when including race in model

Zhang et al. (2022) pdf

Detecting student use of self-regulated learning (SRL) in mathematical problem-solving process
For each SRL-related detector, relatively small differences in AUC were observed across racial/ethnic groups.
No racial/ethnic group consistently had best-performing detectors

Li, Xing, & Leite (2022) pdf

Models predicting whether two students will communicate on an online discussion forum
Compared members of overrepresented racial groups to members of underrepresented racial groups (over 2/3 Black/African American)

Multiple fairness approaches lead to ABROCA of under 0.01 for overrepresented versus underrepresented students
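
ABROCA (Absolute Between-ROC Area) is the area between two groups' ROC curves, so a value under 0.01 means the curves nearly coincide. The sketch below approximates it on an evenly spaced grid of false positive rates; it is an illustration, not the implementation used by Li, Xing, and Leite, and the function names and the `steps` parameter are assumptions.

```python
def roc_points(labels, scores):
    """ROC curve as (fpr, tpr) points, lowering the threshold one distinct score at a time."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC needs both positive and negative examples")
    ranked = sorted(zip(scores, labels), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Examples with tied scores cross the threshold together.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def tpr_at(points, fpr):
    """Linearly interpolated TPR at a given FPR; on vertical segments take the top value."""
    best = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= fpr <= x1:
            y = y1 if x1 == x0 else y0 + (y1 - y0) * (fpr - x0) / (x1 - x0)
            best = max(best, y)
    return best


def abroca(labels_a, scores_a, labels_b, scores_b, steps=1000):
    """Trapezoidal approximation of the area between the two groups' ROC curves."""
    roc_a = roc_points(labels_a, scores_a)
    roc_b = roc_points(labels_b, scores_b)
    gaps = [abs(tpr_at(roc_a, k / steps) - tpr_at(roc_b, k / steps))
            for k in range(steps + 1)]
    return sum((gaps[k] + gaps[k + 1]) / 2 for k in range(steps)) / steps
```

Unlike a simple difference in AUC, ABROCA does not let regions where one group's curve is higher cancel out regions where the other's is higher.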

Litman et al. (2021) html

Automated essay scoring models inferring text evidence usage
All algorithms studied have less than 1% of error explained by whether student is Black

Jeong et al. (2022) [1]

Predicting 9th grade math score from academic performance, surveys, and demographic information
Despite comparable accuracy, model tends to underpredict Black students' performance
Several fairness correction methods equalize false positive and false negative rates across groups.

Zhang et al. (2023) pdf

Models developed to detect attributes of student feedback for other students’ mathematics solutions, reflecting the presence of three constructs: 1) commenting on process, 2) commenting on the answer, and 3) relating to self.
Models have approximately equal performance for African American, Hispanic/Latinx, and White students.

@@ Line 11: / Line 11: @@
-Lee and Kizilcec (2020) [[https://arxiv.org/pdf/2007.00088.pdf pdf]]
+Christie et al. (2019) [https://files.eric.ed.gov/fulltext/ED599217.pdf pdf]
+* Models predicting student's high school dropout
+* The decision trees showed little difference in AUC among Black, White, Hispanic, Asian, American Indian and Alaska Native, and  Native Hawaiian and Pacific Islander.
+Lee and Kizilcec (2020) [https://arxiv.org/pdf/2007.00088.pdf pdf]
 * Models predicting college success (or median grade or above)
-* Random forest algorithms performed significantly worse for underrepresented minority students (URM; American Indian, Black, Hawaiian or Pacific Islander, Hispanic, and Multicultural) than non-URM students (White and Asian)
+* Random forest algorithms performed significantly worse for underrepresented minority students (URM; Black, American Indian, Hawaiian or Pacific Islander, Hispanic, and Multicultural) than non-URM students (White and Asian)
-* The fairness of the model, namely demographic parity and equality of opportunity, as well as its accuracy, improved after correcting the threshold values
+* The fairness of the model, namely demographic parity and equality of opportunity, as well as its accuracy, improved after correcting the threshold values from 0.5 to group-specific values
+Yu et al. (2020) [https://files.eric.ed.gov/fulltext/ED608066.pdf pdf]
+* Model predicting undergraduate short-term (course grades) and long-term (average GPA) success
+* Black students were inaccurately predicted to perform worse for both short-term and long-term
+* The fairness of models improved when either click or a combination of click and survey data, and not institutional data, was included in the model
+Yu et al. (2021) [https://dl.acm.org/doi/pdf/10.1145/3430895.3460139 pdf]
+* Models predicting college dropout for students in residential and fully online program
+* Whether the socio-demographic information was included or not, the model showed worse true negative rates for students who are underrepresented minority (URM; or not White or Asian), and worse accuracy if URM students are studying in person
+* The model showed better recall for URM students, whether they were in residential or online program
+Ramineni & Williamson (2018) [https://files.eric.ed.gov/fulltext/EJ1202928.pdf pdf]
+* Revised automated scoring engine for assessing GRE essay
+* E-rater gave African American test-takers significantly lower scores than human raters when assessing their written responses to argument prompts
+* The shorter essays written by African American test-takers were more likely to receive lower scores as showing weakness in content and organization
+Bridgeman et al. (2009) [https://www.researchgate.net/publication/242203403_Considering_Fairness_and_Validity_in_Evaluating_Automated_Scoring pdf]
+* Automated scoring models for evaluating English essays, or e-rater
+* The score difference between human rater and e-rater was significantly smaller for 11th grade essays written by African American and White students
+Bridgeman et al. (2012) [https://www.tandfonline.com/doi/pdf/10.1080/08957347.2012.635502 pdf]
+* A later version of automated scoring models for evaluating English essays, or e-rater
+* E-rater gave significantly lower score than human rater when assessing African-American students’ written responses to issue prompt in GRE
+Jiang & Pardos (2021) [https://dl.acm.org/doi/pdf/10.1145/3461702.3462623 pdf]
+* Predicting university course grades using LSTM
+* Roughly equal accuracy across racial groups
+* Slightly better accuracy (~1%) across racial groups when including race in model
+Zhang et al. (2022) [https://www.upenn.edu/learninganalytics/ryanbaker/EDM22_paper_35.pdf pdf]
+* Detecting student use of self-regulated learning (SRL) in mathematical problem-solving process
+* For each SRL-related detector, relatively small differences in AUC were observed across racial/ethnic groups.
+* No racial/ethnic group consistently had best-performing detectors
+Li, Xing, & Leite (2022) [https://dl.acm.org/doi/pdf/10.1145/3506860.3506869?casa_token=OZmlaKB9XacAAAAA:2Bm5XYi8wh4riSmEigbHW_1bWJg0zeYqcGHkvfXyrrx_h1YUdnsLE2qOoj4aQRRBrE4VZjPrGw pdf]
+* Models predicting whether two students will communicate on an online discussion forum
+* Compared members of overrepresented racial groups to members of underrepresented racial groups (over 2/3
+Black/African American)
+* Multiple fairness approaches lead to ABROCA of under 0.01 for overrepresented versus underrepresented students
+Litman et al. (2021) [https://link.springer.com/chapter/10.1007/978-3-030-78292-4_21 html]
+* Automated essay scoring models inferring text evidence usage
+* All algorithms studied have less than 1% of error explained by whether student is Black
+Jeong et al. (2022) [https://fated2022.github.io/assets/pdf/FATED-2022_paper_Jeong_Racial_Bias_ML_Algs.pdf]
+* Predicting 9th grade math score from academic performance, surveys, and demographic information
+* Despite comparable accuracy, model tends to underpredict Black students' performance
+* Several fairness correction methods equalize false positive and false negative rates across groups.
-Ramineni & Williamson (2018) [[https://onlinelibrary.wiley.com/doi/10.1002/ets2.12192 pdf]]
+Zhang et al.(2023) [https://learninganalytics.upenn.edu/ryanbaker/ISLS23_annotation%20detector_short_submit.pdf pdf]
-* Revised automated scoring engine for assessing GSE essay
+* Models developed to detect attributes of student feedback for other students’ mathematics solutions, reflecting the presence of three constructs:1) commenting on process, 2) commenting on the answer, and 3) relating to self.
-* Relative weakness in content and organization by African American test takers resulted in lower scores than Chinese peers who wrote longer.
+* Models have approximately equal performance for African American, Hispanic/Latinx, and White students.
