Predicting student performance on statewide exams using machine learning

Last updated on Aug 5, 2022

For my final project in Applied Machine Learning for Educational Data Science, I worked with a team to develop a model that predicts student performance on statewide reading and math exams. We pulled data from outside sources (e.g., the National Center for Education Statistics) and compiled it to train and tune our predictive model. Information regarding the Kaggle competition can be found here. Of the supervised learning techniques we learned using tidymodels, our team decided to try 1) linear models, 2) random forests, and 3) boosted trees. Our team, the Tensor Flo Ridas, took second place! You can find our code and a walkthrough of our models here.

Data science

Lea E. Frank