Predictive Analysis

Predictive Analytics of Institutional Research
Machine Learning

Machine learning comprises a set of cutting-edge computational tools for data-driven decision making. Although it has been widely adopted in industry, it has yet to become a staple of institutional research. To demonstrate that machine learning can provide insight into important issues in higher education, the model described here predicts retention, graduation, and the number of years needed to graduate. It predicts whether a student will drop out or graduate, and how many years graduation will take, with around 85% accuracy. The predictive strength of the model, especially with regard to retention and graduation, makes it a powerful tool for both assessing and developing interventions to improve student success.

Neural Network and Logistic Regression Models

Using neural network and logistic regression models, we iteratively combine features (age, gender, ethnicity, military status, and admission with transferred units) to see which factor or combination of factors yields the best prediction rate.
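The iterative combination of features can be sketched as an exhaustive search over feature subsets. This is a minimal illustration, not the code used for the results: the column names in `FEATURES` and the `score_fn` callback (which would stand in for a cross-validated accuracy measurement) are hypothetical.

```python
from itertools import combinations

# Feature names follow the text; the exact column names are assumptions.
FEATURES = ["age", "gender", "ethnicity", "military_status", "transfer_units"]


def all_feature_subsets(features):
    """Yield every non-empty combination of features, smallest subsets first."""
    for size in range(1, len(features) + 1):
        yield from combinations(features, size)


def best_subset(features, score_fn):
    """Return the (subset, score) pair with the highest score.

    score_fn is a hypothetical callback that trains and validates a model
    on the given subset and returns its prediction rate.
    Ties are resolved in favor of the subset seen first (the smaller one).
    """
    best, best_score = None, float("-inf")
    for subset in all_feature_subsets(features):
        score = score_fn(subset)
        if score > best_score:
            best, best_score = subset, score
    return best, best_score


# Five features give 2**5 - 1 = 31 non-empty subsets to evaluate.
print(len(list(all_feature_subsets(FEATURES))))
```

With only five features, evaluating all 31 subsets is cheap; for a much larger feature set, a greedy forward selection would likely be needed instead.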

Results

  • When a single feature (factor) was used as the input to the model, gender and age were the most effective features for prediction.
  • Using all features (age, gender, ethnicity, military status, and admission with transferred units) as inputs does not yield a better prediction result. Some of the features (factors) probably distract the model during training.
  • For the Logistic Regression model, the following settings were used to train and validate the model:
    • 5-fold cross-validation
    • Shuffled K-fold splits
    • Limited-memory BFGS (LBFGS) solver
    • Maximum iterations = 1000
  • For the Neural Network model, the following settings were used to train and validate the model:
    • Activation function: ReLU
    • 5-fold cross-validation
    • 4-layer setup:
      • Input layer: n neurons, where n equals the total number of factors used
      • 2 hidden layers with 10 neurons each
      • Output layer with 1 neuron
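
The two configurations listed above could be expressed roughly as follows. This sketch assumes scikit-learn, which the page does not name, and uses synthetic stand-in data rather than student records. The random seeds, the `max_iter` value for the neural network, and reusing the shuffled splitter for the neural network are also assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data: 200 rows, 2 numeric features, binary outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

# 5-fold cross-validation with shuffled folds.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Logistic regression with the LBFGS solver and at most 1000 iterations.
log_reg = LogisticRegression(solver="lbfgs", max_iter=1000)

# Neural network: the input layer size is inferred from the columns of X,
# followed by two hidden layers of 10 ReLU neurons; for a binary target
# scikit-learn uses a single output unit. max_iter here is an assumption.
mlp = MLPClassifier(
    hidden_layer_sizes=(10, 10),
    activation="relu",
    max_iter=2000,
    random_state=0,
)

log_reg_scores = cross_val_score(log_reg, X, y, cv=cv, scoring="accuracy")
mlp_scores = cross_val_score(mlp, X, y, cv=cv, scoring="accuracy")

print("Logistic regression mean accuracy:", log_reg_scores.mean())
print("Neural network mean accuracy:", mlp_scores.mean())
```

The accuracies printed here reflect only the synthetic data and say nothing about the results reported on this page.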