14.8 Model Scoring
Because leaving is a rare event, and in order to compare the outcomes from the frequentist model with those of the Bayesian models, I use a rank-based scoring method.
- Each model assigns each student in the validation data set a predicted probability of leaving Wake Forest.
- The data are then sorted from highest to lowest predicted probability.
- That probability is then converted to a rank from 1 to n (the number of students in the validation data set): the student with the highest predicted probability of leaving under model A gets a rank of "1", while the student with the lowest predicted probability gets a rank of "n".
- This is repeated for each model.
- Model performance is then determined by counting the number of students who actually went on to leave within each decile of the ranking (e.g., if 7 of the top 100 ranked students under model A actually left, model A scores a 7 for that decile).
- The more actual leavers a model captures among its top-ranked students, the better that model performs.
This approach also shows how each model performs in the lower deciles, which can reveal opportunities for improvement.
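As a minimal sketch of this scoring procedure, the Python function below ranks students by predicted probability and counts actual leavers per decile. The function name `decile_capture` and the pandas-based implementation are my own illustration, not the original code, and assume the validation outcomes and predicted probabilities are available as arrays.

```python
import numpy as np
import pandas as pd

def decile_capture(y_true, y_prob, n_bins=10):
    """Rank students by predicted probability of leaving and count
    how many actual leavers fall in each decile of the ranking."""
    df = pd.DataFrame({"left": y_true, "prob": y_prob})
    # Sort from highest to lowest predicted probability;
    # rank 1 = highest predicted probability of leaving
    df = df.sort_values("prob", ascending=False).reset_index(drop=True)
    df["rank"] = np.arange(1, len(df) + 1)
    # Assign each student to a decile of the ranked list
    df["decile"] = pd.qcut(df["rank"], q=n_bins, labels=range(1, n_bins + 1))
    # Score = number of actual leavers captured in each decile
    return df.groupby("decile", observed=True)["left"].sum()

# Hypothetical usage: score two models on the same validation set.
# scores_a = decile_capture(valid["left"], model_a_probs)
# scores_b = decile_capture(valid["left"], model_b_probs)
# The model capturing more leavers in decile 1 performs better.
```

Because the comparison is based on ranks rather than raw probabilities, this sketch works even when the rare-event problem causes the models to produce poorly calibrated (but still well-ordered) probabilities.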