14.8 Model Scoring

Because leaving is a rare event, and in order to compare the outcomes of the frequentist model with those of the Bayesian models, I use a rank-based scoring method:

  1. Each model assigns each student in the validation data set a probability of leaving Wake Forest.
  2. The data are then sorted from highest to lowest predicted probability.
  3. Each probability is then converted to a rank from 1 to n (the number of students in the validation data set), e.g. the student with the highest predicted probability of leaving under model A receives rank 1, while the student with the lowest predicted probability receives rank n.
  4. This is repeated for each model.
  5. Model performance is then determined by counting the number of students who actually went on to leave within each rank decile (e.g. if 7 of the 100 highest-ranked students under model A went on to leave, model A scores 7 for that decile).
  6. The more actual leavers a model captures among its top 100 ranked students, the better that model performs; a code sketch of this scoring follows the list.
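
To make the procedure concrete, here is a minimal sketch of steps 1 through 6, assuming the validation data sit in a pandas DataFrame with a 0/1 outcome column (hypothetically named `left`) and one column of predicted probabilities per model (hypothetical names such as `p_frequentist` and `p_bayes`); none of these names come from the original analysis:

```python
import numpy as np
import pandas as pd

def rank_capture_scores(validation: pd.DataFrame,
                        prob_cols: list,
                        left_col: str = "left",
                        top_k: int = 100) -> pd.DataFrame:
    """Rank students from highest to lowest predicted probability of
    leaving (rank 1 = highest risk) and count, for each model, how many
    actual leavers fall in each rank decile and in the top `top_k`."""
    n = len(validation)
    edges = np.linspace(0, n, 11).astype(int)  # boundaries of 10 equal rank bins
    rows = []
    for col in prob_cols:
        # Steps 2-3: sort by predicted probability; row position = rank - 1.
        ranked = validation.sort_values(col, ascending=False).reset_index(drop=True)
        # Step 5: count actual leavers captured in each decile of ranks.
        row = {f"decile_{d + 1}": int(ranked.loc[edges[d]:edges[d + 1] - 1, left_col].sum())
               for d in range(10)}
        # Step 6: leavers captured among the top_k highest-ranked students.
        row["top_100"] = int(ranked.loc[:top_k - 1, left_col].sum())
        row["model"] = col
        rows.append(row)
    return pd.DataFrame(rows).set_index("model")
```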

This approach also shows how each model performs in the lower deciles, revealing whether there is an opportunity to improve beyond the highest-risk students.
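
Continuing the hypothetical sketch above, the per-decile columns make that lower-decile comparison direct:

```python
scores = rank_capture_scores(validation, ["p_frequentist", "p_bayes"])
print(scores[[f"decile_{d}" for d in range(1, 11)]])  # capture counts, top decile to bottom
```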