4.16 Combining the Data and Generating the LSDS

4.16.1 Combining

After each of the individual programs has been run, the resulting data sets are combined into a single data set. This is accomplished by the Building FR cohort file_4_Xcombining_dev.sas program, which starts with the cohort information and “left joins” each of the other data sets onto it, so every student in the cohort is kept whether or not a match exists in a given data set.
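The left-join pattern and the record-count check it calls for can be sketched as follows. This is a minimal illustration in Python rather than the SAS program itself; the data set contents, the column names (student_id, cohort, im_games), and the left_join helper are all hypothetical.

```python
def left_join(left_rows, right_rows, key):
    """Left join two lists of dicts on `key`, keeping every left row."""
    right_index = {}
    for row in right_rows:
        right_index.setdefault(row[key], []).append(row)
    right_columns = sorted({col for row in right_rows for col in row if col != key})

    joined = []
    for left in left_rows:
        # An unmatched left row still appears once, with missing (None) values.
        matches = right_index.get(left[key]) or [{}]
        for match in matches:
            merged = dict(left)
            for col in right_columns:
                merged[col] = match.get(col)
            joined.append(merged)
    return joined


# Hypothetical stand-ins for the cohort data set and one of the other inputs.
cohort = [
    {"student_id": 1, "cohort": "FR2015"},
    {"student_id": 2, "cohort": "FR2015"},
    {"student_id": 3, "cohort": "FR2016"},
]
intramurals = [
    {"student_id": 1, "im_games": 4},
    {"student_id": 3, "im_games": 0},
]

combined = left_join(cohort, intramurals, "student_id")

# With one row per student on the right, the row count must not change.
row_count_ok = len(combined) == len(cohort)
print(len(cohort), len(combined), row_count_ok)
```

A duplicated student on the right-hand side would add rows to the result, which is why comparing the record count before and after each join is a useful check.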

Additionally, this step creates a “wide” data set. This is done through several SAS macros:

  • chelsea, which checks whether a numeric value is missing and, if so, replaces the missing indicator with a 0
  • wfu, which creates 21 placeholders for character data
  • penn, which creates 21 placeholders for numeric data

Several array operations are then performed to make a “wide” data set with one row per student, in which each variable is represented by one column per term (e.g. IM_GAMES_18 represents the number of intramural games a student played in their 18th term at Wake Forest). Currently, the longest-enrolled student in the LSDS has been at Wake Forest for 21 terms (counting Fall, Spring, and Summer terms).
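The reshaping from one row per student per term to one row per student can be sketched as follows. This is an illustration in Python, not the SAS array code; the input layout, the term_number column, and the to_wide and zero_fill helpers are assumptions. Whether the real macros zero-fill every placeholder column in this way is also an assumption.

```python
MAX_TERMS = 21  # assumed to match the number of placeholders per variable


def to_wide(rows, max_terms=MAX_TERMS):
    """Return one dict per student with IM_GAMES_1 .. IM_GAMES_<max_terms>."""
    wide = {}
    for row in rows:
        sid = row["student_id"]
        if sid not in wide:
            # Create the placeholder columns first, all missing (None).
            wide[sid] = {"student_id": sid}
            for term in range(1, max_terms + 1):
                wide[sid][f"IM_GAMES_{term}"] = None
        wide[sid][f"IM_GAMES_{row['term_number']}"] = row["im_games"]
    return [wide[sid] for sid in sorted(wide)]


def zero_fill(wide_rows):
    """Replace missing numeric values with 0 (assumed scope: every column)."""
    return [{k: (0 if v is None else v) for k, v in row.items()} for row in wide_rows]


# Hypothetical long-format input: one row per student per term.
long_rows = [
    {"student_id": 1, "term_number": 1, "im_games": 3},
    {"student_id": 1, "term_number": 2, "im_games": None},
    {"student_id": 2, "term_number": 1, "im_games": 5},
]

wide_rows = zero_fill(to_wide(long_rows))
print(len(wide_rows), wide_rows[0]["IM_GAMES_1"], wide_rows[0]["IM_GAMES_2"])
```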

This step also creates a “long” data set, which will be used later.

The combining step is an extremely important operation! As such, it is important to verify that the number of records makes sense. Many array operations also take place in this step. If a student has been at Wake Forest for more terms than the maximum you have specified (currently 21), the program will fail silently. Because of this, it is critically important to check the output of this operation.
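Two checks of this kind can be sketched as follows: one looks for students whose term count exceeds the array limit, the other for students who appear more than once in the wide data set. This is an illustration in Python under assumed column names (student_id, term_number); the helper names and the sample data are hypothetical, and the real check would be run against the SAS output.

```python
ARRAY_LIMIT = 21  # assumed to mirror the maximum set in the SAS programs


def students_over_limit(long_rows, limit=ARRAY_LIMIT):
    """Return {student_id: highest term number} for students beyond the limit."""
    highest = {}
    for row in long_rows:
        sid = row["student_id"]
        highest[sid] = max(highest.get(sid, 0), row["term_number"])
    return {sid: n for sid, n in highest.items() if n > limit}


def duplicated_students(wide_rows):
    """Return student ids that appear on more than one row of the wide data."""
    seen, dupes = set(), set()
    for row in wide_rows:
        sid = row["student_id"]
        (dupes if sid in seen else seen).add(sid)
    return sorted(dupes)


# Hypothetical data: student 7 has a 22nd term, one more than the limit.
long_rows = [{"student_id": 7, "term_number": t} for t in range(1, 23)]
long_rows += [{"student_id": 8, "term_number": t} for t in range(1, 9)]
wide_rows = [{"student_id": 7}, {"student_id": 8}]

over_limit = students_over_limit(long_rows)
duplicates = duplicated_students(wide_rows)
print("over limit:", over_limit)
print("duplicates:", duplicates)
```

Any student reported as over the limit means the maximum needs to be raised before the combined output can be trusted.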

4.16.2 Calculating Retention

The next step in updating the LSDS is running the Building FR cohort file_4_XRetention.sas program. This step identifies the students who were retained and flags the students who left, both those who left in their first year at Wake Forest and those who left at any other time without taking a degree. It is important to update the “term_of_interest” variable in the program to the census date of interest.

The current retention calculation considers a student as having left Wake Forest if:

  • the student is not present on the census date of the term of interest AND
  • the student is not on continuous enrollment
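The two-part rule above can be sketched as follows. This is an illustration in Python, not the retention program's actual logic; the field names (present_on_census, continuous_enrollment) and the sample records are hypothetical, and only the two listed conditions are modeled.

```python
def has_left(present_on_census, on_continuous_enrollment):
    """Apply the rule: absent on the census date AND not on continuous enrollment."""
    return (not present_on_census) and (not on_continuous_enrollment)


# Hypothetical records for a single term of interest.
students = [
    {"student_id": 1, "present_on_census": True, "continuous_enrollment": False},
    {"student_id": 2, "present_on_census": False, "continuous_enrollment": True},
    {"student_id": 3, "present_on_census": False, "continuous_enrollment": False},
]

left_flags = {
    s["student_id"]: has_left(s["present_on_census"], s["continuous_enrollment"])
    for s in students
}
print(left_flags)
```

Because both conditions must hold, a student who is absent on the census date but on continuous enrollment is not counted as having left.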

This program also sets several flags for items of interest (e.g. the “ever_grk” flag, which indicates whether a student was ever an active member of a Greek community). It also calculates the four-, six-, and eight-year graduation rate flags.

This program requires that the user update the term_of_interest variable AND verify that all array limits are identical to those set in the combining program. Otherwise, the program will fail silently.