3.5 External Data Sources
Many publicly available data sets are valuable for benchmarking and enriching student data. These data sets continue to grow and can be combined with the LSDS or other data for analysis. Many of them require an API key. Be a good steward of the internet and follow good scraping practices (see here).
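As a rough illustration of the two habits mentioned above (keeping API keys out of scripts and spacing out requests), here is a minimal base R sketch. The helper names `get_api_key` and `fetch_politely`, the `delay_seconds` default, and the idea of passing in your own `fetch_one` function are all assumptions made for this example; they are not part of any particular package.

```r
# Read an API key from an environment variable (e.g. set in ~/.Renviron)
# so the key never appears in a shared script.
# `get_api_key` is a hypothetical helper name.
get_api_key <- function(var_name) {
  key <- Sys.getenv(var_name)
  if (!nzchar(key)) {
    stop("Environment variable '", var_name, "' is not set.")
  }
  key
}

# Call `fetch_one` on each URL, pausing between requests so the
# remote server is not hit in rapid succession.
# `fetch_politely` and the 2-second default are illustrative choices.
fetch_politely <- function(urls, fetch_one, delay_seconds = 2) {
  lapply(seq_along(urls), function(i) {
    if (i > 1) Sys.sleep(delay_seconds)  # no pause before the first request
    fetch_one(urls[[i]])
  })
}
```

In practice `fetch_one` would be whatever function you already use to download a single page or file. A site's terms of use or robots.txt may ask for a longer delay than the one assumed here.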
3.5.1 IPEDS Data Center
The IPEDS Data Center has a rich assortment of data. You can either build reports in the Data Center or download the Access databases and explore them yourself.
3.5.2 NCES Elementary and Secondary School Information
Each year, public schools must complete a government-supplied survey about the school. These data include many useful features, such as the number of students by race and gender and year-over-year enrollment. The survey is also sent to private schools, and their data, while less complete, are available at the link below.
3.5.3 National Student Clearinghouse
The National Student Clearinghouse contains enrollment data for 98% of students in higher education. Ask the National Student Clearinghouse keyholder to add you to the list of users. You can use this tool to look up an individual student and their subsequent enrollments.
I have created a template, available at \\admin2\instres\IR Shared Folder for Retention Analysis\adhoc analyses\national_clearing_house\create_list_for_clearinghouse.sas, that creates a properly formatted document for submission.
3.5.4 US Census
The US Census is a great data source for understanding the neighborhood demographics of each student. I often use the tidycensus package in R to access and save these data. As a reminder, accessing these data through the tidycensus package requires a valid API key from the US Census.
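A minimal sketch of what a tidycensus pull might look like is below. It assumes the API key is stored in the `CENSUS_API_KEY` environment variable. The state (`"UT"`), year (`2021`), chosen variables, and output file name are placeholders for illustration, not the values used in any existing analysis.

```r
# Assumption: the Census API key lives in the CENSUS_API_KEY environment
# variable (for example, set in ~/.Renviron) instead of in the script.
api_key <- Sys.getenv("CENSUS_API_KEY")

# Illustrative ACS variables; swap in whatever tables the analysis needs.
acs_vars <- c(
  median_income = "B19013_001",  # median household income
  total_pop     = "B01003_001"   # total population
)

if (requireNamespace("tidycensus", quietly = TRUE) && nzchar(api_key)) {
  # Tract-level 5-year ACS estimates for one state.
  tract_demo <- tidycensus::get_acs(
    geography = "tract",
    variables = acs_vars,
    state     = "UT",    # placeholder state
    year      = 2021,    # placeholder end year of the 5-year ACS
    survey    = "acs5",
    key       = api_key
  )

  # Save a local copy so the API is not queried on every run.
  saveRDS(tract_demo, "acs_tract_demo.rds")  # placeholder file name
} else {
  message("Install tidycensus and set CENSUS_API_KEY to run this example.")
}
```

The result is one row per tract and variable, with a `GEOID` column that can be used to join to student records once each student address has been assigned to a census tract.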