Getting Started with the Usual Suspects

The Usual Suspects is a relatively standard reporting framework for analyses from the Wake Forest University Office of Institutional Research (OIR). Most programs want to know who takes, participates, or enrolls versus who does not, cross-tabulated across several key demographic characteristics. This package seeks to make that analysis as simple as possible by using a common file structure.

Several key packages are needed for the project to work:

project_required_libraries %>% 
  knitr::kable(caption = "Required Libraries")
Required Libraries

| library | location |
|---|---|
| tidyverse | CRAN |
| tidybayes | CRAN |
| cowplot | CRAN |
| fs | CRAN |
| latex2exp | CRAN |
| bookdown | CRAN |
| rmarkdown | CRAN |
| brms | CRAN |
| broom | CRAN |
| kableExtra | CRAN |
| irtools | WFU-OIR |
| wfudata | WFU-OIR |
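
Before starting, it can help to check which of these libraries are already installed. The following is a minimal sketch using only base R; the package names come from the table above, while the variable names are illustrative. It does not attempt to install anything, since the install location for the WFU-OIR packages is not described here.

```r
# Packages listed in the "Required Libraries" table.
cran_packages <- c(
  "tidyverse", "tidybayes", "cowplot", "fs", "latex2exp",
  "bookdown", "rmarkdown", "brms", "broom", "kableExtra"
)
oir_packages <- c("irtools", "wfudata")
required <- c(cran_packages, oir_packages)

# Compare against the packages found in the current library paths.
installed <- required %in% rownames(utils::installed.packages())
names(installed) <- required

missing_packages <- required[!installed]

if (length(missing_packages) > 0) {
  message("Not yet installed: ", paste(missing_packages, collapse = ", "))
} else {
  message("All required libraries are installed.")
}
```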

Directory Structure

There is a common file structure for each project that takes the following form:

File Structure

| item_name | item_type | item_purpose |
|---|---|---|
| data | directory | Store derived data |
| data-raw | directory | Store raw data |
| munge | directory | Cleaning scripts which create the cleaned data objects stored in the data folder |
| outputs | directory | This is where output figures and models should be placed |
| report | directory | This is where the reports should be stored |
| src | directory | This is where analysis scripts should be stored |
| libs | directory | This is to store any helper functions that you create during the analysis |
| makefile | makefile | A GNU makefile which builds the analysis |
| README.Md | markdown file | A README file that describes the project origin and important features |

Some of these directories contain sub-folders, structured as follows:

File Structure

| item_name | item_type | parent_folder | item_purpose |
|---|---|---|---|
| models | directory | outputs | Store derived models from the analysis |
| to-be-reviewed | directory | report | Store files for editing |
| released | directory | report | Store the released copies of the reports |
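
To make the layout concrete, the sketch below recreates the directories from the tables above inside a temporary folder using base R. It is purely illustrative: the package template builds this structure for you, and the `us-example` project root is a hypothetical name.

```r
# Hypothetical project root inside the session's temporary directory.
project_root <- file.path(tempdir(), "us-example")

# Top-level directories from the first "File Structure" table.
top_level <- c("data", "data-raw", "munge", "outputs", "report", "src", "libs")

# Sub-folders and their parents from the second table.
sub_folders <- c(
  file.path("outputs", "models"),
  file.path("report", "to-be-reviewed"),
  file.path("report", "released")
)

all_dirs <- file.path(project_root, c(top_level, sub_folders))

# recursive = TRUE creates any missing parent directories as well.
for (d in all_dirs) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

# Show the resulting tree relative to the project root.
print(list.dirs(project_root, full.names = FALSE, recursive = TRUE))
```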

Additionally, several standard scripts serve common purposes:

File Structure

| item_name | parent_folder | item_purpose |
|---|---|---|
| 01-import.R | munge | Imports data from the stakeholder and combines it with the LSDS |
| 02-match.R | munge | Runs a matching algorithm for causal inference if desired |
| 01-run-analysis.R | src | Runs several regression analyses and saves the coefficients |
| 02-clean-analysis.R | src | Cleans up the regression analyses for graphing and the final report |

Making it Happen

Create A Project in a New Directory

First, create a new R project using the usual methods. For the project name I typically write us-NAME-Of-Analysis. For example, if I were working on a project for the Art department, I would name it us-arts. If I were working on something for the Call to Conversation, I would call it us-call-to-conversation. This way you can quickly see which projects use the usualsuspects template.
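
The naming convention can be sketched as a small helper. Note that `us_project_name()` is a hypothetical function written for illustration and is not part of the usualsuspects package; it simply lower-cases the analysis name, replaces runs of non-alphanumeric characters with hyphens, and adds the `us-` prefix.

```r
# Hypothetical helper: build a project name following the us-<analysis> convention.
us_project_name <- function(analysis_name) {
  slug <- tolower(analysis_name)
  # Replace any run of characters other than a-z and 0-9 with a single hyphen.
  slug <- gsub("[^a-z0-9]+", "-", slug)
  # Drop any leading or trailing hyphens left over from punctuation.
  slug <- gsub("^-+|-+$", "", slug)
  paste0("us-", slug)
}

project_name <- us_project_name("Call to Conversation")
print(project_name)
```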

Make New Rmarkdown Document from Template

Once you have established the project, create a new Usual Suspects report by going to “File” -> “New R Markdown Document” -> “From Template” and choosing the usualsuspects template.

Create a new R Markdown document from template

Set the name field to “report” so that the generated folder is called “report”.

Select `usualsuspects` template and rename the name field to `report`

This will generate a new folder with the associated template and supporting files for your usualsuspects report.

`usualsuspects` report template files

Generating the Remainder of the Usual Suspects Project Template

Then in the console write the following and then press enter:

This will build out the template for all other files.

Now Do Your Analysis

Import/Munge

First, add your data from the requester into the data-raw folder. Use the 01-import.R script as a template and do whatever data importing and munging is required. If you need to perform some matching, please verify that 02-match.R writes out a file to disk.
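
As a rough illustration of the import step, the sketch below reads a raw file from data-raw, tidies it, and writes the cleaned object to data. Everything specific here is an assumption: the file name `participants.csv`, its columns, and the cleaning rules are made up, and the example runs in a temporary folder. Combining with the LSDS is omitted because that interface is not described here.

```r
# Hypothetical project layout inside the session's temporary directory.
project_root <- file.path(tempdir(), "us-import-example")
raw_dir <- file.path(project_root, "data-raw")
data_dir <- file.path(project_root, "data")
dir.create(raw_dir, recursive = TRUE, showWarnings = FALSE)
dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)

# Stand-in for the file supplied by the requester (made-up data).
raw_file <- file.path(raw_dir, "participants.csv")
writeLines(
  c("Student ID,Participated", "1001,Yes", "1002,No", "1003,Yes"),
  raw_file
)

# Import: keep the original column names so they can be cleaned explicitly.
raw <- read.csv(raw_file, check.names = FALSE, stringsAsFactors = FALSE)

# Munge: snake_case column names and a logical participation flag.
names(raw) <- tolower(gsub("[^A-Za-z0-9]+", "_", names(raw)))
raw$participated <- raw$participated == "Yes"

# Write the cleaned object to the data folder for the analysis scripts.
clean_file <- file.path(data_dir, "participants.rds")
saveRDS(raw, clean_file)
```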

Check the Analysis

Next, check that the code in the 01-run-analysis.R file models all of the desired outcomes.

Verify the Parameters in the RMarkdown Document

The usualsuspects template utilises a parameterised Rmarkdown report which allows you to set some variables that carry forward for the rest of the report. Below is an example of the parameters used in a report. It includes some information about the names for the treatment and control groups, the language to use when describing the differences between the two groups and some additional features.

---
title: "Review of _SOMETHING_"
subtitle: "Descriptive Statistics of Demographics"
author: "Michael DeWitt _Office of Institutional Research_"
date: "TODAY (Updated: 2019-07-30)"
toc: false
output: bookdown::pdf_document2
bibliography: my_bib.bib
params:
  treated_group_name: "Subscribed"
  control_group_name: "Did Not Subscribe"
  between_language: "those students whose parents/guardians subscribed to the _Daily Deac_ blog and those who did not"
  regression_analysis: "First Year"
  causal: FALSE
  clear_log: TRUE
  draft: FALSE
  nc_region: FALSE
  hdi_level: "95%"
  demographics: TRUE
  regression_analysis: TRUE
header-includes:
   - \usepackage{eso-pic,graphicx,transparent, float}
---
  • treated_group_name: The name to be used for the group of interest
  • control_group_name: The name to be used for the rest of the population that did not attend or receive the intervention
  • between_language: The language used to describe the differences between the two groups
  • regression_analysis: Not used yet
  • causal: If TRUE, this will print a warning about assigning causality
  • clear_log: Should be left as TRUE
  • draft: If TRUE, a DRAFT watermark will be printed on all pages
  • nc_region: If TRUE, students from NC will be broken out into a separate category when Census region of origin is examined
  • hdi_level: Highest Density Interval/Confidence Interval to be considered
  • demographics: TRUE/FALSE for whether to complete the demographic analysis crosstabs
  • regression_analysis: TRUE/FALSE for whether to print the regression analysis
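
Inside a parameterised R Markdown document, these values are available through the `params` list. The sketch below mimics that list with a few of the values shown above and then shows how parameters could be overridden at render time with `rmarkdown::render()`. The path `report/report.Rmd` is an assumption about where the template file lives, and the render call is skipped unless that file exists.

```r
# Stand-in for the `params` list that R Markdown creates from the YAML header.
params <- list(
  treated_group_name = "Subscribed",
  control_group_name = "Did Not Subscribe",
  causal = FALSE,
  draft = FALSE
)

# Parameters are read like any other list element.
comparison <- sprintf(
  "%s vs. %s",
  params$treated_group_name,
  params$control_group_name
)
print(comparison)

# Overriding parameters at render time (assumed file location).
report_file <- file.path("report", "report.Rmd")
if (requireNamespace("rmarkdown", quietly = TRUE) && file.exists(report_file)) {
  # Values supplied here replace the defaults declared in the YAML header.
  rmarkdown::render(report_file, params = list(draft = TRUE))
}
```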

Then make

After most of the code is written, go to the terminal and type make. This runs all of the R scripts in order, generates the R Markdown report as a PDF, saves the results to a log file located at report/log_file.txt, and then moves a draft into the drafts folder.

You can continue to type make into the terminal whenever needed, and only those files you have modified and their dependencies will be re-run.
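
The reason only modified files are re-run is that make compares file timestamps: a target is rebuilt when it is missing or older than something it depends on. The sketch below imitates that check in base R. It illustrates the idea only and is not how the project makefile is written; `is_stale()` and the file names are hypothetical.

```r
# Hypothetical helper: TRUE when the target is missing or older than a dependency.
is_stale <- function(target, deps) {
  if (!file.exists(target)) {
    return(TRUE)
  }
  any(file.mtime(deps) > file.mtime(target))
}

work_dir <- tempfile("us-make-")
dir.create(work_dir)
script <- file.path(work_dir, "01-import.R")
output <- file.path(work_dir, "imported.rds")
writeLines("# import step", script)

# The output has not been built yet, so it counts as stale.
stale_before_build <- is_stale(output, script)

# "Build" the output, and make the script older than it.
saveRDS(1, output)
Sys.setFileTime(script, Sys.time() - 60)
stale_after_build <- is_stale(output, script)

# Simulate editing the script after the build by moving its timestamp forward.
Sys.setFileTime(script, Sys.time() + 60)
stale_after_edit <- is_stale(output, script)

print(c(
  before_build = stale_before_build,
  after_build = stale_after_build,
  after_edit = stale_after_edit
))
```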