Getting Started with the Usual Suspects

library(usualsuspects)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

The Usual Suspects is a relatively standard reporting framework for analyses from the Wake Forest University Office of Institutional Research (OIR). Most programs want to know how students who take, participate in, or enroll in an offering compare with those who do not, cross-tabulated against several key demographic measures. This package seeks to make that analysis as simple as possible by using a common file structure.

Several key packages are needed for the project to work:

project_required_libraries %>% 
  knitr::kable(caption = "Required Libraries")

Required Libraries
library	location
tidyverse	CRAN
tidybayes	CRAN
cowplot	CRAN
fs	CRAN
latex2exp	CRAN
bookdown	CRAN
rmarkdown	CRAN
brms	CRAN
broom	CRAN
kableExtra	CRAN
irtools	WFU-OIR
wfudata	WFU-OIR
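
Before starting a project it can help to confirm which of these packages are already installed. The following is a minimal sketch in base R; `find_missing_libraries()` is a hypothetical helper, not part of usualsuspects, and the small data frame is a hand-typed stand-in that mirrors the `library` and `location` columns of the table above.

```r
# Hypothetical helper: return the rows of `required` whose package
# is not found among the installed packages.
find_missing_libraries <- function(required) {
  installed <- rownames(utils::installed.packages())
  required[!(required$library %in% installed), , drop = FALSE]
}

# Stand-in for a few rows of the required-libraries table
required <- data.frame(
  library  = c("tidyverse", "brms", "irtools"),
  location = c("CRAN", "CRAN", "WFU-OIR"),
  stringsAsFactors = FALSE
)

# Any rows returned still need to be installed from the listed location
find_missing_libraries(required)
```

Packages listed as CRAN can then be installed with `install.packages()`; the WFU-OIR packages come from the office's own source.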

Directory Structure

There is a common file structure for each project that takes the following form:

usual_suspects_project_structure %>% 
  knitr::kable(caption = "File Structure")

File Structure
item_name	item_type	item_purpose
data	directory	Store derived data
data-raw	directory	Store raw data
munge	directory	Cleaning scripts which create the cleaned data objects stored in the data folder
outputs	directory	This is where output figures and models should be placed
report	directory	This is where the reports should be stored
src	directory	This is where analysis scripts should be stored
libs	directory	This is to store any helper functions that you create during the analysis
makefile	makefile	A GNU makefile which builds the analysis
README.Md	markdown file	A README file that describes the project origin and important features

Some of these directories contain sub-folders:

usual_suspects_subfolder_structure %>%
  knitr::kable(caption = "Sub-folder Structure")

Sub-folder Structure
item_name	item_type	parent_folder	item_purpose
models	directory	outputs	Store derived models from the analysis
to-be-reviewed	directory	report	Store files for editing
released	directory	report	Store the released copies of the reports
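
To make the layout concrete, the sketch below creates the same directories and sub-folders in a temporary location using base R. It is an illustration only: in practice the package builds the template for you, and `create_layout()` is a hypothetical name that does not exist in usualsuspects.

```r
# Hypothetical helper: create the directories listed in the tables above.
create_layout <- function(root) {
  top_level <- c("data", "data-raw", "munge", "outputs", "report", "src", "libs")
  sub_folders <- c(
    file.path("outputs", "models"),
    file.path("report", "to-be-reviewed"),
    file.path("report", "released")
  )
  for (d in file.path(root, c(top_level, sub_folders))) {
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }
  # Return the relative paths of everything under the root
  invisible(list.dirs(root, full.names = FALSE, recursive = TRUE))
}

# Build the layout in a throwaway directory and list what was created
root <- file.path(tempdir(), "us-example")
created <- create_layout(root)
print(created)
```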

Additionally, each project contains a standard set of scripts, each with a set purpose:

usual_suspects_analysis_files %>%
  knitr::kable(caption = "Analysis Files")

Analysis Files
item_name	parent_folder	item_purpose
01-import.R	munge	Imports data from the stakeholder and combines it with the LSDS
02-match.R	munge	Runs a matching algorithm for causal inference if desired
01-run-analysis.R	src	Runs several regression analyses and saves the coefficients
02-clean-analysis.R	src	Cleans up the regression results for graphing and the final report

Making it Happen

Create A Project in a New Directory

First, create a new R project using the usual methods. I typically name the project us-name-of-analysis. For example, a project for the Art department would be named us-arts, and a project for the Call to Conversation would be named us-call-to-conversation. This way you can quickly see which projects use the usualsuspects template.
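
The naming convention can be applied mechanically. The helper below is a small sketch; `make_project_name()` is a hypothetical function written for this example and is not provided by usualsuspects.

```r
# Hypothetical helper: turn an analysis name into a "us-" project name.
make_project_name <- function(analysis_name) {
  slug <- tolower(analysis_name)
  slug <- gsub("[^a-z0-9]+", "-", slug)  # runs of other characters become one hyphen
  slug <- gsub("^-+|-+$", "", slug)      # drop leading and trailing hyphens
  paste0("us-", slug)
}

make_project_name("Call to Conversation")
#> [1] "us-call-to-conversation"
make_project_name("Arts")
#> [1] "us-arts"
```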

Make New Rmarkdown Document from Template

Once the project exists, create a new Usual Suspects report by going to “File” -> “New R Markdown Document” -> “From Template” and choosing the usual suspects template.

Create a new R Markdown document from template

Set the name to “report”, which will become the name of the new folder.

Select usualsuspects template and rename the name field to report

This will generate a new folder with the associated template and supporting files for your usualsuspects report.

usualsuspects report template files

Generating the Remainder of the Usual Suspects Project Template

Then run the following in the console:

library(usualsuspects)
make_us_project_templates()

This will build out the template for all other files.

Now Do Your Analysis

Import/Munge

First, add the data supplied by the requester to the data-raw folder. Use the 01-import.R script as a template for whatever data importing and munging is required. If you need to perform matching, verify that 02-match.R writes its output to disk.
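
As a rough illustration of what an import step does, the sketch below reads a requester file from data-raw, joins it to institutional records, and saves the result to data. Everything specific here is an assumption made for the example: the file names, the `student_id` and `participated` columns, and the small `lsds` data frame standing in for the LSDS. The real 01-import.R template may look quite different.

```r
# Work in a throwaway project directory so the sketch is self-contained
project <- file.path(tempdir(), "us-import-example")
dir.create(file.path(project, "data-raw"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(project, "data"), showWarnings = FALSE)

# Stand-in for the file supplied by the requester (hypothetical columns)
raw_path <- file.path(project, "data-raw", "participants.csv")
write.csv(
  data.frame(student_id = c(1L, 3L), participated = c(1L, 1L)),
  raw_path,
  row.names = FALSE
)

# Stand-in for the institutional records (hypothetical columns)
lsds <- data.frame(
  student_id = 1:4,
  gender = c("F", "M", "F", "M"),
  stringsAsFactors = FALSE
)

# Import, keep every student, and mark non-participants with 0
participants <- read.csv(raw_path)
combined <- merge(lsds, participants, by = "student_id", all.x = TRUE)
combined$participated[is.na(combined$participated)] <- 0L

# Store the derived data where later scripts can find it
out_path <- file.path(project, "data", "analysis_data.rds")
saveRDS(combined, out_path)
```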

Check the Analysis

Now check that the code in the 01-run-analysis.R file models all of the desired outcomes.
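
One way to make this check explicit is to list the desired outcomes once, fit a model for each, and confirm that every outcome appears in the saved coefficients. The sketch below uses simulated data and plain `lm()` as a lightweight stand-in; the outcome names, the `participated` column, and the choice of model are assumptions for illustration and are not taken from the 01-run-analysis.R template.

```r
set.seed(1)

# Simulated stand-in for the cleaned analysis data
analysis_data <- data.frame(
  participated = rep(c(0, 1), each = 20),
  gpa = rnorm(40, mean = 3.2, sd = 0.4),
  credits = rnorm(40, mean = 15, sd = 2)
)

# The outcomes the requester asked about (hypothetical names)
outcomes <- c("gpa", "credits")

# Fit one regression per outcome
fits <- lapply(outcomes, function(outcome) {
  lm(reformulate("participated", response = outcome), data = analysis_data)
})
names(fits) <- outcomes

# Collect the coefficients into one long table
coefficients <- do.call(rbind, lapply(outcomes, function(outcome) {
  est <- coef(summary(fits[[outcome]]))
  data.frame(
    outcome = outcome,
    term = rownames(est),
    estimate = est[, "Estimate"],
    std_error = est[, "Std. Error"],
    row.names = NULL
  )
}))

# Fail loudly if any desired outcome was not modelled
stopifnot(setequal(unique(coefficients$outcome), outcomes))
```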

Verify the Parameters in the RMarkdown Document

The usualsuspects template utilises a parameterised Rmarkdown report which allows you to set some variables that carry forward for the rest of the report. Below is an example of the parameters used in a report. It includes some information about the names for the treatment and control groups, the language to use when describing the differences between the two groups and some additional features.

---
title: "Review of _SOMETHING_"
subtitle: "Descriptive Statistics of Demographics"
author: "Michael DeWitt _Office of Institutional Research_"
date: "TODAY (Updated: 2019-07-30)"
toc: false
output: bookdown::pdf_document2
bibliography: my_bib.bib
params:
  treated_group_name: "Subscribed"
  control_group_name: "Did Not Subscribe"
  between_language: "those students whose parents/guardians subscribed to the _Daily Deac_ blog and those who did not"
  # regression_analysis: "First Year"  # not used yet; the key is set again below
  causal: FALSE
  clear_log: TRUE
  draft: FALSE
  nc_region: FALSE
  hdi_level: "95%"
  demographics: TRUE
  regression_analysis: TRUE
header-includes:
   - \usepackage{eso-pic,graphicx,transparent, float}
---

treated_group_name: The name to be used for the group of interest
control_group_name: The name to be used for the rest of the population that did not attend or receive the intervention
between_language: The language used to describe the differences between the two groups
regression_analysis: Not used yet
causal: If TRUE, a warning about assigning causality will be printed
clear_log: Should be left as TRUE
draft: If TRUE, a DRAFT watermark will be printed on all pages
nc_region: If TRUE, students from NC will be broken out into a separate category when Census region of origin is examined
hdi_level: The level of the highest density interval/confidence interval to report
demographics: TRUE/FALSE; whether to complete the demographic analysis crosstabs
regression_analysis: TRUE/FALSE; whether to print the regression analysis
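
Inside a parameterised report these values are available as elements of a list named `params`. The sketch below shows how such values might be used in a code chunk. The `params` list is built by hand here only so the example runs on its own, and the sentence wording and variable names are illustrative rather than copied from the template.

```r
# In a knitted report, `params` is created by rmarkdown from the YAML header.
# It is defined by hand here only so the sketch runs outside a report.
params <- list(
  treated_group_name = "Subscribed",
  control_group_name = "Did Not Subscribe",
  draft = FALSE,
  hdi_level = "95%"
)

# Use the group names when writing report text
comparison <- sprintf(
  "Comparing the '%s' group with the '%s' group.",
  params$treated_group_name,
  params$control_group_name
)
print(comparison)

# Convert the interval label into a proportion for use in calculations
interval_width <- as.numeric(sub("%", "", params$hdi_level, fixed = TRUE)) / 100

# Logical parameters can switch parts of the report on or off
show_watermark <- isTRUE(params$draft)
```

The defaults in the YAML header can also be overridden at render time by passing a `params` list to `rmarkdown::render()`, for example `params = list(draft = TRUE)`.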

Then `make`

After most of the code is written, go to the terminal and type make. This runs all of the R scripts in order, renders the R Markdown report as a PDF, writes the results to a log file at report/log_file.txt, and then moves a draft into the drafts folder.

You can run make again in the terminal whenever needed; only the files you have modified, and the files that depend on them, will be re-run.
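
The reason make can skip work is that it compares file modification times: a target is rebuilt when it is missing or when any file it depends on is newer. The sketch below reproduces that rule in base R for illustration only. `needs_rebuild()` is a hypothetical helper, the file names are examples, and none of this is part of the project's makefile.

```r
# Hypothetical helper mirroring make's timestamp rule
needs_rebuild <- function(target, dependencies) {
  if (!file.exists(target)) {
    return(TRUE)
  }
  any(file.mtime(dependencies) > file.mtime(target))
}

# Set up a script and its output in a throwaway directory
demo_dir <- tempfile("make-demo")
dir.create(demo_dir)
script <- file.path(demo_dir, "01-import.R")
output <- file.path(demo_dir, "analysis_data.rds")
file.create(script)
file.create(output)

# Output is newer than the script: nothing to do
Sys.setFileTime(script, Sys.time() - 60)
Sys.setFileTime(output, Sys.time())
up_to_date <- needs_rebuild(output, script)
print(up_to_date)
#> [1] FALSE

# The script is edited after the output was written: rebuild
Sys.setFileTime(script, Sys.time() + 60)
after_edit <- needs_rebuild(output, script)
print(after_edit)
#> [1] TRUE
```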