PS05: Linear Regression — One Categorical Predictor
Overview
Practice linear regression with a single categorical predictor — including visualisation, model fitting, interpretation of intercepts and dummy variable coefficients, and residual analysis — using data on hate crimes recorded across US states in the 10 days after the 2016 presidential election.
Before attempting this problem set, read Chapter 6 of ModernDive and, for context, the FiveThirtyEight article "Higher Rates of Hate Crimes Are Tied to Income Inequality".
Download
Download the problem set template, open it in RStudio, and complete the exercises directly in the document.
Setup
Run this at the top of your document to install and load the required packages:
if (!require(pacman)) install.packages("pacman")
pacman::p_load(ggplot2, dplyr, moderndive, readr)
Exercises
Part 1: Hate crimes and Trump support
Model hate crimes using a three-level categorical predictor.
Part 2: Hate crimes and unemployment
Model hate crimes using a two-level categorical predictor.
Part 3: Hate crimes and household income
Model hate crimes using median household income as a categorical predictor.
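As a warm-up for the interpretation questions, here is a minimal sketch of how lm() treats a categorical predictor. It uses a small made-up data frame (toy, with the hypothetical columns group and y), not the hate crimes data, and base R only. The idea is that the intercept equals the mean of the baseline group, and each dummy variable coefficient equals the difference between that group's mean and the baseline mean.

```r
# Hypothetical toy data (NOT the problem-set data): nine observations
# in three groups, with "low" set as the first (baseline) level.
toy <- data.frame(
  group = factor(rep(c("low", "medium", "high"), each = 3),
                 levels = c("low", "medium", "high")),
  y = c(1, 2, 3, 4, 5, 6, 8, 9, 10)
)

# Group means, for comparison with the model coefficients.
group_means <- tapply(toy$y, toy$group, mean)

# lm() converts the factor into dummy variables; the first level
# ("low") is the baseline absorbed into the intercept.
model <- lm(y ~ group, data = toy)
coefs <- coef(model)

print(group_means)  # low = 2, medium = 5, high = 9
print(coefs)        # (Intercept) = 2, groupmedium = 3, grouphigh = 7

# Residuals are observed minus fitted values. With one categorical
# predictor, each fitted value is simply that observation's group mean.
res <- residuals(model)
print(res)
```

In the problem set itself, get_regression_table() and get_regression_points() from moderndive present the same coefficients and residuals in tidy data frames. Which level acts as the baseline depends on the factor's level order, so check it before interpreting the intercept.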
Saving your plots
Save any plots you create to the figures/ folder using ggsave(). Use descriptive file names that reflect the content of the plot:
trump_plot <- ggplot(data = hate_crimes, aes(x = trump_support, y = hate_crimes)) +
  geom_boxplot()
ggsave("figures/hate-crimes-by-trump-support.png", plot = trump_plot,
       width = 16/2, height = 9/2)
When you are done, render to HTML and submit on Moodle. Name your file PS05_yourname.html.