In this question, you will work with data from Card and Krueger (1994). David Card and Alan Krueger collected information about fast-food restaurants in New Jersey (NJ) and eastern Pennsylvania (PA) during two rounds of interviews. The first interview wave was in March 1992, just before New Jersey raised its minimum wage from $4.25 to $5.05 per hour in April 1992. The second round of interviews was in November and December of 1992, about seven months after the new minimum wage took effect in New Jersey. During that period, the minimum wage in Pennsylvania remained at the federal level of $4.25. The authors evaluated the impact of the new minimum wage law by analyzing this data.
Some columns have the same name but end with “2”. That means those values correspond to the second round of interviews (after the minimum wage raise).
Variable | Definition |
---|---|
sheet | Restaurant id |
chain | 1 if Burger King, 2 if KFC, 3 if Roy Rogers, and 4 if Wendy’s |
co_owned | 1 if restaurant is company-owned |
state | 1 if New Jersey, 0 if Pennsylvania |
wage_st | Starting wage (Dollar/hour) in March |
wage_st2 | Starting wage (Dollar/hour) in Nov/Dec |
fte | Full time equivalent employment in March |
fte2 | Full time equivalent employment in Nov/Dec |
Download the dataset here and answer the following:
How many Burger King, KFC, Roy Rogers, Wendy’s, and company-owned stores were located in NJ and PA?
Calculate four averages: the average starting wages (`wage_st`) in NJ and PA before and after the new minimum wage law in NJ. What is the difference-in-differences estimate of the impact of the minimum wage increase on fast-food starting wages in NJ?
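A minimal sketch of the four-means calculation in R, assuming a wide data frame with the `state`, `wage_st`, and `wage_st2` columns from the table above. The data frame `toy` and the helper `did_means()` are hypothetical, and the numbers are made up, so the output says nothing about the real answer.

```r
# Toy data using the column names from the variable table; values are invented.
toy <- data.frame(
  state    = c(1, 1, 1, 0, 0, 0),                     # 1 = NJ, 0 = PA
  wage_st  = c(4.50, 4.60, 4.70, 4.60, 4.70, 4.50),   # March
  wage_st2 = c(5.05, 5.10, 5.15, 4.60, 4.65, 4.55)    # Nov/Dec
)

# Hypothetical helper: four group-by-period means and their double difference.
did_means <- function(data, before, after, group = "state") {
  treated <- data[[group]] == 1
  means <- c(
    treated_before = mean(data[[before]][treated],  na.rm = TRUE),
    treated_after  = mean(data[[after]][treated],   na.rm = TRUE),
    control_before = mean(data[[before]][!treated], na.rm = TRUE),
    control_after  = mean(data[[after]][!treated],  na.rm = TRUE)
  )
  # (change in treated group) minus (change in control group)
  did <- (means[["treated_after"]] - means[["treated_before"]]) -
         (means[["control_after"]] - means[["control_before"]])
  list(means = means, did = did)
}

res <- did_means(toy, "wage_st", "wage_st2")
print(res$means)
print(res$did)
```

The same helper can be pointed at `fte` and `fte2` by changing the `before` and `after` arguments.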
Repeat the same exercise as in (b) for full-time equivalent employment (`fte`). What is the impact of the new minimum wage law on employment in NJ fast-food restaurants?
Download the panel data version of this dataset here. Then, run the following regression DiD model
\[Y_{ist}=\alpha+\beta Dint_{st} + \lambda POST_{t} + \gamma TREAT_{s}+\varepsilon_{ist}\] using wages and full-time equivalent employment as outcomes. How do those estimates compare to the results in (b) and (c)?
Hint: you need to create the variables Dint, POST, and TREAT.
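One way the three variables could be built and the model estimated is sketched below. The long-format data frame `panel` and its column names `id`, `wave`, and `wage` are assumptions for illustration (the real panel file may use different names), and the values are made up.

```r
# Hypothetical long panel: one row per restaurant and interview wave.
panel <- data.frame(
  id    = rep(1:6, times = 2),
  state = rep(c(1, 1, 1, 0, 0, 0), times = 2),   # 1 = NJ, 0 = PA
  wave  = rep(c(1, 2), each = 6),                # 1 = March, 2 = Nov/Dec
  wage  = c(4.50, 4.60, 4.70, 4.60, 4.70, 4.50,
            5.05, 5.10, 5.15, 4.60, 4.65, 4.55)
)

panel$TREAT <- as.integer(panel$state == 1)   # treated state dummy
panel$POST  <- as.integer(panel$wave == 2)    # post-period dummy
panel$Dint  <- panel$TREAT * panel$POST       # interaction = DiD term

fit <- lm(wage ~ Dint + POST + TREAT, data = panel)
beta_hat <- unname(coef(fit)["Dint"])
print(coef(fit))
```

With only these three dummies the model is saturated, so on a balanced panel the coefficient on `Dint` should reproduce the difference of the four group means.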
A nice feature of regression DiD is that you can control for other factors. For instance, you might want to add covariates such as chains (`chain`) and a dummy to capture whether the restaurant is company-owned (`co_owned`). Repeat the models you used in (d), adding those covariates. Do your results change much when adding restaurant-specific covariates? Do you have an explanation for that?
An alternative to comparing NJ and PA restaurants is to compare restaurants within NJ that paid high and low starting wages before the minimum wage increase. Restrict the sample to NJ and identify restaurants paying starting wages above and below 5 dollars/hour (i.e., create a dummy that equals 1 if the restaurant pays a starting wage below 5 dollars). Then, compare employment and wages before and after the new minimum wage law between restaurants above and below the $5 threshold. What is the relative impact of the minimum wage on employment within NJ? How do the within-NJ estimates compare to those obtained in part (d)?
Hint: Use the first dataset for (f) and (g)
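The within-NJ comparison can be sketched as follows, assuming the wide data frame with the columns from the variable table. The dummy name `low_wage` and the helper `did_change()` are hypothetical, and the toy values are invented.

```r
# Toy wide data; only the column names come from the variable table.
toy <- data.frame(
  state    = c(1, 1, 1, 1, 0, 0),
  wage_st  = c(4.25, 4.50, 5.00, 5.25, 4.25, 5.00),
  wage_st2 = c(5.05, 5.05, 5.05, 5.30, 4.25, 5.00),
  fte      = c(20, 18, 22, 24, 19, 21),
  fte2     = c(21, 18, 21, 23, 18, 21)
)

nj <- toy[toy$state == 1, ]                  # keep New Jersey only
nj$low_wage <- as.integer(nj$wage_st < 5)    # 1 if March starting wage below $5

# Hypothetical helper: mean change in the group == 1 minus mean change in group == 0.
# Restaurants missing either wave drop out of the per-restaurant change.
did_change <- function(data, before, after, group) {
  change <- data[[after]] - data[[before]]
  mean(change[data[[group]] == 1], na.rm = TRUE) -
    mean(change[data[[group]] == 0], na.rm = TRUE)
}

did_wage <- did_change(nj, "wage_st", "wage_st2", "low_wage")
did_fte  <- did_change(nj, "fte", "fte2", "low_wage")
print(c(wage = did_wage, fte = did_fte))
```

Here a restaurant paying exactly $5.00 is placed in the high-wage group; that boundary choice is an assumption the exercise leaves open.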
Common sense suggests that more police on the streets reduces criminal activity. The standard model of the economics of crime makes the same prediction whenever more policing raises the probability of apprehension perceived by offenders (Becker 1968). Nevertheless, to establish a causal effect of police on crime, you need to break the circle of reverse causality: more crime also leads to more police on the streets. One way to do it is through natural experiments.
Draca, Machin and Witt (2011) investigated the police intervention that followed the terrorist attacks on London in July 2005, called “Operation Theseus.” In the six weeks after the attacks, some boroughs in central London experienced a 34% increase in policing. While policing was intensified, the number of crimes dropped significantly in these heavily monitored areas. Their results indicate that a 10% increase in police presence reduces crime by 3 to 4%, on average.
Download part of their data here and answer the following:
Variable | Definition |
---|---|
ocu | Borough id |
week | Weeks from January 1, 2004, to December 31, 2005 |
tot | Total number of crimes |
h | Total policing hours |
pop | Borough’s population |
crimepop | Total number of crimes per 1,000 people |
hpop | Policing hours per 1,000 people |
To answer most of the questions, you will work with a subset of the available data. In particular, the focus is on the weeks between July 8, 2004 and August 19, 2004 (one year before the terrorist attack, the pre-treatment period) and between July 7, 2005 and August 18, 2005 (the treatment period). Filter the data for `weeks>=28 & weeks<=33` (before) and `weeks>=80 & weeks<=85` (after).
Define the treatment and control groups by creating the dummy `treatment` that takes on 1 if `ocu==1|ocu==2|ocu==3|ocu==6|ocu==14` (these are the treated boroughs located in central London) and zero otherwise. Also, define the dummy `post` that takes on 1 if `weeks>=80` and zero otherwise.
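The filtering and the two dummies described above can be sketched as below. The stand-in data frame `toy` is made up (16 boroughs, 104 weeks), and the week column is assumed to be called `week` as in the variable table; if your file names it `weeks`, adjust accordingly.

```r
# Made-up borough-by-week grid standing in for the real file.
toy <- expand.grid(ocu = 1:16, week = 1:104)

# Keep the six pre-treatment weeks and the six treatment weeks.
keep <- (toy$week >= 28 & toy$week <= 33) | (toy$week >= 80 & toy$week <= 85)
sub  <- toy[keep, ]

# Treated boroughs in central London, as listed in the text.
sub$treatment <- as.integer(sub$ocu %in% c(1, 2, 3, 6, 14))
# Post-period dummy.
sub$post <- as.integer(sub$week >= 80)

# Cross-tabulate to check that every treatment-by-period cell is populated.
counts <- table(treatment = sub$treatment, post = sub$post)
print(counts)
```

Using `%in%` is equivalent to the chain of `==` conditions joined by `|` and is easier to read when the list of boroughs is long.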
What is the difference-in-differences estimate of the impact of the terrorist attack on policing levels (`hpop`)? What is the difference-in-differences estimate of the effects of the terrorist attack on total crime (`crimepop`)?
Using the full dataset, plot the evolution of average policing hours per 1,000 people in the treated (`treatment==1`) and the control (`treatment==0`) boroughs from January 2004 to December 2005. Do the same for average crime per 1,000 people. What can you say about the common trends assumption in this setting?
Hint: average out `hpop` and `crimepop` using `group_by()` and `summarize()`. You want to `group_by()` treatment status and also week.
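Estimate the following two regressions:
\[\log(\text{crimepop}_{it})= \beta \text{Dint}_{it}+ \text{Week}_{t}+\text{Borough}_{i}+ \varepsilon_{it}\]
\[\log(\text{hpop}_{it})= \gamma \text{Dint}_{it}+ \text{Week}_{t}+\text{Borough}_{i}+ \varepsilon_{it}\]
where `Dint` is the interaction between `treatment` (treated boroughs) and `post` (the treatment period), and \(\text{Week}_{t}\) and \(\text{Borough}_{i}\) are week and borough fixed effects.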
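A minimal sketch of how the two regressions above could be run with base R, entering the week and borough effects as `factor()` dummies. The data are simulated without noise from made-up coefficients (`true_beta`, `true_gamma`), purely to check the mechanics; they are not estimates from the paper, and the column name `week` is an assumption.

```r
# Simulated balanced panel: 6 boroughs, 6 pre-treatment and 6 treatment weeks.
toy <- expand.grid(ocu = 1:6, week = c(28:33, 80:85))
toy$treatment <- as.integer(toy$ocu %in% c(1, 2, 3))
toy$post      <- as.integer(toy$week >= 80)
toy$Dint      <- toy$treatment * toy$post

# Made-up effects in log points, used only to generate the toy outcomes.
true_beta  <- -0.10
true_gamma <-  0.30
toy$crimepop <- exp(1 + 0.05 * toy$ocu + 0.001 * toy$week + true_beta  * toy$Dint)
toy$hpop     <- exp(4 + 0.02 * toy$ocu + 0.002 * toy$week + true_gamma * toy$Dint)

# Week and borough dummies absorb `post` and `treatment`, so only Dint is entered.
fit_crime  <- lm(log(crimepop) ~ Dint + factor(week) + factor(ocu), data = toy)
fit_police <- lm(log(hpop)     ~ Dint + factor(week) + factor(ocu), data = toy)

beta_hat  <- unname(coef(fit_crime)["Dint"])
gamma_hat <- unname(coef(fit_police)["Dint"])
print(c(beta = beta_hat, gamma = gamma_hat))
```

Because the simulated outcomes contain no noise, the two `Dint` coefficients come back equal to the values used to generate the data; with real data they would be estimates with standard errors.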
Time to calculate the police-crime elasticity \(\eta\) and interpret the results. \(\beta\) represents the average percentage decrease in crimes, and \(\gamma\) represents the average percentage increase in policing hours during the six high-alert weeks in London. Get \(\eta\) by dividing \(\beta\) by \(\gamma\), that is, \(\eta=\frac{\beta}{\gamma}\). Then, interpret this result.
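As a worked illustration of the ratio, using a made-up value rather than an estimate: because both regressions are in logs, \(\beta\) and \(\gamma\) are approximately percentage changes, so their ratio reads as the percentage change in crime per one percent change in policing.

\[\eta=\frac{\beta}{\gamma}\approx\frac{\%\Delta\,\text{crimepop}}{\%\Delta\,\text{hpop}}\]

If, hypothetically, \(\eta=-0.3\), a 10% increase in policing hours would go with a \(0.3\times 10=3\%\) fall in crime, which is the lower end of the 3 to 4% range quoted above.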
Can you think of any issue that could affect the identification of the causal effect of police on crime in their setting?