“Education production” is a topic much explored by economists. The terminology reflects the idea that school environment features are inputs that cost money, while student learning is the output that schools produce. A major question in the field is which inputs have the highest benefit/cost ratio, and one very costly input is class size: reducing class size most likely requires hiring more teachers. An important experiment conducted in Tennessee was designed to answer precisely the question “Does class size affect student performance?”
Krueger (1999) analyzed Project STAR, a longitudinal study that randomly assigned kindergarten students and their teachers to one of three groups beginning in the 1985–1986 school year. The three groups were small classes (13–17 students per teacher), regular-size classes (22–25 students), and regular/aide classes (22–25 students) that also included a full-time teacher’s aide. After their initial assignment, the design called for students to remain in the same class type for four years. Around 6,000–7,000 students were involved in the project each year. You can find the part of the sample related to students who entered STAR in kindergarten here, and use it to answer the following questions.
Create the dummy variables Free_lunch (equal to 1 if lunch is “free”), White_asian (equal to 1 if ethnicity is either “cauc” or “asian”), and Female (equal to 1 if gender is “female”). Also, define the variable age as 1986-birth, i.e., compute the age of the children in 1986.
The first question to ask about a randomized experiment is whether the randomization successfully balanced the subjects’ characteristics across the different groups. Although the STAR data do not include any pretreatment test scores, we can look at characteristics of students such as race, gender, age, and free lunch status, which is a good proxy for family income since only poor children qualify for free school lunch. Compare the values of Free_lunch, White_asian, Female, and age across the three groups small, regular, and regular+aide. Are those variables balanced?
One way to get the causal effect of interest is to run a regression of the outcome score (in percentage points) on the treatment classtype. Run a regression of score on classtype using robust standard errors and explain the results.
Hint: Use feols() from the fixest package, setting se="hetero".
setwd("D:/OneDrive - University of Illinois - Urbana/Causal Inference/ECON 474 Spring 2022/HW/HW2")
library(tidyverse)
<-readRDS("STARkinder.RDS")
star
## Dummy Free lunch
$Free_lunch<-ifelse(star$lunch=="free",1,0)
star$White_asian<-ifelse(star$ethnicity=="cauc"|star$ethnicity=="asian",1,0)
star$Female<-ifelse(star$gender=="female",1,0)
star## Got an error using 1986-star$birth
$age=-(star$birth-1986) star
%>%group_by(classtype)%>%summarize(`Free lunch`=mean(Free_lunch),
star`White/Asian`=mean(White_asian),
Female=mean(Female),
Age=mean(age))
## # A tibble: 3 x 5
## classtype `Free lunch` `White/Asian` Female Age
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 regular 0.479 0.678 0.487 5.88
## 2 regular+aide 0.499 0.663 0.486 5.89
## 3 small 0.473 0.685 0.486 5.89
One option is to split the dataset into two: one with regular and small classes, and the other with regular and regular+aide classes. Then it is straightforward to apply the function t.test().
<-star%>%filter(classtype=="regular"|classtype=="small")
reg_small
lapply(reg_small[,c('Free_lunch', 'White_asian', 'Female', 'age')],
function(x) t.test(x ~ reg_small$classtype))
## $Free_lunch
##
## Welch Two Sample t-test
##
## data: x by reg_small$classtype
## t = 0.36147, df = 3642.1, p-value = 0.7178
## alternative hypothesis: true difference in means between group regular and group small is not equal to 0
## 95 percent confidence interval:
## -0.02629397 0.03818111
## sample estimates:
## mean in group regular mean in group small
## 0.4785028 0.4725592
##
##
## $White_asian
##
## Welch Two Sample t-test
##
## data: x by reg_small$classtype
## t = -0.4469, df = 3646.9, p-value = 0.655
## alternative hypothesis: true difference in means between group regular and group small is not equal to 0
## 95 percent confidence interval:
## -0.03691611 0.02321084
## sample estimates:
## mean in group regular mean in group small
## 0.6783005 0.6851531
##
##
## $Female
##
## Welch Two Sample t-test
##
## data: x by reg_small$classtype
## t = 0.010441, df = 3641.5, p-value = 0.9917
## alternative hypothesis: true difference in means between group regular and group small is not equal to 0
## 95 percent confidence interval:
## -0.03209383 0.03243747
## sample estimates:
## mean in group regular mean in group small
## 0.4865959 0.4864240
##
##
## $age
##
## Welch Two Sample t-test
##
## data: x by reg_small$classtype
## t = -0.72547, df = 3637.4, p-value = 0.4682
## alternative hypothesis: true difference in means between group regular and group small is not equal to 0
## 95 percent confidence interval:
## -0.03062459 0.01408210
## sample estimates:
## mean in group regular mean in group small
## 5.884421 5.892692
<-star%>%filter(classtype=="regular"|classtype=="regular+aide")
reg_regaide
lapply(reg_regaide[,c('Free_lunch', 'White_asian', 'Female', 'age')],
function(x) t.test(x ~ reg_regaide$classtype))
## $Free_lunch
##
## Welch Two Sample t-test
##
## data: x by reg_regaide$classtype
## t = -1.3114, df = 3988.7, p-value = 0.1898
## alternative hypothesis: true difference in means between group regular and group regular+aide is not equal to 0
## 95 percent confidence interval:
## -0.05177769 0.01027209
## sample estimates:
## mean in group regular mean in group regular+aide
## 0.4785028 0.4992556
##
##
## $White_asian
##
## Welch Two Sample t-test
##
## data: x by reg_regaide$classtype
## t = 1.0597, df = 3989.8, p-value = 0.2893
## alternative hypothesis: true difference in means between group regular and group regular+aide is not equal to 0
## 95 percent confidence interval:
## -0.01340565 0.04494453
## sample estimates:
## mean in group regular mean in group regular+aide
## 0.6783005 0.6625310
##
##
## $Female
##
## Welch Two Sample t-test
##
## data: x by reg_regaide$classtype
## t = 0.015386, df = 3988.6, p-value = 0.9877
## alternative hypothesis: true difference in means between group regular and group regular+aide is not equal to 0
## 95 percent confidence interval:
## -0.03078453 0.03127152
## sample estimates:
## mean in group regular mean in group regular+aide
## 0.4865959 0.4863524
##
##
## $age
##
## Welch Two Sample t-test
##
## data: x by reg_regaide$classtype
## t = -0.4695, df = 3989.5, p-value = 0.6387
## alternative hypothesis: true difference in means between group regular and group regular+aide is not equal to 0
## 95 percent confidence interval:
## -0.02669355 0.01637890
## sample estimates:
## mean in group regular mean in group regular+aide
## 5.884421 5.889578
The other option is to perform an ANOVA and compare all three means at once:
lapply(star[,c('Free_lunch', 'White_asian', 'Female', 'age')], function(x) anova(lm(x ~ star$classtype)))
## $Free_lunch
## Analysis of Variance Table
##
## Response: x
## Df Sum Sq Mean Sq F value Pr(>F)
## star$classtype 2 0.76 0.37764 1.5121 0.2205
## Residuals 5720 1428.53 0.24974
##
## $White_asian
## Analysis of Variance Table
##
## Response: x
## Df Sum Sq Mean Sq F value Pr(>F)
## star$classtype 2 0.51 0.25654 1.1689 0.3108
## Residuals 5720 1255.33 0.21946
##
## $Female
## Analysis of Variance Table
##
## Response: x
## Df Sum Sq Mean Sq F value Pr(>F)
## star$classtype 2 0.0 0.000031 1e-04 0.9999
## Residuals 5720 1429.7 0.249948
##
## $age
## Analysis of Variance Table
##
## Response: x
## Df Sum Sq Mean Sq F value Pr(>F)
## star$classtype 2 0.07 0.032667 0.2712 0.7624
## Residuals 5720 688.90 0.120437
These tests lead to the same conclusion: student characteristics are balanced across the treatment and control groups.
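As a cross-check (not required by the question), the same balance comparison can be run as regressions of each covariate on classtype: if randomization worked, the class-type dummies should be small and statistically insignificant. A minimal sketch using the same data:

library(fixest)
## Sketch: regress each covariate on classtype with robust SEs;
## the class-type dummies estimate the group differences tested above
balance <- lapply(c("Free_lunch", "White_asian", "Female", "age"),
                  function(v) feols(as.formula(paste(v, "~ classtype")), se = "hetero", data = star))
do.call(etable, balance)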
library(fixest)
<-feols(score~classtype, se="hetero", data=star)
reg1etable(reg1)
## reg1
## Dependent Var.: score
##
## (Intercept) 51.44*** (0.6182)
## classtyperegular+aide -0.1900 (0.8491)
## classtypesmall 4.835*** (0.9046)
## _____________________ _________________
## S.E. type Heteroskeda.-rob.
## Observations 5,723
## R2 0.00699
## Adj. R2 0.00665
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The STAR experiment shows that attending a small class raises test scores by about 4.8 percentage points. Also, students assigned to regular+aide classes did not experience any improvement in scores.
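As a rough sense of magnitude (our own back-of-envelope, not part of the assignment), the small-class effect can be compared with the regular-class average, which equals the intercept since regular is the omitted category:

## 4.835-point gain relative to the regular-class mean of 51.44 points
4.835 / 51.44   # roughly 0.09, i.e., about a 9% improvement over the regular-class average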
Schools with at least three classes in each grade could choose to enroll in the experiment. Hence, students were randomized but schools were not, and that might be a source of bias. One way to address this is to control for schoolid. Run a regression of the outcome score (in percentage points) on the treatment classtype, controlling for schoolid and using robust standard errors. How do these results compare to what you got in (c)?
Run a regression of the outcome score (in percentage points) on the treatment classtype, controlling for schoolid, Free_lunch, White_asian, Female, and experience. Use robust standard errors. Did the estimates related to small and regular+aide change when you added covariates? Why/why not?
Do your new results represent the causal effect of class size on students’ scores? Why/Why not?
The coefficient on classtype small is now bigger, and its standard error is smaller.
<-feols(score~classtype, se="hetero", data=star)
reg1<-feols(score~classtype|schoolid, se="hetero", data=star)
reg2etable(reg1, reg2, signifCode = c("***"=0.01, "**"=0.05, "*"=0.10))
## reg1 reg2
## Dependent Var.: score score
##
## (Intercept) 51.44*** (0.6182)
## classtyperegular+aide -0.1900 (0.8491) 0.2939 (0.7444)
## classtypesmall 4.835*** (0.9046) 5.593*** (0.8046)
## Fixed-Effects: ----------------- -----------------
## schoolid No Yes
## _____________________ _________________ _________________
## S.E. type Heteroskeda.-rob. Heteroskeda.-rob.
## Observations 5,723 5,723
## R2 0.00699 0.25329
## Within R2 -- 0.01103
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The coefficient on classtype small changed just a bit, and that was expected since the treatment was randomly assigned.
reg3 <- feols(score ~ classtype + Free_lunch + White_asian + Female + experience | schoolid, se = "hetero", data = star)

etable(reg2, reg3, signifCode = c("***" = 0.01, "**" = 0.05, "*" = 0.10))
## reg2 reg3
## Dependent Var.: score score
##
## classtyperegular+aide 0.2939 (0.7444) 0.2069 (0.7104)
## classtypesmall 5.593*** (0.8046) 5.505*** (0.7674)
## Free_lunch -13.23*** (0.7411)
## White_asian 9.480*** (1.279)
## Female 4.732*** (0.6025)
## experience 0.2536*** (0.0588)
## Fixed-Effects: ----------------- ------------------
## schoolid Yes Yes
## _____________________ _________________ __________________
## S.E. type Heteroskeda.-rob. Heteroskedas.-rob.
## Observations 5,723 5,723
## R2 0.25329 0.31807
## Within R2 0.01103 0.09682
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The key assumption for a causal interpretation of the effect of class size on students’ scores in this framework is conditional independence, which is credible here. We know that the researchers randomly assigned students to classes of different sizes. Although the schools were not randomly selected, controlling for school id is enough to make the treatment “as good as random.”
The standard economic theory of criminal behavior frames engagement in illicit activities as the result of a rational choice, a decision that weighs costs and benefits. The straightforward prediction of this theory is that people respond to changes in the expected costs and benefits of committing a crime. For instance, more police on the streets or better police intelligence would increase the probability of arrest \((P_{A})\) faced by potential offenders. The increase in \(P_{A}\) generates a drop in crime rates since it raises the expected cost of committing a crime and lowers its expected utility. This behavioral response of individuals is known as the deterrent effect.
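One common way to formalize this deterrence logic is a Becker-style expected-utility comparison (a textbook sketch; the notation below is ours, not the paper’s): the potential offender commits the crime only if
\[E[U]=P_{A}\,U(W-F)+(1-P_{A})\,U(W+G)\]
exceeds the utility of staying honest, \(U(W)\), where \(W\) is current wealth, \(G\) is the gain from a successful crime, and \(F\) is the punishment if arrested. A higher \(P_{A}\) lowers \(E[U]\) and therefore the incentive to offend.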
In this question, we’ll use part of the dataset from Cornwell and Trumbull (1993). Download the .RDS file here. The authors have data on North Carolina counties from 1981 to 1987, and their goal was to estimate by how much crime would go down with an increase in the probability of arrest \((P_{A})\), i.e., the elasticity of crime with respect to \(P_{A}\).
Import CT1993.RDS and get the average crime rate crmrte and the average probability of arrest prbarr for each year from 1981 to 1987. Is there any clear pattern between those variables over time across North Carolina’s counties? Hint: you want to group_by(year) and then summarize() the data.
Filter the data by year==1987. After that, construct a scatterplot with lcrmrte (log crime per person) on the y-axis and prbarr (probability of arrest) on the x-axis. How do those two variables relate?
Estimate the following model by OLS with robust standard errors using the 1987 data: \[lcrmrte_{i}=\beta_{0}+\beta_{1}prbarr_{i}+v_{i}\] and interpret the results.
Now estimate \[lcrmrte_{i}=\gamma_{0}+\gamma_{1}prbarr_{i}+\gamma_{2}density_{i}+\eta_{i}\] where density is the population density (number of people per square mile). Why are the coefficients on prbarr different in the two regressions?
Finally, estimate \[density_{i}=\theta_{0}+\theta_{1}prbarr_{i}+\varepsilon_{i}\] and show how \(\theta_{1}\) relates to \(\beta_{1}\) and \(\gamma_{1}\).
Although higher values of Probability of Arrest appear to be followed by slightly lower Crime Rate values, there is no strong pattern between those variables over time.
<-readRDS("CT1993.RDS")
ct93
library(tidyverse)
<-ct93%>%
avggroup_by(year)%>%
summarise(`Crime Rate`=mean(crmrte),
`Probability of Arrest`=mean(prbarr))
library(kableExtra)
kbl(avg) %>%
kable_paper(full_width = F)%>%
kable_styling(latex_options = "striped", position = "center", font_size = 20)
year | Crime Rate | Probability of Arrest |
---|---|---|
81 | 0.0327502 | 0.2990574 |
82 | 0.0326482 | 0.3115709 |
83 | 0.0307077 | 0.3281419 |
84 | 0.0294517 | 0.3178034 |
85 | 0.0297594 | 0.3040093 |
86 | 0.0322861 | 0.2957571 |
87 | 0.0335099 | 0.2952375 |
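As a quick numerical complement to the eyeball comparison above (a sketch, not required by the question), one can correlate the seven yearly averages computed in avg:

## Only 7 year-level observations, so read this correlation loosely
with(avg, cor(`Crime Rate`, `Probability of Arrest`))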
Crime rates across North Carolina counties in 1987 appear to be negatively correlated with the probability of arrest.
crime87 <- ct93 %>% filter(year == 87)
library(ggthemes)
ggplot(crime87, aes(x=prbarr, y=lcrmrte)) + geom_point(color="#91b8bd") +
stat_smooth(method = "lm", formula =y~x, se=F, color="#336666") +
scale_x_continuous(name = "Probability of Arrest") +
scale_y_continuous(name = "Log Crime Rates") +
theme_economist(base_size = 17)+
theme(axis.text=element_text(size=12),
axis.title=element_text(size=12,face="bold"))
The interpretation here is tricky. Note that prbarr is a ratio, and the dependent variable is in logs. Hence, if the probability of arrest increases by 0.01, the crime rate per person decreases by approximately 1.9% (0.01 × 1.884 ≈ 0.019 in log points).

An increase of 0.01 in the probability of arrest does not mean a 1% increase in the probability of arrest. To interpret the result as an elasticity, take the log() of prbarr and regress lcrmrte on log(prbarr) (check the results of reg2).
<-feols(lcrmrte~prbarr,se="hetero", data=crime87)
reg1<-feols(lcrmrte~log(prbarr),se="hetero", data=crime87)
reg2etable(reg1,reg2, signifCode = c("***"=0.01, "**"=0.05, "*"=0.10))
## reg1 reg2
## Dependent Var.: lcrmrte lcrmrte
##
## (Intercept) -2.985*** (0.1200) -4.316*** (0.1855)
## prbarr -1.884*** (0.3015)
## log(prbarr) -0.5936*** (0.1460)
## _______________ __________________ ___________________
## S.E. type Heteroskedas.-rob. Heteroskedast.-rob.
## Observations 90 90
## R2 0.22351 0.18988
## Adj. R2 0.21468 0.18068
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
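A quick back-of-envelope check of the interpretations above, using the point estimates printed in the table (a sketch; the numbers are simply typed in by hand):

b_levellog <- -1.884                 # reg1: coefficient on prbarr (outcome in logs)
exp(b_levellog * 0.01) - 1           # about -0.019: a 0.01 increase in prbarr lowers crime by ~1.9%
b_loglog <- -0.5936                  # reg2: coefficient on log(prbarr), roughly the elasticity
b_loglog * 0.01                      # about -0.006: a 1% increase in prbarr lowers crime by ~0.6%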
The coefficients are different because the second regression adds another covariate: prbarr is correlated with population density, and population density affects crime rates.
<-feols(lcrmrte~prbarr+density,se="hetero", data=crime87)
reg2etable(reg1,reg2, signifCode = c("***"=0.01, "**"=0.05, "*"=0.10))
## reg1 reg2
## Dependent Var.: lcrmrte lcrmrte
##
## (Intercept) -2.985*** (0.1200) -3.455*** (0.1417)
## prbarr -1.884*** (0.3015) -1.239*** (0.3517)
## density 0.1944*** (0.0247)
## _______________ __________________ __________________
## S.E. type Heteroskedas.-rob. Heteroskedas.-rob.
## Observations 90 90
## R2 0.22351 0.48721
## Adj. R2 0.21468 0.47543
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
\(\beta_{1}=\gamma_{1}+\text{bias}=\gamma_{1}+\gamma_{2}\times\theta_{1}\;\Rightarrow\;-1.884\approx-1.239+0.1944\times(-3.317)\)
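To see where this decomposition comes from (standard omitted-variable-bias algebra in the section’s notation), substitute the auxiliary regression for density into the long regression:
\[lcrmrte_{i}=(\gamma_{0}+\gamma_{2}\theta_{0})+(\gamma_{1}+\gamma_{2}\theta_{1})\,prbarr_{i}+(\eta_{i}+\gamma_{2}\varepsilon_{i}),\]
so the slope of the short regression is \(\beta_{1}=\gamma_{1}+\gamma_{2}\theta_{1}\), which is exactly what the numbers above verify.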
<-feols(density~prbarr,se="hetero", data=crime87)
reg3etable(reg1,reg2,reg3, signifCode = c("***"=0.01, "**"=0.05, "*"=0.10))
## reg1 reg2 reg3
## Dependent Var.: lcrmrte lcrmrte density
##
## (Intercept) -2.985*** (0.1200) -3.455*** (0.1417) 2.417*** (0.4417)
## prbarr -1.884*** (0.3015) -1.239*** (0.3517) -3.317** (1.156)
## density 0.1944*** (0.0247)
## _______________ __________________ __________________ _________________
## S.E. type Heteroskedas.-rob. Heteroskedas.-rob. Heteroskeda.-rob.
## Observations 90 90 90
## R2 0.22351 0.48721 0.09032
## Adj. R2 0.21468 0.47543 0.07998
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
No! First, there are omitted variables in both regressions: covariates that are correlated with the probability of arrest and that also influence the outcome, crime rates. Second, one needs to deal with the fact that crime rates might also affect the probability of arrest (the simultaneity problem).
In Brazil, there is growing interest in the effect of studying in a Military School versus going to a regular public school. Attributing differences in student achievement to the type of school attended is complicated because students are not randomly allocated to Military or other public institutions. It is important to mention that there are two ways to enroll in a Military school: a student can either pass a highly competitive entrance exam or attend for having a parent in the Army.
The dataset here1 contains (simulated) information about 3,704 students who attended different schools in Fortaleza-CE. At the end of Middle school, students from various institutions take a standardized exam. Your job is to get the causal effect of Military school attendance on students’ scores.
The variables are described below:
Variable | Definition |
---|---|
id | Student’s Identification |
test_score | Standardized Test Scores |
military | 1 if attended a Military school, 0 otherwise |
white | 1 if student is white, 0 otherwise |
income | Student’s family income |
mothers_edu | 1 if mother’s education level is high school or below, 0 if some college or above |
mothers_age | Age of mother |
What is the difference in the average test_score between the treatment group (Military school attendance) and the control group (other public schools)? Is that difference statistically significant? Hint: use the function read_csv() from tidyverse to import the dataset.
Run t-tests to check for balance between treatment and control groups for all the covariates income, white, mothers_edu, and mothers_age. Are those variables balanced?
Based on your answer for b), can you guess the direction of the bias? i.e., do you think that the naive comparison using non-experimental data underestimates or overestimates the causal effect of Military school attendance? Explain.
The naive comparison gives an average treatment effect of 0.1584, i.e., the average score of students who went to military schools (0.299) is higher than the average test score of students who attended regular public schools (0.141). That difference is statistically significant (p-value<0.01).
<-read_csv("military.csv")
data
%>%group_by(military)%>%summarize(outcome=mean(test_score), income=mean(income), white=mean(white), mothers_educ=mean(mothers_edu),mothres_age=mean(mothers_age), obs=n()) data
## # A tibble: 2 x 7
## military outcome income white mothers_educ mothres_age obs
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
## 1 0 0.141 77790. 0.760 0.192 39.4 2920
## 2 1 0.299 89669. 0.796 0.153 40.2 784
t.test(test_score~military, data=data)
##
## Welch Two Sample t-test
##
## data: test_score by military
## t = -4.3489, df = 1304.9, p-value = 1.475e-05
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -0.22988578 -0.08695741
## sample estimates:
## mean in group 0 mean in group 1
## 0.1410667 0.2994883
The socioeconomic/demographic variables are not balanced between groups. On average, mothers in the treatment group are more educated than mothers in the control group (a smaller share have only a high school education or less). Mother’s age, the share of white students, and family income are also higher in the treatment group.
lapply(data[,c('income', 'white', 'mothers_edu', 'mothers_age')],
function(x) t.test(x ~ data$military))
## $income
##
## Welch Two Sample t-test
##
## data: x by data$military
## t = -6.814, df = 1221.9, p-value = 1.487e-11
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -15299.218 -8458.764
## sample estimates:
## mean in group 0 mean in group 1
## 77789.90 89668.89
##
##
## $white
##
## Welch Two Sample t-test
##
## data: x by data$military
## t = -2.1903, df = 1294.4, p-value = 0.02868
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -0.068219396 -0.003754325
## sample estimates:
## mean in group 0 mean in group 1
## 0.7599315 0.7959184
##
##
## $mothers_edu
##
## Welch Two Sample t-test
##
## data: x by data$military
## t = 2.6185, df = 1329.1, p-value = 0.008933
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## 0.00971084 0.06772835
## sample estimates:
## mean in group 0 mean in group 1
## 0.1917808 0.1530612
##
##
## $mothers_age
##
## Welch Two Sample t-test
##
## data: x by data$military
## t = -3.9161, df = 1431, p-value = 9.424e-05
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -1.0972422 -0.3648559
## sample estimates:
## mean in group 0 mean in group 1
## 39.42329 40.15434
I would guess that the naive comparison overestimates the true causal effect of Military school attendance on test scores. Students who perform well on the entrance exam are the ones being treated in the first place, and they are more likely to do better in school anyway. Also, the treatment group has a much higher household income than the control group.
Run a regression of test_score on the covariates income, white, mothers_edu, and mothers_age besides the treatment variable military. Is the effect of Military school attendance different using regression and using a simple difference in means? Why?

Yes! The naive comparison does not account for important differences in the characteristics of students in the treatment and control groups. When you control for all the variables available, you are (most likely) one step closer to an apples-to-apples comparison: the treatment effect falls by more than 50%.
<-feols(test_score~military, se="hetero",data=data)
reg1<-feols(test_score~military+white+income+mothers_edu+mothers_age, se="hetero",data=data)
reg2
etable(reg1,reg2, signifCode = c("***"=0.01, "**"=0.05))
## reg1 reg2
## Dependent Var.: test_score test_score
##
## (Intercept) 0.1411*** (0.0177) -0.9398*** (0.1216)
## military 0.1584*** (0.0364) 0.0734* (0.0354)
## white 0.2980*** (0.0367)
## income 4.32e-6*** (3.58e-7)
## mothers_edu -0.3154*** (0.0382)
## mothers_age 0.0147*** (0.0030)
## _______________ __________________ ____________________
## S.E. type Heteroskedas.-rob. Heteroskedasti.-rob.
## Observations 3,704 3,704
## R2 0.00471 0.09276
## Adj. R2 0.00444 0.09154
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Run a logit model to get the predicted probabilities of being treated given the observed covariates. Plot the distributions of the propensity score in the treated and control groups. What do you see? Can we rely on the existence of common support?
Restrict the sample by finding the nearest neighbor of each treated student. Then run two regressions (one with and one without covariates) on the matched data. Contrast the results with those from the naive comparison and the regression with the full sample. What assumptions do you rely on to claim that this is the true causal effect of Military school attendance on scores?
There is reasonable overlap between the distributions of propensity scores in the control and treatment groups, so lack of common support is not an issue in this case.
logit_reg <- glm(military ~ income + white + mothers_edu + mothers_age, data = data,
                 family = binomial(link = 'logit'))

probs <- data.frame(prop_score = predict(logit_reg, type = "response"),
                    review = logit_reg$model$military)

library(patchwork)
hist1 <- probs %>%
  filter(review == 0) %>%
  ggplot() +
  geom_histogram(aes(x = prop_score)) +
  xlab("Prob. of going to Military School: control") +
  theme_bw()

hist2 <- probs %>%
  filter(review == 1) %>%
  ggplot() +
  geom_histogram(aes(x = prop_score)) +
  xlab("Prob. of going to Military School: treated") +
  theme_bw()

hist1 + hist2
Using MatchIt to find the nearest neighbor for each treated student, you get a dataset with 1,568 observations: one student in the control group for each of the 784 students in the treatment group.
library(MatchIt)
matching <- matchit(military ~ income + white + mothers_edu + mothers_age,
                    data = data, method = "nearest",
                    distance = "logit")

m_data <- match.data(matching)
dim(m_data)
## [1] 1568 10
The matched sample is now balanced on all variables:
m_data %>% group_by(military) %>% summarize(outcome = mean(test_score), income = mean(income), white = mean(white),
                                            mothers_educ = mean(mothers_edu), mothres_age = mean(mothers_age), obs = n())
## # A tibble: 2 x 7
## military outcome income white mothers_educ mothres_age obs
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
## 1 0 0.219 89615. 0.796 0.147 40.1 784
## 2 1 0.299 89669. 0.796 0.153 40.2 784
Regression results with and without controls are similar. To claim a causal effect of Military school attendance on scores, one needs to assume conditional independence: the treatment is “as good as random” after conditioning on the variables we observe (income, race, and mother’s education and age).
<-feols(test_score~military, se="hetero",data=m_data)
reg1_NE<-feols(test_score~military+white+income+mothers_edu+mothers_age, se="hetero",data=m_data)
reg2_NE
etable(reg1_NE,reg2_NE, signifCode = c("***"=0.01, "**"=0.05))
## reg1_NE reg2_NE
## Dependent Var.: test_score test_score
##
## (Intercept) 0.2188*** (0.0328) -1.084*** (0.2059)
## military 0.0806. (0.0457) 0.0814. (0.0440)
## white 0.2368*** (0.0563)
## income 3.67e-6*** (5.28e-7)
## mothers_edu -0.3327*** (0.0565)
## mothers_age 0.0208*** (0.0049)
## _______________ __________________ ____________________
## S.E. type Heteroskedas.-rob. Heteroskedasti.-rob.
## Observations 1,568 1,568
## R2 0.00198 0.07748
## Adj. R2 0.00134 0.07452
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Trim the data based on the propensity score, dropping observations with p-score<0.1 and p-score>0.9. Then run the weighted regression using the IPW as weights. Contrast the results with the ones from the nearest-neighbor matching.
Is any of those regressions likely to provide the true causal effect of Military school attendance on student scores? Why/Why not?
Results are similar using IPW with and without controls. Compared to the nearest-neighbor matching estimates, the treatment effect appears to be slightly larger when weighting by the inverse propensity score. On average, test scores in the treatment group are 0.08–0.09 higher than in the control group.
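For reference, the weights constructed in the code below are the standard inverse-probability weights, with \(D_{i}\) the military dummy and \(\hat{p}(X_{i})\) the fitted propensity score from the logit:
\[\hat{w}_{i}=\frac{D_{i}}{\hat{p}(X_{i})}+\frac{1-D_{i}}{1-\hat{p}(X_{i})},\]
so treated students with a low propensity score and control students with a high one receive more weight.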
data$prop_scores <- logit_reg$fitted.values
data <- data %>% filter(prop_scores >= .1 & prop_scores <= .9)  #### there are no prop_scores <0.1 and >0.9

## Generating the IPW
data <- data %>%
  mutate(ipw = (military / prop_scores) + ((1 - military) / (1 - prop_scores)))

reg1_IPW <- feols(test_score ~ military, data = data, se = "hetero", weights = data$ipw)
reg2_IPW <- feols(test_score ~ military + income + white + mothers_edu + mothers_age, data = data, se = "hetero", weights = data$ipw)
etable(reg2_NE, reg1_IPW, reg2_IPW, signifCode = c("***" = 0.01, "**" = 0.05, "*" = 0.10))
## reg2_NE reg1_IPW reg2_IPW
## Dependent Var.: test_score test_score test_score
##
## (Intercept) -1.084*** (0.2059) 0.1607*** (0.0177) -1.071*** (0.1496)
## military 0.0814. (0.0440) 0.0928* (0.0370) 0.0832* (0.0356)
## white 0.2368*** (0.0563) 0.2915*** (0.0447)
## income 3.67e-6*** (5.28e-7) 3.62e-6*** (4.06e-7)
## mothers_edu -0.3327*** (0.0565) -0.3339*** (0.0436)
## mothers_age 0.0208*** (0.0049) 0.0196*** (0.0036)
## _______________ ____________________ __________________ ____________________
## S.E. type Heteroskedasti.-rob. Heteroskedas.-rob. Heteroskedasti.-rob.
## Observations 1,568 3,704 3,704
## R2 0.07748 0.00253 0.08421
## Adj. R2 0.07452 0.00226 0.08297
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Probably not. To claim causality, one needs to defend conditional independence in this setting. However, it is hard to ignore important student characteristics that are not observed in this dataset, such as ability and motivation.
Disclaimer: this is a simulated dataset. The purpose of this question is to improve your understanding of selection on observables. We cannot draw any conclusions about the effect of Military Schools based on this example.↩︎