Rosenbaum and Rubin showed in observational studies, conditioning on propensity scores can lead to unbiased estimates of the exposure effect
\[Pr(exposure = 1) = \frac{1}{1+\exp(-X\beta)}\]
Photo by Anna CC-BY-SA-4.0
Historically, guests who stayed in a Walt Disney World resort hotel were able to access the park during “Extra Magic Hours” during which the park was closed to all other guests.
These extra hours could be in the morning or evening.
The Seven Dwarfs Mine Train is a ride at Walt Disney World’s Magic Kingdom. Typically, each day Magic Kingdom may or may not be selected to have these “Extra Magic Hours”.
We are interested in examining the relationship between whether there were “Extra Magic Hours” in the morning and the average wait time for the Seven Dwarfs Mine Train the same day between 9am and 10am.
Application Exercise
10:00
appex-03.qmd
park_extra_magic_morning
Matching
Weighting
Stratification
Direct Adjustment
…
Greifer, N., & Stuart, E. A. (2021). Choosing the estimand when matching or weighting in observational studies. arXiv preprint arXiv:2106.10577. See also Choosing Estimands in our book.
Matching
Weighting
Stratification
Direct Adjustment
…
\[\tau = E[Y(1) - Y(0)]\]
\[\tau = E[Y(1) - Y(0) | X = 1]\]
A matchit object
- method: 1:1 nearest neighbor matching without replacement
- distance: Propensity score
- estimated with logistic regression
- number of obs.: 1566 (original), 806 (matched)
- target estimand: ATT
- covariates: sex, race, age, I(age^2), education, smokeintensity, I(smokeintensity^2), smokeyrs, I(smokeyrs^2), exercise, active, wt71, I(wt71^2)
Rows: 806
Columns: 71
$ i <chr> "11", "1220", "15", "1082", "18"…
$ subclass <fct> 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6,…
$ weights <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ seqn <dbl> 428, 23045, 446, 22294, 596, 140…
$ qsmk <dbl> 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,…
$ death <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1,…
$ yrdth <dbl> NA, NA, 88, NA, NA, NA, NA, NA, …
$ modth <dbl> NA, NA, 1, NA, NA, NA, NA, NA, N…
$ dadth <dbl> NA, NA, 3, NA, NA, NA, NA, NA, N…
$ sbp <dbl> 135, 159, 141, 113, 151, NA, 125…
$ dbp <dbl> 89, 91, 79, 73, 80, NA, 71, 85, …
$ sex <fct> 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0,…
$ age <dbl> 43, 49, 71, 36, 48, 51, 56, 40, …
$ race <fct> 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,…
$ income <dbl> 19, 22, 17, 21, 18, 22, 20, 18, …
$ marital <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,…
$ school <dbl> 12, 12, 0, 12, 12, 9, 12, 10, 17…
$ education <fct> 3, 3, 1, 3, 3, 2, 3, 2, 5, 5, 3,…
$ ht <dbl> 176.5938, 160.2812, 147.0938, 17…
$ wt71 <dbl> 63.96, 47.29, 75.64, 68.38, 62.0…
$ wt82 <dbl> 79.83226, 53.07031, 56.69905, 73…
$ wt82_71 <dbl> 15.8722571, 5.7803073, -18.94095…
$ birthplace <dbl> 42, 30, NA, 19, 36, 47, NA, 42, …
$ smokeintensity <dbl> 30, 20, 40, 9, 2, 5, 20, 5, 30, …
$ smkintensity82_71 <dbl> -30, 0, -40, 1, -2, 1, -20, 0, -…
$ smokeyrs <dbl> 24, 29, 41, 30, 30, 29, 11, 20, …
$ asthma <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ bronch <dbl> 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0,…
$ tb <dbl> 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…
$ hf <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ hbp <dbl> 0, 2, 0, 2, 0, 0, 0, 0, 0, 2, 1,…
$ pepticulcer <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ colitis <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ hepatitis <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ chroniccough <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0,…
$ hayfever <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ diabetes <dbl> 0, 2, 0, 2, 0, 0, 0, 0, 0, 2, 0,…
$ polio <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ tumor <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ nervousbreak <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ alcoholpy <dbl> 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1,…
$ alcoholfreq <dbl> 3, 0, 4, 0, 1, 2, 3, 1, 3, 0, 2,…
$ alcoholtype <dbl> 3, 3, 4, 1, 2, 3, 4, 1, 2, 3, 3,…
$ alcoholhowmuch <dbl> 2, 2, NA, 6, 1, 3, NA, 5, 1, 3, …
$ pica <dbl> 0, 2, 0, 2, 0, 0, 0, 0, 0, 2, 0,…
$ headache <dbl> 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1,…
$ otherpain <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,…
$ weakheart <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ allergies <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ nerves <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ lackpep <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ hbpmed <dbl> 0, 2, 0, 2, 0, 0, 0, 0, 0, 2, 1,…
$ boweltrouble <dbl> 0, 2, 1, 2, 0, 0, 0, 0, 0, 2, 0,…
$ wtloss <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ infection <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ active <fct> 1, 2, 1, 0, 1, 0, 0, 0, 0, 1, 2,…
$ exercise <fct> 1, 1, 1, 0, 1, 0, 2, 0, 0, 2, 2,…
$ birthcontrol <dbl> 2, 0, 0, 2, 0, 2, 0, 0, 2, 0, 2,…
$ pregnancies <dbl> NA, 4, 15, NA, 3, NA, 4, 2, NA, …
$ cholesterol <dbl> 173, 279, 229, 200, 225, 199, 23…
$ hightax82 <dbl> 0, 0, NA, 0, 0, 0, NA, 0, 0, 0, …
$ price71 <dbl> 2.346680, 2.104980, NA, 2.199707…
$ price82 <dbl> 1.797363, 1.698242, NA, 1.847900…
$ tax71 <dbl> 1.3649902, 1.0498047, NA, 1.1022…
$ tax82 <dbl> 0.5718994, 0.4399414, NA, 0.5718…
$ price71_82 <dbl> 0.54931641, 0.40686035, NA, 0.35…
$ tax71_82 <dbl> 0.7929688, 0.6099854, NA, 0.5303…
$ id <int> 11, 1274, 15, 1135, 18, 564, 23,…
$ censored <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ older <dbl> 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1,…
$ distance <dbl> 0.2384597, 0.2381918, 0.2090935,…
\[\tau = E[Y(1) - Y(0) | X = 0]\]
A matchit object
- method: 1:1 nearest neighbor matching without replacement
- distance: Propensity score
- estimated with logistic regression
- number of obs.: 1566 (original), 806 (matched)
- target estimand: ATC
- covariates: sex, race, age, I(age^2), education, smokeintensity, I(smokeintensity^2), smokeyrs, I(smokeyrs^2), exercise, active, wt71, I(wt71^2)
Observations with propensity scores (on the linear logit scale) within 0.1 standard errors (the caliper) will be discarded
A matchit object
- method: 1:1 nearest neighbor matching without replacement
- distance: Propensity score [caliper]
- estimated with logistic regression and linearized
- caliper: <distance> (0.063)
- number of obs.: 1566 (original), 780 (matched)
- target estimand: ATT
- covariates: sex, race, age, I(age^2), education, smokeintensity, I(smokeintensity^2), smokeyrs, I(smokeyrs^2), exercise, active, wt71, I(wt71^2)
Rows: 780
Columns: 71
$ i <chr> "11", "1220", "15", "1082", "18"…
$ subclass <fct> 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6,…
$ weights <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ seqn <dbl> 428, 23045, 446, 22294, 596, 140…
$ qsmk <dbl> 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,…
$ death <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1,…
$ yrdth <dbl> NA, NA, 88, NA, NA, NA, NA, NA, …
$ modth <dbl> NA, NA, 1, NA, NA, NA, NA, NA, N…
$ dadth <dbl> NA, NA, 3, NA, NA, NA, NA, NA, N…
$ sbp <dbl> 135, 159, 141, 113, 151, NA, 125…
$ dbp <dbl> 89, 91, 79, 73, 80, NA, 71, 85, …
$ sex <fct> 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1,…
$ age <dbl> 43, 49, 71, 36, 48, 51, 56, 40, …
$ race <fct> 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,…
$ income <dbl> 19, 22, 17, 21, 18, 22, 20, 18, …
$ marital <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3,…
$ school <dbl> 12, 12, 0, 12, 12, 9, 12, 10, 17…
$ education <fct> 3, 3, 1, 3, 3, 2, 3, 2, 5, 5, 2,…
$ ht <dbl> 176.5938, 160.2812, 147.0938, 17…
$ wt71 <dbl> 63.96, 47.29, 75.64, 68.38, 62.0…
$ wt82 <dbl> 79.83226, 53.07031, 56.69905, 73…
$ wt82_71 <dbl> 15.8722571, 5.7803073, -18.94095…
$ birthplace <dbl> 42, 30, NA, 19, 36, 47, NA, 42, …
$ smokeintensity <dbl> 30, 20, 40, 9, 2, 5, 20, 5, 30, …
$ smkintensity82_71 <dbl> -30, 0, -40, 1, -2, 1, -20, 0, -…
$ smokeyrs <dbl> 24, 29, 41, 30, 30, 29, 11, 20, …
$ asthma <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ bronch <dbl> 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0,…
$ tb <dbl> 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…
$ hf <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ hbp <dbl> 0, 2, 0, 2, 0, 0, 0, 0, 0, 2, 0,…
$ pepticulcer <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ colitis <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ hepatitis <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ chroniccough <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0,…
$ hayfever <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ diabetes <dbl> 0, 2, 0, 2, 0, 0, 0, 0, 0, 2, 0,…
$ polio <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ tumor <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ nervousbreak <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ alcoholpy <dbl> 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1,…
$ alcoholfreq <dbl> 3, 0, 4, 0, 1, 2, 3, 1, 3, 0, 1,…
$ alcoholtype <dbl> 3, 3, 4, 1, 2, 3, 4, 1, 2, 3, 3,…
$ alcoholhowmuch <dbl> 2, 2, NA, 6, 1, 3, NA, 5, 1, 3, …
$ pica <dbl> 0, 2, 0, 2, 0, 0, 0, 0, 0, 2, 0,…
$ headache <dbl> 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0,…
$ otherpain <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,…
$ weakheart <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ allergies <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ nerves <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ lackpep <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ hbpmed <dbl> 0, 2, 0, 2, 0, 0, 0, 0, 0, 2, 0,…
$ boweltrouble <dbl> 0, 2, 1, 2, 0, 0, 0, 0, 0, 2, 0,…
$ wtloss <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ infection <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ active <fct> 1, 2, 1, 0, 1, 0, 0, 0, 0, 1, 1,…
$ exercise <fct> 1, 1, 1, 0, 1, 0, 2, 0, 0, 2, 2,…
$ birthcontrol <dbl> 2, 0, 0, 2, 0, 2, 0, 0, 2, 0, 0,…
$ pregnancies <dbl> NA, 4, 15, NA, 3, NA, 4, 2, NA, …
$ cholesterol <dbl> 173, 279, 229, 200, 225, 199, 23…
$ hightax82 <dbl> 0, 0, NA, 0, 0, 0, NA, 0, 0, 0, …
$ price71 <dbl> 2.346680, 2.104980, NA, 2.199707…
$ price82 <dbl> 1.797363, 1.698242, NA, 1.847900…
$ tax71 <dbl> 1.3649902, 1.0498047, NA, 1.1022…
$ tax82 <dbl> 0.5718994, 0.4399414, NA, 0.5718…
$ price71_82 <dbl> 0.54931641, 0.40686035, NA, 0.35…
$ tax71_82 <dbl> 0.7929688, 0.6099854, NA, 0.5303…
$ id <int> 11, 1274, 15, 1135, 18, 564, 23,…
$ censored <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ older <dbl> 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1,…
$ distance <dbl> -1.1611429, -1.1626186, -1.33039…
10:00
Matching
Weighting
Stratification
Direct Adjustment
…
Average Treatment Effect (ATE)
\[\Large w_{ATE} = \frac{X_i}{p_i} + \frac{1-X_i}{1 - p_i}\]
Average Treatment Effect Among the Controls (ATC) \[\Large w_{ATC} = \frac{(1-p_i)X_i}{p_i} + \frac{(1-p_i)(1-X_i)}{(1-p_i)}\]
Average Treatment Effect Among the Overlap Population \[\Large w_{ATO} = (1-p_i)X_i + p_i(1-X_i)\]
Average Treatment Effect (ATE) \(w_{ATE} = \frac{X_i}{p_i} + \frac{1-X_i}{1 - p_i}\)
Application Exercise
10:00
Slides by Dr. Lucy D’Agostino McGowan