flowchart LR A{Ice-T} --> |observed| B(Abandons criminal life) A -.-> |missing counterfactual| C(Does one more heist) C -.-> D[35 years in prison] B --> E[Fame & Fortune] classDef grey fill:#fff class D,C grey
Two roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth
— Robert Frost
Award-winning actor, rapper, and producer Ice-T unveils a compelling memoir of his early life robbing jewelry stores until he found fame and fortune—while a handful of bad choices sent his former crime partner down an incredibly different path.
flowchart LR A{Ice-T} --> |observed| B(Abandons criminal life) A -.-> |missing counterfactual| C(Does one more heist) C -.-> D[35 years in prison] B --> E[Fame & Fortune] classDef grey fill:#fff class D,C grey
flowchart LR A{Spike} -.-> |missing counterfactual| B(Abandons criminal life) A --> |observed| C(Does one more heist) C --> D[35 years in prison] B -.-> E[Fame & Fortune] classDef grey fill:#fff class E,B grey
What is the average causal effect?
data <- tibble(
id = 1:10,
y_chocolate = c(4, 4, 6, 5, 6, 5, 6, 7, 5, 6),
y_vanilla = c(1, 3, 4, 5, 5, 6, 8, 6, 3, 5)
)
data <- data |>
mutate(causal_effect = " ")
data |>
gt() |>
cols_label(
id = "ID",
y_chocolate = md("$Y_{\\text{id}}(\\text{chocolate})$"),
y_vanilla = md("$Y_{\\text{id}}(\\text{vanilla})$"),
causal_effect = md("$Y_{\\text{id}}(\\text{chocolate}) - Y_{\\text{id}}(\\text{vanilla})$")
) |>
fmt_markdown(
columns = c(y_chocolate, y_vanilla, causal_effect)
) |>
tab_header(
title = md("**Potential Outcomes and Causal Effect**")
) |>
tab_spanner(
label = "Potential Outcomes",
columns = c(y_chocolate, y_vanilla)
) |>
tab_spanner(
label = "Causal Effect",
columns = causal_effect
)
Potential Outcomes and Causal Effect | |||
---|---|---|---|
ID | Potential Outcomes | Causal Effect | |
\(Y_{\text{id}}(\text{chocolate})\) | \(Y_{\text{id}}(\text{vanilla})\) | \(Y_{\text{id}}(\text{chocolate}) - Y_{\text{id}}(\text{vanilla})\) | |
1 | 4 |
1 |
|
2 | 4 |
3 |
|
3 | 6 |
4 |
|
4 | 5 |
5 |
|
5 | 6 |
5 |
|
6 | 5 |
6 |
|
7 | 6 |
8 |
|
8 | 7 |
6 |
|
9 | 5 |
3 |
|
10 | 6 |
5 |
What is the average causal effect?
## we are doing something *random* so let's
## set a seed so we always observe the
## same result each time we run the code
set.seed(11)
data_observed <- data |>
mutate(
# change the exposure to randomized, generated from
# a binomial distribution with a probability of 0.5 for
# being in either group
exposure = if_else(
rbinom(n(), 1, 0.5) == 1, "chocolate", "vanilla"
),
observed_outcome = case_when(
exposure == "chocolate" ~ y_chocolate,
exposure == "vanilla" ~ y_vanilla
)
)
avg_chocolate <- data_observed |>
filter(exposure == "chocolate") |>
pull(observed_outcome) |>
mean()
avg_vanilla <- data_observed |>
filter(exposure == "vanilla") |>
pull(observed_outcome) |>
mean()
data_observed |>
mutate(
y_chocolate = if_else(exposure == "chocolate", y_chocolate, NA),
y_vanilla = if_else(exposure == "vanilla", y_vanilla, NA),
causal_effect = NA_real_
) |>
select(-observed_outcome, -exposure) |>
gt() |>
cols_label(
id = "ID",
y_chocolate = md("$Y_{\\text{id}}(\\text{chocolate})$"),
y_vanilla = md("$Y_{\\text{id}}(\\text{vanilla})$"),
causal_effect = md("$Y_{\\text{id}}(\\text{chocolate}) - Y_{\\text{id}}(\\text{vanilla})$")
) |>
fmt_markdown(columns = c(y_chocolate, y_vanilla, causal_effect)) |>
sub_missing(
columns = c(y_chocolate, y_vanilla, causal_effect),
missing_text = md("---") # Format missing values as blank
) |>
tab_header(
title = md("**Potential Outcomes and Hidden Causal Effect**")
) |>
tab_spanner(
label = "Potential Outcomes",
columns = c(y_chocolate, y_vanilla)
) |>
tab_spanner(
label = "Causal Effect",
columns = causal_effect
)
Potential Outcomes and Hidden Causal Effect | |||
---|---|---|---|
ID | Potential Outcomes | Causal Effect | |
\(Y_{\text{id}}(\text{chocolate})\) | \(Y_{\text{id}}(\text{vanilla})\) | \(Y_{\text{id}}(\text{chocolate}) - Y_{\text{id}}(\text{vanilla})\) | |
1 | — | 1 |
— |
2 | — | 3 |
— |
3 | 6 |
— | — |
4 | — | 5 |
— |
5 | — | 5 |
— |
6 | 5 |
— | — |
7 | — | 8 |
— |
8 | — | 6 |
— |
9 | 5 |
— | — |
10 | — | 5 |
— |
# A tibble: 2 × 2
exposure avg_outcome
<chr> <dbl>
1 chocolate 5.33
2 vanilla 4.71
Why did that (approximately) work?
Slides by Dr. Lucy D’Agostino McGowan