Causal questions are part of a broader set of questions we can ask with statistical techniques related to the primary tasks of data science:
Haber et al. 2022. “Causal and Associational Language in Observational Health Research: A Systematic Evaluation.” Am J Epidemiol 191 (12): 2084–97.
What phenomena occur / occurred in the past?
Validity concerns: Measurement error, sampling error
Connection to causal inference: Understanding population characteristics, examining outcome distributions, checking if data structure matches research question
Will a certain phenomenon occur given a set of circumstances?
Validity concerns: Predictive accuracy, measurement error
Connection to causal inference: Some techniques use model predictions to answer causal questions
Why does a phenomenon occur?
Validity concerns: Relies on many assumptions, many of which cannot be checked (coming soon!)
Predictive power doesn’t guarantee causal accuracy, especially when:
Variables that are invalid from a causal perspective (like ice cream sales) can still provide predictive power by acting as proxies for true causal factors (like weather)
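The ice cream example can be sketched as a small simulation. This is an illustration under assumed numbers, not an analysis of real data: the outcome (`sunburn`), the coefficients, and the `simulate` and `pearson` helpers are all hypothetical. Temperature drives both ice cream sales and the outcome, and sales never enter the outcome equation, so sales should predict the outcome well even though changing them does nothing.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible


def pearson(xs, ys):
    """Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n, forced_sales=None):
    """Return (ice_cream_sales, sunburns) for n simulated days.

    Hypothetical data-generating process: temperature drives both variables,
    and ice cream sales never appear in the sunburn equation.
    If forced_sales is given, sales are set by intervention, not by weather.
    """
    sales, sunburns = [], []
    for _ in range(n):
        temperature = random.gauss(20, 8)  # the true causal factor (weather)
        if forced_sales is None:
            ice_cream = 10 + 2 * temperature + random.gauss(0, 5)
        else:
            ice_cream = forced_sales
        sunburn = 1 + 0.5 * temperature + random.gauss(0, 2)
        sales.append(ice_cream)
        sunburns.append(sunburn)
    return sales, sunburns


# Observational world: sales act as a proxy for weather.
observed_sales, observed_sunburns = simulate(5000)
observed_corr = pearson(observed_sales, observed_sunburns)

# Interventional world: force sales low vs. high and compare mean outcomes.
_, sunburns_low = simulate(5000, forced_sales=0)
_, sunburns_high = simulate(5000, forced_sales=100)
intervention_effect = sum(sunburns_high) / 5000 - sum(sunburns_low) / 5000

print(f"observed correlation: {observed_corr:.2f}")
print(f"effect of forcing sales from 0 to 100: {intervention_effect:.2f}")
```

Under these assumptions the observed correlation is strong while the effect of the intervention is close to zero, which is the gap between predictive power and causal accuracy.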
Smoking causes lung cancer
For people who smoke 15+ cigarettes a day, reducing smoking by 50% reduces the risk of lung cancer over 5-10 years
Does smoking cause lung cancer?
For people who smoke 15+ cigarettes a day, does reducing smoking by 50% reduce the risk of lung cancer over 5-10 years?
Slides by Dr. Lucy D’Agostino McGowan