Causal Inference in Python: Applying Causal Inference in the Tech Industry (Fourth Early Release) [4 ed.] 9781098140250, 9781098140199

This book is an introduction to Causal Inference in Python, but it is not an introductory book in general. It’s introduc


English Pages 561 Year 2023


Table of contents :
Preface
Prerequisites
Data and Code
Outline
Conventions Used in This Book
Using Code Examples
O’Reilly Online Learning
How to Contact Us
Acknowledgments
I. Fundamentals
1. Introduction To Causal Inference
What is Causal Inference
Why we Do Causal Inference
Machine Learning and Causal Inference
Association and Causation
The Treatment and the Outcome
The Fundamental Problem of Causal Inference
Causal Models
Interventions
Individual Treatment Effect
Potential Outcomes
Consistency and Stable Unit Treatment Values
Causal Quantities of Interest
Causal Quantities: An Example
Bias
The Bias Equation
A Visual Guide to Bias
Identifying the Treatment Effect
The Independence Assumption
Identification with Randomization
Key Ideas
2. Randomized Experiments and Stats Review
Brute Force Independence with Randomization
An A/B Testing Example
The Ideal Experiment
The Most Dangerous Equation
The Standard Error of Our Estimates
Confidence Intervals
Hypothesis Testing
Null Hypothesis
Test Statistic
P-values
Power
Sample Size Calculation
Key Ideas
3. Graphical Causal Models
Thinking About Causality
Visualizing Causal Relationships
Are Consultants Worth it?
Crash Course in Graphical Models
Chains
Forks
Immorality or Collider
The Flow of Association Cheat Sheet
Querying a Graph in Python
Identification Revisited
CIA and The Adjustment Formula
Positivity Assumption
An Identification Example with Data
Confounding Bias
Surrogate Confounding
Randomization Revisited
Selection Bias
Conditioning on a Collider
Adjusting for Selection Bias
Conditioning on a Mediator
Key Ideas
II. Adjusting for Bias
4. The Unreasonable Effectiveness of Linear Regression
All You Need is Linear Regression
Why We Need Models
Regression in A/B Tests
Adjusting with Regression
Regression Theory
Single Variable Linear Regression
Multivariate Linear Regression
Frisch-Waugh-Lovell Theorem and Orthogonalization
Debiasing Step
Denoising Step
Standard Error of the Regression Estimator
Final Outcome Model
FWL Summary
Regression as an Outcome Model
Positivity and Extrapolation
Non-Linearities in Linear Regression
Linearizing the Treatment
Non-Linear FWL and Debiasing
Regression for Dummies
Conditionally Random Experiments
Dummy Variables
Saturated Regression Model
Regression as Variance Weighted Average
De-Meaning and Fixed Effects
Omitted Variable Bias: Confounding Through the Lens of Regression
Neutral Controls
Noise Inducing Control
Feature Selection: A Bias-Variance Trade-Off
Key Ideas
5. Propensity Score
The Impact of Management Training
Adjusting with Regression
Propensity Score
Propensity Score Estimation
Propensity Score and Orthogonalization
Propensity Score Matching
Inverse Propensity Weighting
Variance of IPW
Stabilized Propensity Weights
Pseudo-Populations
Selection Bias
Bias-Variance Trade-Off
Positivity
Design vs Model-Based Identification
Doubly Robust Estimation
Treatment is Easy to Model
Outcome is Easy to Model
Generalized Propensity Score for Continuous Treatment
Key Ideas
III. Effect Heterogeneity and Personalization
6. Effect Heterogeneity
From ATE to CATE
Why Prediction is not the Answer
CATE with Regression
Evaluating CATE Predictions
Effect by Model Quantile
Cumulative Effect
Cumulative Gain
Target Transformation
When Prediction Models are Good for Effect Ordering
Marginal Decreasing Returns
Binary Outcomes
CATE for Decision Making
Key Ideas
7. Meta-Learners
Meta-Learners for Discrete Treatments
T-Learner
X-Learner
Meta-Learners for Continuous Treatments
S-Learner
Double/Debiased Machine Learning
Key Ideas
IV. Panel Data
8. Difference-in-Differences
Panel Data
Canonical Difference-in-Differences
Diff-in-Diff with Outcome Growth
Diff-in-Diff with OLS
Diff-in-Diff with Fixed Effects
Multiple Time Periods
Inference
Identification Assumptions
Parallel Trends
No Anticipation Assumption and SUTVA
Strict Exogeneity
No Time Varying Confounders
No Feedback
No Carryover and no Lagged Dependent Variable
Effect Dynamics Over Time
Diff-in-Diff with Covariates
Doubly Robust Diff-in-Diff
Propensity Score Model
Delta Outcome Model
All Together Now
Staggered Adoption
Heterogeneous Effect over Time
Covariates
Key Ideas
9. Synthetic-Control
Online Marketing Dataset
Matrix Representation
Synthetic Control as Horizontal Regression
Canonical Synthetic Control
Synthetic Control with Covariates
Debiasing Synthetic Control
Inference
Synthetic Difference-in-Differences
DID Refresher
Synthetic Controls Revisited
Estimating Time Weights
Synthetic Control and DID
Key Ideas
V. Alternative Experimental Designs
10. Geo and Switchback Experiments
Geo-Experiments
Synthetic Control Design
Trying a Random Set of Treated Units
Random Search
Switchback Experiment
Potential Outcomes of Sequences
Estimating the Order of Carryover Effect
Design Based Estimation
Optimal Switchback Design
Robust Variance
Key Ideas
11. Non-Compliance and Instruments
Non-Compliance
Extending Potential Outcomes
Instrument Identification Assumptions
First Stage
Reduced Form
Two Stage Least Squares
Standard Error
Additional Controls and Instruments
2SLS by Hand
Matrix Implementation
Discontinuity Design
Discontinuity Design Assumptions
Intention to Treat Effect
The IV Estimate
Bunching
Key Ideas
12. Next Steps
Causal Discovery
Sequential Decision Making
Causal Reinforcement Learning
Causal Forecasting
Domain Adaptation
Closing Thoughts
Index
About the Author


  • Commentary
  • raw & unedited