Sunday, October 2, 2016

Machine Learning vs. Econometrics, I

Machine learning (ML) is almost always centered on prediction; think "\(\hat{y}\)". Econometrics (E) is often, but not always, centered on prediction. It is also often interested in estimation and associated inference; think "\(\hat{\beta}\)".

Or so the story usually goes. But that misses the real distinction. Both ML and E as described above are centered on prediction. The key difference is that ML focuses on non-causal prediction (if a new person \(i\) arrives with covariates \(X_i\), what is my minimum-MSE guess of her \(y_i\)?), whereas the part of econometrics highlighted above focuses on causal prediction (if I intervene and give person \(i\) a certain treatment, what is my minimum-MSE guess of \(\Delta y_i\)?).
It just happens that, assuming linearity, a "minimum-MSE guess of \(\Delta y_i\)" is the same as a "minimum-MSE estimate of \(\beta_i\)".
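
To spell out that last step, here is a minimal sketch under an assumed notation (the scalar treatment \(x_i\), intercept \(\alpha_i\), and disturbance \(\varepsilon_i\) are illustrative and not from the discussion above). Suppose the outcome is linear in the treatment and the intervention raises \(x_i\) by one unit while leaving \(\alpha_i\) and \(\varepsilon_i\) unchanged:

\[
y_i = \alpha_i + \beta_i x_i + \varepsilon_i
\quad\Longrightarrow\quad
\Delta y_i = \beta_i \big[(x_i + 1) - x_i\big] = \beta_i .
\]

Then for any guess \(g\),

\[
E\big[(\Delta y_i - g)^2\big] = E\big[(\beta_i - g)^2\big],
\]

so whatever \(g\) minimizes the left side also minimizes the right side. Without linearity, \(\Delta y_i\) would generally depend on the starting level of \(x_i\), and the two problems would no longer coincide.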

So there is a ML vs. E distinction here, but it's not "prediction vs. estimation" -- it's all prediction.  Instead, the issue is non-causal prediction vs. causal prediction.
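
A small simulation may make the non-causal vs. causal distinction concrete. It is only a sketch under assumed values: the data-generating process, the unobserved confounder `u`, the coefficients `BETA` and `GAMMA`, and the helper names `simulate` and `ols_slope` are all hypothetical choices for illustration. In the observational sample the treatment `x` is correlated with `u`; in the experimental sample `x` is randomized.

```python
import random

def ols_slope(x, y):
    """Slope from a least-squares regression of y on x (with intercept)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate(n, beta, gamma, randomized, rng):
    """Draw (x, y) pairs from an assumed linear model y = beta*x + gamma*u + noise."""
    xs, ys = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)  # unobserved confounder
        if randomized:
            x = rng.gauss(0, 1)      # treatment assigned independently of u
        else:
            x = u + rng.gauss(0, 1)  # treatment correlated with u
        y = beta * x + gamma * u + rng.gauss(0, 1)
        xs.append(x)
        ys.append(y)
    return xs, ys

rng = random.Random(0)
BETA, GAMMA, N = 1.0, 2.0, 50_000  # assumed causal effect, confounder effect, sample size

x_obs, y_obs = simulate(N, BETA, GAMMA, randomized=False, rng=rng)
x_rct, y_rct = simulate(N, BETA, GAMMA, randomized=True, rng=rng)

slope_obs = ols_slope(x_obs, y_obs)  # best linear predictor of y given x
slope_rct = ols_slope(x_rct, y_rct)  # recovers the causal effect of x on y

print(f"observational slope: {slope_obs:.3f}")
print(f"randomized slope:    {slope_rct:.3f}")
```

Under these assumptions the observational slope should land near \(\beta + \gamma\,\mathrm{Cov}(x,u)/\mathrm{Var}(x) = 1 + 2 \cdot \tfrac{1}{2} = 2\), while the randomized slope should land near the causal effect \(\beta = 1\). The first number is the right one for guessing \(y_i\) for a new person drawn from the same population; the second is the right one for guessing \(\Delta y_i\) under an intervention.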

But there's another ML vs. E difference that's even more fundamental.  TO BE CONTINUED...