Lessons for Machine Learning from Econometrics
Last Updated on August 15, 2020
Hal Varian is the chief economist at Google. In November 2013 he gave a talk to the Electronic Support Group at the EECS Department of the University of California, Berkeley.
The talk was titled “Machine Learning and Econometrics” and focused on the lessons the machine learning community can take away from the field of econometrics.
Hal started out by summarizing a recent paper of his titled “Big Data: New Tricks for Econometrics” (PDF), which comments on what the econometrics community can learn from the machine learning community, namely:
- Train-test-validate to avoid overfitting
- Cross validation
- Nonlinear estimation (trees, forests, SVMs, neural nets, etc.)
- Bootstrap, bagging, boosting
- Variable selection (lasso and friends)
- Model averaging
- Computational Bayesian methods (MCMC)
- Tools for manipulating big data (SQL, NoSQL databases)
- Textual analysis (not discussed)
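To make the first two items on the list concrete, here is a minimal sketch of k-fold cross validation in plain Python. It is not taken from the talk or the paper: the data is synthetic, the model is a one-variable least-squares line, and the helper names (`k_fold_indices`, `fit_line`, `cross_val_mse`) are made up for illustration.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single regressor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def cross_val_mse(xs, ys, k=5, seed=0):
    """Average the out-of-fold mean squared error over k folds."""
    fold_mse = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Fit only on the training folds, score only on the held-out fold.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errs = [(ys[i] - (a + b * xs[i])) ** 2 for i in held_out]
        fold_mse.append(sum(errs) / len(errs))
    return sum(fold_mse) / k


# Synthetic data (an assumption for this sketch): y = 2 + 3x + unit-variance noise.
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 1) for x in xs]

cv_mse = cross_val_mse(xs, ys, k=5)
print(f"5-fold CV MSE: {cv_mse:.3f}")
```

Because every observation is scored by a model that never saw it, the averaged error estimates out-of-sample performance rather than in-sample fit. Note that the shuffle in `k_fold_indices` treats the rows as exchangeable, which is only reasonable for i.i.d. data.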
He continued by talking about non-i.i.d. data such as time series data and panel data. This is data where cross validation typically does not perform well. He suggests decomposing the data into trend and seasonal components and looking at deviations from expected behavior.
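The decomposition idea can be sketched as follows. This is one simple additive decomposition (a centered moving average for the trend, per-position averages for the seasonal part), not necessarily the method used in the talk. The monthly period, the synthetic series, the injected spike, and the three-standard-deviation threshold are all assumptions chosen for illustration.

```python
import math
import random


def moving_average_trend(series, period):
    """Centered moving average over one full period; None where the window does not fit."""
    n = len(series)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        total = sum(window)
        if period % 2 == 0:
            # Even period: the window has period + 1 points, so half-weight the two ends.
            total -= 0.5 * (window[0] + window[-1])
        trend[t] = total / period
    return trend


def seasonal_means(series, trend, period):
    """Average detrended value at each position in the cycle, centered to sum to zero."""
    sums = [0.0] * period
    counts = [0] * period
    for t, (y, tr) in enumerate(zip(series, trend)):
        if tr is not None:
            sums[t % period] += y - tr
            counts[t % period] += 1
    means = [s / c for s, c in zip(sums, counts)]
    center = sum(means) / period
    return [m - center for m in means]


def deviations(series, period):
    """Residual = observed - trend - seasonal; None where the trend is undefined."""
    trend = moving_average_trend(series, period)
    seasonal = seasonal_means(series, trend, period)
    return [None if tr is None else y - tr - seasonal[t % period]
            for t, (y, tr) in enumerate(zip(series, trend))]


# Synthetic monthly-style series: linear trend + sine seasonality + small noise.
rng = random.Random(0)
period = 12
series = [10 + 0.1 * t + 3 * math.sin(2 * math.pi * t / period) + rng.gauss(0, 0.2)
          for t in range(120)]
series[60] += 8.0  # injected anomaly, purely for illustration

resid = deviations(series, period)
valid = [r for r in resid if r is not None]
mean = sum(valid) / len(valid)
sd = math.sqrt(sum((r - mean) ** 2 for r in valid) / len(valid))

# Flag points whose deviation from expected behavior exceeds 3 standard deviations.
flagged = [t for t, r in enumerate(resid)
           if r is not None and abs(r - mean) > 3 * sd]
print("flagged time steps:", flagged)
```

Once trend and seasonality are removed, what remains is the part of the series the model did not expect, so unusual observations show up as large residuals.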