Modern machine learning libraries make model building look deceptively easy. There is a prevalent and unnecessary emphasis (one that, admittedly, annoys the speaker) on tools like R, Python, and SparkML, and on techniques like deep learning. Relying on tools and techniques while ignoring the fundamentals is the wrong approach to model building.
Real-world machine learning requires hard work, discipline, and rigor. Developing robust models requires due diligence during the data acquisition phase and an obsession with data quality.
Feature engineering, the choice of evaluation metrics, and an understanding of the model bias/variance trade-off are often more important than the choice of tools. Experienced machine learning engineers spend most of their time on data-related issues, model evaluation, and parameter tuning, and only a fraction of it on actual model building. This is the 80/20 rule: roughly 80% of the effort goes into the work around the model, and roughly 20% into building the model itself.
Unlike most talks these days, this talk is not about deep learning. We will ignore the hype and focus strictly on the fundamentals of building robust machine learning models.