Modern machine learning libraries make model building look deceptively easy. There is a prevalent and unnecessary emphasis (admittedly, one that annoys the speaker) on tools like R, Python, and SparkML, and on techniques like deep learning. Relying on tools and techniques while ignoring the fundamentals is the wrong approach to model building.
Real-world machine learning requires hard work, discipline, and rigor. The development of robust models requires due diligence during the data acquisition phase and an obsession with data quality.
Experienced machine learning engineers spend most of their time on data-related issues, model evaluation, and parameter tuning, and only a fraction of it on actual model building. This is the 80/20 rule applied to machine learning.
Unlike most talks these days, this one is not about deep learning. We will ignore the hype and focus strictly on the fundamentals of building robust machine learning models.
CEO and Chief Data Scientist at Data Science Dojo