Introduction to Machine Learning with R & Caret

Agenda

The R programming language is experiencing a rapid increase in popularity and wide adoption across industries. This popularity is due, in part, to R’s huge collection of open-source machine learning algorithms. If you are a data scientist working with R, the caret package (short for [C]lassification [A]nd [RE]gression [T]raining) is a must-have tool in your toolbelt. The package provides capabilities that are useful at every stage of the data science project lifecycle. Most important of all, it provides a common interface for training, tuning, and evaluating more than 200 machine learning algorithms. Not surprisingly, caret is a sure-fire way to accelerate your velocity as a data scientist.
In this presentation, Dave Langer will provide an introduction to this package. The presentation will focus on using caret to carry out some of the most common tasks in the data science project lifecycle and on showing how to incorporate it into your daily work.

What you’ll learn

  • Create stratified random samples of data useful for training machine learning models
  • Train machine learning models using a common interface
  • Leverage caret’s powerful features for cross-validation and hyperparameter tuning
  • Scale caret with multi-core parallel training
  • Increase your knowledge of caret’s many features
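
As a rough illustration of the first three items, here is a minimal sketch of a caret workflow. It is not the presenter's code: the built-in iris dataset, the "rpart" method, the 80/20 split, the 5 folds, and the seed are all assumptions chosen for illustration, and the sketch assumes the caret and rpart packages are installed.

```r
# Illustrative sketch only; dataset, model, and settings are assumptions.
library(caret)

set.seed(42)  # make the random split and resampling reproducible

# Stratified random sample: createDataPartition() samples within each class
# of the outcome, so the 80/20 split preserves the class balance of Species.
train_idx <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
train_set <- iris[train_idx, ]
test_set  <- iris[-train_idx, ]

# 5-fold cross-validation, used to score each candidate hyperparameter value.
ctrl <- trainControl(method = "cv", number = 5)

# Common interface: changing method = "rpart" to another supported method
# name swaps the algorithm while the rest of the call stays the same.
# tuneLength = 5 asks caret to try up to 5 values of rpart's cp parameter.
fit <- train(Species ~ ., data = train_set, method = "rpart",
             trControl = ctrl, tuneLength = 5)

print(fit$bestTune)  # hyperparameter value selected by cross-validation

# Evaluate on the held-out rows.
preds <- predict(fit, newdata = test_set)
accuracy <- mean(preds == test_set$Species)
print(accuracy)
```

For the multi-core item, caret runs its resampling loop through the foreach package, so registering a parallel backend (for example with the doParallel package) before calling train() is the usual way to spread the folds across cores.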

Data Science Dojo Instructor

Data Science Dojo is a paradigm shift in data science learning. We enable all professionals (and students) to extract actionable insights from data.

We are looking for passionate people willing to cultivate and inspire the next generation of leaders in tech, business, and data science. If you are one of them, get in touch with us!

Resources

R code and accompanying dataset can be found here
Package website: http://topepo.github.io/caret/index.html
