5-day Bootcamp Curriculum

Learn more about the immersive bootcamp curriculum

The best data science bootcamp curriculum, hands down.

Our instructors are practitioners who know what matters. We have designed the most practical short-duration curriculum, one that has helped thousands of working professionals from hundreds of companies around the world get started with practical data science in just one week.

Unit | Lesson | Description | Timeline | Duration | Topics | Tools/Labs | Sample | Sample Video
Data Science Fundamentals | Data Exploration, Visualization, and Feature Engineering | The first and most important task of the data scientist is to understand their data. The bulk of our first day is dedicated to the theory and practice of understanding data. Through a series of interactive, hands-on exercises, we teach you how to dissect and explore data, engineer your features, and clean your data to prepare it for modeling. You will learn not just the mechanics of data exploration, but also the proper mindset, one that will help you tease out the patterns hidden in your data. | Day 1 | 5 hours | Exploration, Visualization, Segmentation | R, Python | https://datasciencedojo.com/wp-content/uploads/data_exploration_visualization_slide_sample.pdf
Data Science Fundamentals | Predictive Analytics Fundamentals | Our first foray into predictive analytics is guided by a deep dive into the mechanics and theory behind decision tree models. The basis of some of the most successful predictive models, decision trees provide a useful vehicle for hands-on exercises in training and testing classification models. | Day 1, Day 2 | 6 hours | Predictive Analytics, Classification, Decision Trees, Gini Index, Entropy, Training/Test Splits | R, Python | https://datasciencedojo.com/wp-content/uploads/predictive_classification_decision_slide_sample.pdf
Model Evaluation and Parameter Tuning | Evaluation and Fine Tuning of Predictive Models | One of the subtlest and trickiest areas of modern data science is model evaluation. The risk of “overfitting” and producing a model that generalizes very poorly constantly hangs over the practitioner’s head. We teach you the metrics and methods you can use to protect yourself from this danger, and hands-on exercises give you direct, practical experience with evaluation and model tuning. In addition, we teach you to understand the effects of each algorithm’s configuration parameters, and to use this knowledge to tune your models for optimal performance. | Day 2 | 4 hours | Accuracy, Precision, Recall, ROC, AUC, Cross-validation, Bias/Variance Tradeoff, Model Tuning, Parameters | R, Python, Azure ML | https://datasciencedojo.com/wp-content/uploads/evaluation_classification_slide_sample.pdf
Ensemble Methods | Bagging, Boosting and Random Forest | After building a predictive model and understanding the pitfalls of choosing the wrong evaluation metrics, we move on to more advanced learning techniques. We discuss the importance of ensemble techniques in machine learning and how they help us build models that generalize better. The module goes in depth into sampling with and without replacement, bootstrapped sampling, bagging, random forest and boosting. | Day 2, Day 3 | 5 hours | Binomial Distribution, Bagging, Boosting, Random Forests, AdaBoost | R, Python, Azure ML | https://datasciencedojo.com/wp-content/uploads/ensemble_random_forest_slide_sample.pdf
Modelling Unstructured Data | Introduction to Text Analytics | So far, we have only dealt with fully structured data, but many applications of data science require analysis of unstructured data. We will teach you the basics of converting text into structured data, and how to model documents to find their similarities and recommend similar documents. | Day 3 | 1.5 hours | Unstructured Data, Stemming, Lemmatization, Stop Words, TF-IDF | R, Python, Azure ML | https://datasciencedojo.com/wp-content/uploads/text_analytics_slide_sample.pdf
Modelling Unstructured Data | Unsupervised Learning and Clustering | Arguably the oldest branch of machine learning, unsupervised learning at its core is about revealing the hidden structure of any dataset. We teach you about K-Means, a popular clustering algorithm. You will als