

Data Science Dojo
Dave Langer
| May 2

At some point, every aspiring data scientist has to get familiar with mathematics for machine learning.

To be blunt, the more serious you are about learning data science, the more math you’ll need to learn for machine learning. If you have a strong math background, this is unlikely to be much of an issue.

In my case, I’ve had to relearn much of the mathematics I took at university (note: I’m not done yet!), because my professional life had allowed my math skills to atrophy.

Based on my experience teaching our bootcamp, there is also a group of aspiring data scientists whose formal math training needs to be augmented. For example, we have many students from marketing backgrounds, where studying linear algebra was never a requirement.

What math skills do data scientists need in machine learning

Forms of the question “what math do I need for data science” and “what math do I need for machine learning” are popular on sites like Quora. I would encourage all aspiring data scientists to perform their own research on this subject and not to take my post as gospel. However, as I often get asked for my opinion on what math aspiring data scientists need to know/study, I will provide my own list:

  • Basic statistics and probability (e.g., normal and Student’s t distributions, confidence intervals, t-tests of significance, p-values, etc.).
  • Linear algebra (e.g., eigenvectors).
  • Single variable calculus (e.g., minimization/maximization using derivatives).
  • Multivariate calculus (e.g., minimization/maximization with gradients).

Please note that the above is not an exhaustive list. To be honest, you likely can never know enough math to help you as a data scientist. What I would argue is the above list represents the 80/20 rule – the 20% of math that you will use 80% of the time as a practicing data scientist.
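To make the last item on the list a little more concrete, here is a minimal Python sketch of minimization with gradients. The function being minimized, the starting point, the learning rate, and the step count are all illustrative assumptions of mine rather than anything taken from the resources below; the point is only to show how a gradient guides you toward a minimum.

```python
def gradient(x, y):
    """Gradient of the illustrative function f(x, y) = (x - 3)**2 + (y + 1)**2."""
    return 2 * (x - 3), 2 * (y + 1)


def gradient_descent(x, y, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient to move toward a minimum."""
    for _ in range(steps):
        dx, dy = gradient(x, y)
        x -= learning_rate * dx  # move downhill along x
        y -= learning_rate * dy  # move downhill along y
    return x, y


# Start from an arbitrary point; the true minimum of f is at (3, -1).
x_min, y_min = gradient_descent(0.0, 0.0)
print(round(x_min, 4), round(y_min, 4))
```

The same idea, with far more parameters and a more complicated function, sits underneath how many machine learning models are fit to data.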

A list of top math resources

Here’s my list of the top 80/20 math resources for aspiring data scientists:

The Cartoon Guide to Statistics by Larry Gonick

The Cartoon Guide to Statistics is one of the books we provide to our bootcamp students, and it is an excellent resource for gently learning, or refreshing, your statistical knowledge. It covers many of the basic concepts in statistics in an easy-to-consume and entertaining fashion. Well worth a read.

OpenIntro Statistics, 3rd edition, by David Diez, Christopher Barr, and Mine Çetinkaya-Rundel

Coursera’s Statistics with R Specialization is necessary for every aspiring data scientist. The accompanying textbook (pictured to the left) is also a great read. I liked the book so much I picked up a hard copy from Amazon.

Book about mathematics and economics

Interestingly, I’ve found that University of California Irvine’s free UCI Open course Math 4: Math for Economists is a most excellent resource for focusing on the specific aspects of linear algebra and multivariate calculus needed for aspiring data scientists.

The accompanying textbook is also quite good and covers several interesting subjects, including single variable calculus for folks that need a refresher.

The takeaway

Studying the above resources will allow you to go a long way in developing the math skills required for data science.

For example, you will be well-prepared to study books like Intro to Statistical Learning, Elements of Statistical Learning, and Applied Predictive Modeling, including all the mathematics related to the algorithms.

Until next time! I wish you happy data sleuthing!

 
