time series

Nathan Piccini
| May 7, 2019

Learn how to work with time series in Python: prepare your data, build an ARIMA model, and evaluate its predictions using mean absolute error.

If you’ve read any of my previous posts, you’ll know I don’t call myself a data scientist. I would call myself a data enthusiast. Being in marketing, I make every business decision based on what the data is telling me.

Having said that, I ran into a great 3-part tutorial series about time series in Python. It’s meant for intermediate to advanced learners, but I found it incredibly easy to follow along (even if I had to look up some of the concepts and techniques). Each video runs 10 to 15 minutes, so the whole series should take you only about 45 minutes to complete.

Here are the packages used in the tutorials:

  • pandas
  • statsmodels
  • matplotlib
  • statistics

This tutorial is taught in Python. If you are more comfortable with R, the presenter has shared both the R and Python scripts in the tutorial repository.

Part 1: Read and Transform Your Data

In Part 1, you will learn how to read and index your data for time series, check that the data meets the requirements or assumptions for time series modeling, and transform your data to ensure it meets those requirements. You’ll primarily be using the pandas package.
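As a rough illustration of those three steps (read and index, check, transform), here is a minimal sketch. The data, the column names `month` and `sales`, and the helper `half_means` are all made up for this example and do not come from the videos. The half-means comparison is only a crude stand-in for a formal stationarity test.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical monthly data, inlined so the sketch runs on its own.
# With a real file you would pass its path to pd.read_csv instead.
raw_csv = io.StringIO(
    "month,sales\n"
    "2018-01-01,100\n2018-02-01,104\n2018-03-01,109\n2018-04-01,115\n"
    "2018-05-01,120\n2018-06-01,127\n2018-07-01,133\n2018-08-01,140\n"
    "2018-09-01,148\n2018-10-01,155\n2018-11-01,163\n2018-12-01,172\n"
)

# Read: parse the date column and use it as the index.
series = pd.read_csv(raw_csv, parse_dates=["month"], index_col="month")["sales"]
series = series.asfreq("MS")  # declare a regular month-start frequency


def half_means(s):
    """Crude stationarity check: mean of the first half vs. the second half."""
    mid = len(s) // 2
    return s.iloc[:mid].mean(), s.iloc[mid:].mean()


raw_first, raw_second = half_means(series)

# Transform: log to stabilize the variance, then difference to remove the trend.
log_diff = np.log(series).diff().dropna()
diff_first, diff_second = half_means(log_diff)

print(f"raw halves:         {raw_first:.1f} vs {raw_second:.1f}")
print(f"transformed halves: {diff_first:.4f} vs {diff_second:.4f}")
```

On the raw series the two halves have clearly different means because of the upward trend. After the log and difference, the two halves sit much closer together, which is the kind of behavior time series models generally assume.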

The next two parts both start right where Part 1 left off. Neither has much of an introduction beyond a very short review of what was covered in the previous section(s). If you aren’t completing the tutorials back to back, make sure to go back and review.

Part 2: ARIMA modeling and forecasting in Python

Part 2 has you build an ARIMA model using the statsmodels package and predict N timestamps into the future. You will also look at the Autocorrelation Function (ACF) plot and Partial Autocorrelation Function (PACF) plot to determine the terms in your time series model.

Part 3: Evaluating time series forecasts

In the final part of the (time) series, you’ll evaluate predictions using mean absolute error and Python’s statistics and matplotlib packages. You’ll plot the last five predicted and actual values, look at the differences, and calculate the mean absolute error to help evaluate your ARIMA model. At the end of the video, the presenter challenges you to improve on the model she walks you through.
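The mean absolute error itself is simple to compute with the standard library. This is a minimal sketch with made-up numbers (the values in `actual_all` and `predicted_all` are not from the video), and the plotting step is left out so the example has no dependencies.

```python
import statistics

# Hypothetical actual and predicted values for the same timestamps.
actual_all = [150.0, 158.0, 165.0, 172.0, 178.0, 185.0, 190.0, 198.0]
predicted_all = [149.0, 160.0, 166.0, 170.0, 180.0, 183.0, 193.0, 194.0]

# Keep only the last five points of each list.
actual = actual_all[-5:]
predicted = predicted_all[-5:]

# Absolute difference between each actual value and its prediction.
abs_errors = [abs(a - p) for a, p in zip(actual, predicted)]

# Mean absolute error: the average of those absolute differences.
mae = statistics.mean(abs_errors)
print(abs_errors, mae)
```

The result is in the same units as the data, so an MAE of 2.6 here means the predictions are off by 2.6 units on average over those five points.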

I don’t have the data science know-how to make improvements, but maybe you do! I encourage you to add the ways you have improved the model to the Discussion. Who knows, maybe I’ll contact you to collaborate on a follow-up blog post.

Rebecca Merrett
| May 13, 2019

Time Series Forecasting is a much humbler yet effective technique that’s been quietly racking up a huge number of business applications.

While the world is trying to force deep learning onto every problem imaginable, big or small, there’s a more effective technique to help improve business applications over time. Time Series Forecasting is not just your typical machine learning solution that learns from past data. It also factors in time, learning that certain events are likely to happen again, or to move in a certain direction, depending on the time of day, month, year, or other period.

Let’s pause for a moment and think about all the events in our lives that are heavily dependent on time or periods over a lifetime. From growing to aging, graduating to retiring, moving from one place to the next: we all have our own sequence of events that took place over time in our lives. Now think about all the events we could collect data on that are also heavily dependent on time. From traveling to and from work, to ordering food, to purchasing an item, to reading your news feeds, to the Earth orbiting around the Sun, there’s a time associated with almost anything. So we can collect data on almost anything at any time interval. (I say ‘almost’ because unless you want to figure out a way to collect data inside a black hole, for example, almost everything can have a regular timestamp.)

What is time series forecasting?

Time Series data are observations of something recorded at regular time intervals (a univariate, or single-variable, series). Time Series Forecasting looks at data over time to predict what will happen in the next period, based on patterns or recurring trends of previous periods. These patterns could be seasonal, where there’s a periodic trend or recurring event, or there could be a consistent upward or downward trend over time, or there could be no recognizable pattern or trend at all.

Understand Time Series in minutes

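One of the most common techniques for Time Series Forecasting is Autoregressive Integrated Moving Average (ARIMA). This technique predicts the next timestamp by regressing on previous values of the series (the autoregressive part) and on previous forecast errors (the moving average part), after differencing the data where needed to remove trends (the integrated part).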
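To make the idea concrete, here is a deliberately simplified sketch in plain Python. It is not a full ARIMA implementation: it differences the series once and regresses each change on the previous change, and it leaves out the moving average part entirely. The function name `forecast_next` and the data are made up for this example.

```python
import statistics


def forecast_next(values):
    """One-step forecast: difference once, then regress on the previous change.

    Simplified illustration only: the moving average part of ARIMA is omitted.
    Returns (phi, next_value).
    """
    # Differencing: work with changes between consecutive values.
    diffs = [b - a for a, b in zip(values, values[1:])]

    # Center the changes on their mean (the average drift).
    mu = statistics.mean(diffs)
    x = [d - mu for d in diffs]

    # Least-squares slope of each centered change on the previous one.
    numerator = sum(cur * prev for prev, cur in zip(x, x[1:]))
    denominator = sum(prev * prev for prev in x[:-1])
    phi = numerator / denominator

    # Predict the next change, then add it back to the last observed value.
    next_diff = mu + phi * x[-1]
    return phi, values[-1] + next_diff


# Hypothetical series with an upward drift and alternating step sizes.
values = [100, 103, 104, 108, 110, 113, 114, 118, 120]
phi, next_value = forecast_next(values)
print(phi, next_value)
```

The negative `phi` reflects that a small step tends to follow a large one in this data, so after the last small step of 2 the sketch predicts a step slightly above the average of 2.5.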

Learn more about ARIMA modeling for Time Series

Application of time series in business

Besides your typical financial modeling, there’s so much more Time Series Forecasting can apply to, especially when predicting demand. It would be a disservice to only limit Time Series Forecasting to the world of finance. Here are 5 distinctly different business applications of Time Series Forecasting to give you a good sample, albeit a very small one that doesn’t even make up a fraction of the tip of the iceberg.

Scenario 1: Online users forecast

A Time Series model predicts that ~600,000 people will log in over the next few hours. The online sports streaming platform already knew there would be a lot of people online because of a big event happening then. But now it can better plan how many additional servers and how much infrastructure it needs, based on the predicted number of online users. Those servers are also only needed for that particular time of day, so they can be switched off for the rest of the day to save money. Another Time Series model predicts a significant increase in online users compared with last year, and an even larger one compared with the year before. The company decides it has reached a point of sustained, significant growth, and that now is the right time to invest in better infrastructure for the years ahead.

Scenario 2: Traffic forecast

A sensor device records the number of vehicles that cross an intersection every 20 minutes. Using these counts, a Time Series model predicts that traffic at the intersection is likely to spike sharply in the next 20 minutes. Your trip planning app then re-routes you to avoid this congested intersection, distributing the traffic load more evenly across roads.
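If the sensor logged individual crossings rather than ready-made counts, the regular 20-minute series could be built with pandas along these lines. This is only a sketch: the timestamps are invented, and the variable names are assumptions for the example.

```python
import pandas as pd

# Hypothetical timestamps, one per vehicle crossing the intersection.
crossings = pd.to_datetime([
    "2019-05-13 08:01", "2019-05-13 08:07", "2019-05-13 08:19",
    "2019-05-13 08:22", "2019-05-13 08:25",
    "2019-05-13 08:41", "2019-05-13 08:44", "2019-05-13 08:50",
    "2019-05-13 08:58",
])

# One event per crossing, indexed by time.
events = pd.Series(1, index=crossings)

# Count the crossings in each 20-minute window.
counts = events.resample("20min").sum()
print(counts)
```

The resulting `counts` series has one value per 20-minute window, which is the regular-interval shape a Time Series model expects as input.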

Scenario 3: Customer satisfaction forecast

Customer reviews are scraped from the web and analyzed every day, producing an overall sentiment score that shows whether customers are happy with the company or upset. A number anywhere from -1 (most upset) to +1 (most happy) is recorded each day. The company has seen a turn of events where the score has started to drop gradually over time. The company is considering whether to hold off on acting right away to save time and resources, since the score might turn positive again. A Time Series Forecast says it is not likely to get any better, and that in the next few days it will continue to drop to a score that is unacceptable for the company. Before reaching that point, the company decides, based on this model, to pull in extra resources to assist the customer service team so that they can pay extra attention to customers and hopefully turn that trend in the right direction.

Scenario 4: User spending habits forecast

An online retail site usually sees peaks and dips in its sales. However, ordering stock during peak times is far more expensive than during off-peak times. Last year during the peak season the retailer had to order extra stock at a premium price because it underestimated purchasing demand for its product. This year the retailer is wiser: it is using forecasting models to get a closer estimate of how much stock is needed for the next peak season, instead of relying on its best guess. The model says the retailer’s best guess for this year still underestimates how many customer purchases are to come during peak season. The retailer decides to order more stock than expected during the off-season, before it becomes exorbitantly expensive to order extra stock during peak season.

Scenario 5: Staff turnover forecast

Many factors can lead to staff turnover, but have you ever noticed that there might be certain months in the year where turnover is a bit higher than others? Recruitment agencies use the New Year to target people with New Year career and pay goals, and leaving a company during a less busy period allows an employee to change jobs smoothly. Sourcing new hires and onboarding can be time-consuming and expensive. A company would like to see which months a Time Series model predicts will have higher turnover, so that it can implement employee retention plans before then.

What are some business applications that you can think of where Time Series Forecasting would be useful?

To learn how to build a Time Series model and forecast, watch this video series.
