# Exponential Smoothing for Time Series Forecasting

Time series anomaly detection is a complicated problem with plenty of practical methods. It’s easy to get lost in all of the topics it encompasses. Learning them is hard enough, and implementing them is often harder still. A key element of anomaly detection is forecasting: taking what you know about a time series, either based on a model or its history, and making decisions about values that arrive later.

You know how to do this already. Imagine someone asked you to forecast the prices for a certain stock, or the local temperature over the next few days. You could draw out your prediction, and chances are it would be a pretty good one. Your brain works amazingly well for problems like this, and our challenge is to get computers to do the same.

If you take an introductory course on time series, you’ll learn how to forecast by fitting a model to some sample data, and then using the model to predict future values. In practice, especially when monitoring systems, you’ll find this approach doesn’t work well, if at all! Real systems rarely fit mathematical models. There is an alternative: you can do something a lot simpler with exponential smoothing.

First, let’s look at what kinds of time series we could be working with. Suppose you measured the cpu.idle metric on a system and plotted the observations. In this case, the time series isn’t particularly interesting. The values vary a reasonable amount, but overall it’s fairly stable and most values hover around 130 or so. From a time series analysis perspective, this is considered to be fairly stationary. If you tried to predict the next value, your best guess would probably be around 130. It’s impossible to be exactly right with a prediction like this, but picking a value like 130 would appear to be the least incorrect.

## Smoothing

Exponential smoothing refers to the use of an exponentially weighted moving average (EWMA) to “smooth” a time series. Each smoothed value is a weighted combination of the newest observation and the previous smoothed value, and the weight controls how much the newest observation counts. If you plot a stationary time series along with a couple of smoothed versions, you’ll notice that the smaller the weight, the less influence each point has on the smoothed time series.

Suppose you had your time series along with a smoothed version and you’d like to predict, or forecast, the next value. This is simpler than you may think! You can just use the last value you calculated for the EWMA. It works out this way because the smoothed time series is the EWMA of the original series, and the EWMA is an estimate of the mean. For a series whose values hover around a constant mean, that estimate is a really good prediction. Predicting the next value is called the one-step-ahead forecast.

This method doesn’t always work well. Remember, you made an important assumption for this time series: it’s stationary. What happens when it isn’t?
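The smoothing and one-step-ahead forecast described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `ewma` and `one_step_ahead_forecast`, the weight name `alpha`, and the choice to seed the average with the first observation are all assumptions made for this example, and the sample readings are made up.

```python
def ewma(values, alpha):
    """Exponentially weighted moving average of a sequence.

    Each smoothed value is alpha * x + (1 - alpha) * previous_smoothed.
    Seeding with the first observation is an assumption of this sketch.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for x in values:
        if smoothed:
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
        else:
            smoothed.append(float(x))
    return smoothed


def one_step_ahead_forecast(values, alpha):
    """Forecast the next value as the most recent EWMA value."""
    if not values:
        raise ValueError("need at least one observation")
    return ewma(values, alpha)[-1]


# Made-up readings that hover around 130, like the stationary example.
readings = [131, 128, 133, 129, 130, 132, 127, 130]
print(one_step_ahead_forecast(readings, alpha=0.3))
```

Because every smoothed value is a weighted average of values seen so far, the forecast always stays between the smallest and largest observations. A smaller `alpha` makes it move more slowly in response to each new point.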

## Stationarity, Trend, and Seasonality

There are many ways to characterize a time series, but we’ll focus on three simple, closely related ones: stationarity, trend, and seasonality.

Stationarity refers to how stable the values of a time series are. For simplicity, let’s just say we consider a time series to be stationary if it has a constant mean. A stationary time series won’t have any kind of increasing or decreasing pattern, and its points will generally hover around the same value, the mean. This characteristic is why a simple EWMA, which estimates the mean, is so helpful for forecasts.

Trend refers to a long-term movement of a time series in a particular direction. With linear trend, time series points will approximately follow a line. It’s also possible to have higher order trends, such as quadratic trend, where points follow a parabola.

Seasonality refers to a periodic pattern. A great example of a seasonal time series is the temperature in a particular location. A time series can have multiple seasons with different periods. The Keeling Curve, which plots the measured concentration of CO2 in the atmosphere, has both a positive trend and seasonality.

If you smooth series like these, you may notice something interesting going on with the smoothed series that uses the lower weight. It tends to lag behind the original data because more recent values have lower influence. This is especially noticeable with a seasonal time series. This is important! Because you’re using the smoothed values to forecast, any significant gap between the smoothed values and the data will throw off your prediction. If you notice your time series isn’t stationary, you’ll have to find something other than a simple EWMA to do your forecasting.
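The lag described above is easy to reproduce. The sketch below smooths a made-up series with a pure linear trend and measures how far the last smoothed value trails the last observation. The names `ewma` and `final_lag` and the weight name `alpha` are assumptions for this example, as is seeding the average with the first observation.

```python
def ewma(values, alpha):
    """Simple EWMA, seeded with the first observation (an assumption)."""
    smoothed = [float(values[0])]
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


def final_lag(values, alpha):
    """How far the last smoothed value sits below the last observation."""
    return values[-1] - ewma(values, alpha)[-1]


# A made-up series with a pure linear trend: 0, 1, 2, ..., 199.
trending = [float(t) for t in range(200)]

for alpha in (0.5, 0.1):
    lag = final_lag(trending, alpha)
    print(f"alpha={alpha}: smoothed series trails by {lag:.3f}")
```

For a line with a slope of one unit per step, the gap settles at (1 - alpha) / alpha, so roughly 1 for a weight of 0.5 and roughly 9 for a weight of 0.1. A one-step-ahead forecast taken from the smoothed series would be off by at least that much.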

## Double and Triple Exponential Smoothing

In the late 1950s, Charles Holt recognized that the simple EWMA model struggles with time series that have a trend. He modified the simple exponential smoothing model to account for a linear trend. This is known as Holt’s exponential smoothing, or double exponential smoothing.

This model is a little more complicated. It consists of two EWMAs: one for the smoothed values, and another for their slope. The terms level and trend are also used. With double exponential smoothing, the smoothed values are much better at following the original time series, which means you’ll get much better forecasts.

To forecast with this model, you have to make a slight adjustment. Because there is another term for the slope, you have to include it in the forecast. Suppose you’re trying to forecast the value m time steps in the future. The forecast is the current level plus m times the current slope, which is essentially the formula for a line.

However, what if your time series doesn’t have a linear trend, but rather some sort of seasonality? For this, you’ll need yet another EWMA. Holt’s student, Peter Winters, extended his teacher’s model by introducing an additional term to factor in seasonality. This model, with level, trend, and seasonal components, is known as Holt-Winters. It is also referred to as triple exponential smoothing. Now there’s another term to track, the seasonal component, which depends on the period of the seasonality. That period has to be known in advance.
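Both models can be sketched compactly. The code below is an illustration under stated assumptions, not a definitive implementation. The function names, the weight names `alpha`, `beta`, and `gamma`, and the way the level, slope, and seasonal terms are initialized are all choices made for this example. The seasonal version uses the additive form of Holt-Winters, which is also an assumption, since the text above does not say which form is meant.

```python
def holt_forecast(values, alpha, beta, m=1):
    """Double exponential smoothing (Holt): forecast m steps ahead."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    level = float(values[0])
    trend = float(values[1] - values[0])  # assumed initial slope
    for x in values[1:]:
        previous_level = level
        # One EWMA tracks the level, a second tracks the slope.
        level = alpha * x + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level + m * trend  # the formula for a line


def holt_winters_forecast(values, alpha, beta, gamma, period, m=1):
    """Additive triple exponential smoothing: forecast m steps ahead."""
    if len(values) < 2 * period:
        raise ValueError("need at least two full seasons")
    # Assumed initialization: season averages give the level and slope,
    # and first-season deviations give the seasonal terms.
    first = sum(values[:period]) / period
    second = sum(values[period:2 * period]) / period
    level = first
    trend = (second - first) / period
    seasonal = [x - first for x in values[:period]]
    for t in range(period, len(values)):
        x = values[t]
        previous_level = level
        s = seasonal[t % period]
        level = alpha * (x - s) + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        # The third EWMA tracks the seasonal term for this phase.
        seasonal[t % period] = gamma * (x - level) + (1 - gamma) * s
    phase = (len(values) - 1 + m) % period
    return level + m * trend + seasonal[phase]


# Made-up data: a clean line, and a clean pattern that repeats every 4 steps.
linear = [10 + 2 * t for t in range(10)]
print(holt_forecast(linear, alpha=0.5, beta=0.5, m=3))

periodic = [1, 3, 2, 6] * 3
print(holt_winters_forecast(periodic, 0.3, 0.2, 0.4, period=4, m=1))
```

On the clean line, the Holt forecast lands on the continuation of the line. On the clean repeating pattern, the Holt-Winters forecast reproduces the next value in the cycle. Real data is noisier, and the results depend heavily on the chosen weights and on the initialization.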

## Summary

Real-time anomaly detection is really a forecasting problem, since you can’t know what to expect in the present unless you use the past to forecast. Forecasting time series data can get really sophisticated and complicated, but a lot of simple and efficient techniques like an EWMA can give most of the benefit with a small fraction of the cost, effort, and complexity. More complex techniques can be good for very specific cases, but come at the cost of losing generality and requiring a lot more tweaking and parameter selection, which can be surprisingly delicate to do well.

Updated 6/22/2017
