Moving Average
Nathan Kamgang
When we look at data, we are often not interested in every individual fluctuation. We want to see the bigger picture: is this growing? Is this declining? Where is it heading? The moving average is one of the simplest tools we have to answer that question. Before we define it formally, let us build the intuition from the ground up.
Consider these three data points: 10, 20, 10.
The 20 is a bump, a sudden increase in our data. Now average the three values.
$$10 + 20 + 10 = 40$$
$$\text{Average} = \frac{40}{3} \approx 13.3$$
That average value of 13.3 seems less bumpy than our original data, which went all the way up to 20. Such a simple example suffices to illustrate a key property of the average: it absorbs variability in the data, smoothing out the peaks and valleys.
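The same arithmetic can be checked in a few lines of Python. This is only a small sketch; the variable names (`points`, `bump_height`, `average_height`) are chosen here for illustration and are not part of the original example.

```python
from statistics import mean

points = [10, 20, 10]
average = mean(points)  # 40 / 3

# How far the bump rises above the lowest point, versus how far the average does
bump_height = max(points) - min(points)   # 20 - 10
average_height = average - min(points)    # 40/3 - 10

print(round(average, 1), bump_height, round(average_height, 1))
# 13.3 10 3.3
```

The bump stands 10 above its neighbours, while the average stands only about 3.3 above them: most of the spike has been absorbed.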
Now consider a slightly more complex sequence:
$$3, 7, 5, 10, 8, 13, 11, 16, 14, 19$$
This data moves up and down quite a bit. To get a clearer picture of what is going on, we can compute a moving average. The idea is straightforward: instead of averaging the whole dataset at once, we slide a window of fixed size across it and compute a local average at each position. With a window of size 3, each smoothed value is simply the average of three consecutive points.
Here is the computation step by step:
$$MA_1 = \frac{3+7+5}{3} = 5.0$$
$$MA_2 = \frac{7+5+10}{3} \approx 7.3$$
$$MA_3 = \frac{5+10+8}{3} \approx 7.7$$
$$MA_4 = \frac{10+8+13}{3} \approx 10.3$$
$$MA_5 = \frac{8+13+11}{3} \approx 10.7$$
$$MA_6 = \frac{13+11+16}{3} \approx 13.3$$
$$MA_7 = \frac{11+16+14}{3} \approx 13.7$$
$$MA_8 = \frac{16+14+19}{3} \approx 16.3$$
Placing both series side by side:
$$\text{Original}: \quad 3,\ 7,\ 5,\ 10,\ 8,\ 13,\ 11,\ 16,\ 14,\ 19$$
$$\text{Moving Average}: \quad 5.0,\ 7.3,\ 7.7,\ 10.3,\ 10.7,\ 13.3,\ 13.7,\ 16.3$$
What do you notice when you compare our moving average with our original data?
The moving average rises at every step, while the original data moves up and down erratically. The moving average reveals the overall trend in the data while hiding short-term variation, because averaging absorbs noise and variability: the bumps we described at the start.
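The window-of-3 computation above can be sketched in Python as follows. The function names `moving_average` and `is_strictly_increasing` are illustrative choices made here, not part of the original text, and this is one straightforward way to write it rather than the only one.

```python
def moving_average(values, window):
    """Return the simple moving average of `values` for a given window size."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    # One average per window position: len(values) - window + 1 of them
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def is_strictly_increasing(values):
    """True if every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


data = [3, 7, 5, 10, 8, 13, 11, 16, 14, 19]
smoothed = moving_average(data, 3)

print([round(x, 1) for x in smoothed])
# [5.0, 7.3, 7.7, 10.3, 10.7, 13.3, 13.7, 16.3]
print(is_strictly_increasing(data), is_strictly_increasing(smoothed))
# False True
```

Ten data points with a window of 3 yield eight averages, and the final check confirms the observation: the raw series is not monotonic, but its moving average is.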
Now consider quarterly sales data for a toy company (in thousands of units):
$$
\begin{array}{|c|c|c|}
\hline
\textbf{Quarter} & \textbf{Sales} & \textbf{Moving Average} \\
\hline
\text{Q1 2022} & 12 & \\
\text{Q2 2022} & 15 & \\
\text{Q3 2022} & 14 & \\
\text{Q4 2022} & 38 & 19.8 \\
\text{Q1 2023} & 16 & 20.8 \\
\text{Q2 2023} & 19 & 21.8 \\
\text{Q3 2023} & 18 & 22.8 \\
\text{Q4 2023} & 45 & 24.5 \\
\hline
\end{array}
$$
We apply a moving average with a window of 4 quarters, since our seasonal cycle repeats every 4 quarters. Each value averages the current quarter and the three before it, which is why the first three rows are empty. The moving average increases steadily and smoothly, while the original data swings up and down dramatically each time the holiday season arrives. The moving average absorbs those sharp seasonal bumps and reveals the underlying growth in sales, exactly as it did in our earlier example with the noisy data. The tool is the same, and so is the reason it works.
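The 4-quarter averages for these sales figures can be computed with a short Python sketch. It assumes a trailing window (the current quarter plus the three before it) and reports `None` where a full window is not yet available. The function name `trailing_average` and the running-sum approach are choices made here for illustration.

```python
def trailing_average(values, window):
    """Average each value with the `window - 1` values before it.

    Positions that do not yet have a full window are reported as None.
    """
    result = []
    running_sum = 0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            running_sum -= values[i - window]  # drop the value that left the window
        result.append(running_sum / window if i >= window - 1 else None)
    return result


quarters = ["Q1 2022", "Q2 2022", "Q3 2022", "Q4 2022",
            "Q1 2023", "Q2 2023", "Q3 2023", "Q4 2023"]
sales = [12, 15, 14, 38, 16, 19, 18, 45]  # thousands of units

for quarter, sold, avg in zip(quarters, sales, trailing_average(sales, 4)):
    shown = "" if avg is None else f"{avg:.2f}"
    print(f"{quarter}  {sold:>3}  {shown}")
```

Keeping a running sum means each step adds one value and removes one, instead of re-adding the whole window. Every window of 4 contains exactly one holiday quarter, which is why the seasonal spike no longer dominates the averaged series.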