# Analyzing Complex Seasonal Patterns

When time series data are available in frequencies higher than quarterly or monthly, many forecasting programs hit a wall in analyzing seasonal effects.

Researchers at Monash University in Australia published an interesting paper in the Journal of the American Statistical Association (JASA), along with an R program, to handle this situation – what can be called “complex seasonality.”

I’ve updated and modified one of their computations – using weekly, instead of daily, data on US conventional gasoline prices – and find the whole thing pretty intriguing.

If you look at the color codes in the legend below the chart, it’s a little easier to read and understand.

Here’s what I did.

I grabbed the conventional weekly US gasoline prices from FRED. These prices are for “regular” – the plain vanilla choice at the pump. After looking over the earlier data, I set the start date to the first week of 2000. Then I used tbats(.) from the Hyndman forecast package, which readers familiar with this site know can be downloaded for use in R, the open source statistical programming language.

Next, I set the end date of a time series I call newGP to the first week of 2012, and forecast ahead using the results of applying tbats(.) to the historic data from 2000:1 to 2012:1. The second number in each date is the week, which runs from 1 to 52. Note that some data scrubbing is needed to shoehorn the gas price data into 52 weeks on a consistent basis. I averaged each “week 53” with the nearest acceptable week (either week 52 or week 1 of the next year), and then dropped the week 53s.
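A rough sketch of the week-53 scrubbing step, written in Python rather than R, might look like the following. It is only an illustration: the function name, the tuple layout, and the prices are made up, and it assumes the simpler of the two options, folding each week 53 into week 52 of the same year.

```python
def collapse_week_53(observations):
    """Fold each week-53 value into week 52 of the same year.

    `observations` is a date-ordered list of (year, week, price) tuples.
    Assumption for this sketch: week 53 is always averaged with the
    preceding week 52, never with week 1 of the following year.
    """
    cleaned = []
    for year, week, price in observations:
        if week == 53:
            if not cleaned or cleaned[-1][:2] != (year, 52):
                raise ValueError("week 53 must follow week 52 of the same year")
            # Replace week 52 with the average of weeks 52 and 53.
            cleaned[-1] = (year, 52, (cleaned[-1][2] + price) / 2.0)
        else:
            cleaned.append((year, week, price))
    return cleaned


# Made-up prices, purely for illustration.
sample = [(2004, 51, 1.80), (2004, 52, 1.78), (2004, 53, 1.74), (2005, 1, 1.75)]
cleaned_sample = collapse_week_53(sample)
```

After this step every year has exactly 52 observations, which is what a weekly series with a fixed seasonal period of 52 needs.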

The forecast for 104 weeks is shown by the solid red line in the chart above.

This actually looks promising, as if it might encode some useful information for, say, US transportation agencies.

A draft of the JASA paper is available as a PDF download. It’s called “Forecasting time series with complex seasonal patterns using exponential smoothing” and, in addition to daily US gas prices, analyzes daily electricity demand in Turkey and bank call center data.

I’m only going part of the way to analyzing the gas price data, since I have not taken on daily data yet. But the seasonal pattern identified by tbats(.) from the weekly data is interesting and is shown below.

The weekly frequency may enable us to “get inside” a mid-year wobble in the pattern with some precision. Judging from the out-of-sample performance of the model, this “wobble” can in some cases be accentuated and be quite significant.

Trigonometric series fit to the higher frequency data extract the seasonal patterns in tbats(.), which also offers other advanced features, such as a capability for estimating ARMA (autoregressive moving average) models for the residuals.
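To give a feel for what fitting trigonometric series means, here is a minimal Python sketch that recovers sine and cosine coefficients from a series covering a whole number of seasonal cycles. It is a fixed-coefficient simplification, not the tbats(.) algorithm, which works with a state space form in which the seasonal terms can evolve over time. The function names and the synthetic series are assumptions made for the example.

```python
import math


def harmonic_coefficients(y, period, n_harmonics):
    """Estimate the mean and (cosine, sine) coefficients of a seasonal pattern.

    Assumes len(y) is a whole number of periods and n_harmonics < period / 2,
    in which case these projections coincide with least-squares estimates.
    """
    n = len(y)
    if n % period != 0:
        raise ValueError("series length must be a whole number of periods")
    if n_harmonics >= period / 2:
        raise ValueError("n_harmonics must be less than period / 2")
    mean = sum(y) / n
    coeffs = []
    for k in range(1, n_harmonics + 1):
        a_k = 2.0 / n * sum(v * math.cos(2 * math.pi * k * t / period)
                            for t, v in enumerate(y))
        b_k = 2.0 / n * sum(v * math.sin(2 * math.pi * k * t / period)
                            for t, v in enumerate(y))
        coeffs.append((a_k, b_k))
    return mean, coeffs


def seasonal_pattern(coeffs, period):
    """Rebuild one cycle of the seasonal component from the coefficients."""
    return [sum(a * math.cos(2 * math.pi * k * t / period)
                + b * math.sin(2 * math.pi * k * t / period)
                for k, (a, b) in enumerate(coeffs, start=1))
            for t in range(period)]


# Synthetic two-year weekly series with a known pattern (not gas prices).
period = 52
y = [3.0 + 2.0 * math.cos(2 * math.pi * t / period)
     + 0.5 * math.sin(2 * math.pi * 2 * t / period) for t in range(2 * period)]
mean, coeffs = harmonic_coefficients(y, period, n_harmonics=2)
pattern = seasonal_pattern(coeffs, period)
```

A handful of harmonics is often enough to describe a smooth annual pattern in weekly data, which is far more economical than estimating 52 separate seasonal factors.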

I’m not fully optimizing the estimation, but these results are sufficiently strong to encourage exploring the toggles and switches on the routine.

Another routine which works at this level of aggregation is stlf(.). It uses the STL decomposition, described in some detail in Chapter 36, “Patterns Discovery Based on Time-Series Decomposition,” in a collection of essays on data mining.

Thoughts

Good forecasting software elicits a sort of addictive behavior when initial applications of routines seem promising. How much better can the out-of-sample forecasts be made with optimization of the features of the routine? How well does the routine do when you look at several past periods? There is even the possibility of extracting further information from the residuals through bootstrapping or bagging at some point. I think there is no other way than exhaustive exploration.

The payoff to the forecaster is the amazement of his or her managers when features of a forecast turn out to be spot-on, prescient, or what have you – and this does happen with good software. One alternative to the Hyndman R Forecast package is STAMP, a program I am also exploring. STAMP has been around for many years, with a version running – get this – on DOS, which appears to have had more features than the current Windows incarnation. In any case, I remember getting a “gee whiz” reaction from the executive of a regional bus district once, relating to ridership forecasts. So it’s fun to wring every possible pattern from the data.

# Inflation/Deflation – 3

Business forecasters often do not directly forecast inflation, but usually are consumers of inflation forecasts from specialized research organizations.

But there is a level of literacy that is good to achieve on the subject – something a quick study of recent, authoritative sources can convey.

A good place to start is the following chart of US Consumer Price Index (CPI) and the GDP price index, both expressed in terms of year-over-year (yoy) percentage changes. The source is the St. Louis Federal Reserve FRED data site.
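For readers who want to reproduce the year-over-year transformation from a raw price index, a minimal Python sketch follows. The function name and the index values are made up for illustration and are not FRED data.

```python
def yoy_percent_change(index, periods_per_year=12):
    """Year-over-year percentage change of a price index.

    Each value is compared with the value one year earlier, so the first
    `periods_per_year` observations have no yoy figure and are dropped.
    """
    return [100.0 * (index[t] / index[t - periods_per_year] - 1.0)
            for t in range(periods_per_year, len(index))]


# Made-up quarterly index values, purely for illustration.
quarterly_index = [100.0, 101.0, 102.0, 103.0, 104.0, 105.04]
yoy = yoy_percent_change(quarterly_index, periods_per_year=4)
```

Use `periods_per_year=12` for a monthly series such as the CPI and `periods_per_year=4` for a quarterly series such as the GDP price index.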

The immediate post-WW II period and the 1970’s and 1980’s saw surging inflation. Since somewhere in the 1980’s and certainly after the early 1990’s, inflation has been on a downward trend.

Some Stylized Facts About Forecasting Inflation

James Stock and Mark Watson wrote an influential NBER (National Bureau of Economic Research) paper in 2006 titled “Why Has US Inflation Become Harder to Forecast?”

These authors point out that the rate of price inflation in the United States has become both harder and easier to forecast, depending on one’s point of view.

On the one hand, inflation (along with many other macroeconomic time series) is much less volatile than it was in the 1970s or early 1980s, and the root mean squared error of naïve inflation forecasts has declined sharply since the mid-1980s. In this sense, inflation has become easier to forecast: the risk of inflation forecasts, as measured by mean squared forecast errors (MSFE), has fallen.

On the other hand, multivariate forecasting models inspired by economic theory – such as the Phillips curve – lose ground to univariate forecasting models after the middle 1980’s or early 1990’s. The Phillips curve, of course, postulates a tradeoff between inflation and economic activity and is typically parameterized in terms of inflationary expectations and the gap between potential and actual GDP.

A more recent paper by Faust and Wright, “Forecasting Inflation,” evaluates sixteen inflation forecast models and some judgmental projections. Root mean square prediction errors (RMSE’s) are calculated in quasi-realtime recursive out-of-sample data – basically what I would call “backcasts.” In other words, the historic data is divided into training and test samples. The models are estimated on the various possible training samples (involving, in this case, consecutive data) and forecasts from these estimated models are matched against the out-of-sample or test data.

The study suggests four principles.

1. Subjective forecasts do the best.
2. Good forecasts must account for a slowly varying local mean.
3. The Nowcast is important and typically utilizes different techniques than standard forecasting.
4. Heavy shrinkage in the use of information improves inflation forecasts.

Interestingly, this study finds that judgmental forecasts (private sector surveys and the Greenbook) are remarkably hard to beat. Otherwise, most of the forecasting models fail to consistently trump a “naïve forecast” which is the average inflation rate over four previous periods.
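The recursive out-of-sample evaluation described above, together with the four-period-average benchmark, can be sketched in a few lines of Python. This is a simplified illustration, not the procedure from the paper: the function names and the inflation figures are made up, and the sketch ignores data revisions, so it is not a true quasi-realtime exercise.

```python
import math


def recursive_rmse(series, forecaster, min_train, horizon=1):
    """RMSE of forecasts made from successively longer training samples.

    For each cut-off point, the forecaster sees only series[:end] and its
    forecast is compared with the value `horizon` steps past the cut-off.
    """
    errors = []
    for end in range(min_train, len(series) - horizon + 1):
        train = series[:end]
        actual = series[end + horizon - 1]
        errors.append(forecaster(train, horizon) - actual)
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def last_value(train, horizon):
    """Random walk forecast: repeat the most recent observation."""
    return train[-1]


def mean_of_last_four(train, horizon):
    """Naive benchmark: average of the four most recent observations."""
    recent = train[-4:]
    return sum(recent) / len(recent)


# Made-up inflation rates in percent, purely for illustration.
inflation = [2.0, 2.5, 3.0, 2.5, 2.0, 2.5, 3.0, 2.5]
rmse_last = recursive_rmse(inflation, last_value, min_train=4)
rmse_avg = recursive_rmse(inflation, mean_of_last_four, min_train=4)
```

Any candidate model can be dropped in as the `forecaster` argument, provided it is re-estimated only on the training sample it is handed. That restriction is what keeps the comparison honest.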

What This Means

I’m willing to offer interpretations of these findings in terms of (a) the resilience of random walk models, and (b) the eclipse of unionized labor in the US.

The success of forecasting inflation as an average of several previous values suggests the underlying stochastic process is some type of random walk. The optimal forecast for a simple random walk is the most recently observed value. The optimal forecast for a random walk with noise is an exponentially weighted average of the past values of the series.

The random walk is a recurring theme in many macroeconomic forecasting contexts. It’s hard to beat.
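The two forecasting rules just mentioned are easy to write down. Here is a minimal Python sketch, with the smoothing weight `alpha` chosen arbitrarily for illustration (in practice it would be estimated from the data) and with the level initialized at the first observation, which is one common convention among several.

```python
def random_walk_forecast(history):
    """Forecast for a simple random walk: the most recent observation."""
    return history[-1]


def exponentially_weighted_forecast(history, alpha):
    """Forecast for a random walk observed with noise.

    The level is updated as alpha * new value + (1 - alpha) * old level,
    so past values receive geometrically declining weights.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = history[0]  # assumption: start the level at the first value
    for value in history[1:]:
        level = alpha * value + (1.0 - alpha) * level
    return level


# Made-up values, purely for illustration.
history = [2.0, 3.0, 4.0]
rw = random_walk_forecast(history)
ewma = exponentially_weighted_forecast(history, alpha=0.5)
```

Setting `alpha` to 1 puts all the weight on the latest observation, so the exponentially weighted forecast collapses to the simple random walk forecast. Smaller values of `alpha` spread the weight further back in time.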

As far as the Phillips curve goes, it’s not clear to me that the same types of tradeoffs between inflation and unemployment exist in the contemporary US economy as existed in, say, the 1950’s or 1960’s. The difference, I would guess, is the lower membership in and weaker power of unions. After the 1980’s, things began to change significantly on the labor front. Companies exacted concessions from unions, holding out the risk that the manufacturing operation might be moved abroad to a lower wage area, for instance. And manufacturing employment, the core of the old union power, fell precipitously.

As far as the potency of subjective forecasts – I’ll let Faust and Wright handle that. While these researchers find what they call subjective forecasts beat almost all the formal modeling approaches, I’ve seen other evaluations calling into question whether any inflation forecast beats a random walk approach consistently. I’ll have to dig out the references to make this stick.