Predicting the S&P 500 or the SPY Exchange-Traded Fund

By some lights, predicting the stock market is the ultimate challenge. Tremendous resources are dedicated to it – pundits on TV, specialized trading programs, PhDs doing high-end quantitative analysis in hedge funds. And then, of course, theories of “rational expectations” and “efficient markets” deny the possibility of any consistent success at stock market prediction, on the grounds that stock prices are basically random walks.

I had not dabbled much in forecasting the market until about two months ago, when I grabbed a bunch of data on the S&P 500 and tried some regressions with lags on S&P 500 daily returns and daily returns from the VIX volatility index.

What I discovered is completely replicable and, so far as I can see, not widely known.

An autoregressive time series model of S&P 500 or SPY daily returns, built with data from 1993 to early 2008, can outperform a Buy & Hold strategy initiated with out-of-sample data beginning January 2008 and carrying through to recent days.

Here is a comparison of cumulative gains from a Buy & Hold strategy initiated January 23, 2008 with a Trading Strategy informed by my autoregressive (AR) model.


So, reading this chart, $1000 invested on January 23, 2008 and left untouched grows to $1586.84 – that’s the Buy & Hold strategy.

The AR trading model, however, turns the same $1000 into $2097 over this period.

The trading program based on the autoregressive model I am presenting here works like this. The AR model predicts the next day’s return for the SPY, based on the model coefficients (which I detail below) and the daily returns through the current day. If there is an element of unrealism, it is that the model uses daily returns computed from closing values. Obviously, you have to trade before the closing bell (in standard trading), so you need an estimate of the current day’s closing value, obtained very close to the bell, before deciding whether to buy, hold, or sell SPY for the next day’s action.

But basically, assuming we can do this, perhaps seconds before the bell, and come close to an estimate of the current day closing price – the AR trading program is to buy SPY if the next day’s return is predicted to be positive – or if you currently hold SPY, to continue holding it. If the next day’s return is predicted to be negative, you sell your holdings.

It’s as simple as that.
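As a rough sketch of the rule just described, the following Python compares the two account paths. It rests on assumptions not stated in the post: no transaction costs, cash earns nothing while out of the market, and a forecast of exactly zero is treated like a negative one. The function names and the sample numbers are illustrative only.

```python
def run_strategy(predicted, actual, start_value=1000.0):
    """Hold SPY on days whose predicted return is positive; otherwise sit in cash.

    predicted[i] is the forecast made near the close of day i-1 for day i.
    actual[i] is the realized simple return of day i (0.01 means +1%).
    Returns the account value after each day.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    value = start_value
    path = []
    for forecast, realized in zip(predicted, actual):
        if forecast > 0:
            # Predicted up day: buy, or keep holding, and earn the day's return.
            value *= 1.0 + realized
        # Otherwise the position was sold at the prior close; value is unchanged.
        path.append(value)
    return path


def buy_and_hold(actual, start_value=1000.0):
    """Account value after each day when the position is never touched."""
    value = start_value
    path = []
    for realized in actual:
        value *= 1.0 + realized
        path.append(value)
    return path


# Made-up three-day example, not data from the post.
actual = [0.01, -0.02, 0.005]
predicted = [0.002, -0.001, 0.003]
print(run_strategy(predicted, actual)[-1], buy_and_hold(actual)[-1])
```

In this toy example the strategy sidesteps the one down day, which is the whole source of any edge over Buy & Hold; a wrong negative forecast on an up day costs it just as much.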

So the AR model predicts daily returns on a one-day-ahead basis, using information on daily returns through the current trading day, plus the model coefficients.
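A one-day-ahead AR forecast is just a constant plus a weighted sum of the most recent returns. Here is a minimal sketch, assuming the usual AR(p) form; the constant and coefficients below are placeholders, not the estimates from the post, and `ar_forecast` is a name chosen for illustration.

```python
def ar_forecast(returns, const, coefs):
    """One-day-ahead AR(p) forecast.

    returns: daily returns in chronological order, most recent last.
    coefs[k-1] multiplies the return from k days back (lag k).
    """
    p = len(coefs)
    if len(returns) < p:
        raise ValueError("need at least p past returns")
    # Reverse the last p returns so that recent[0] is lag 1, recent[1] is lag 2, ...
    recent = returns[len(returns) - p:][::-1]
    return const + sum(c * r for c, r in zip(coefs, recent))


# Placeholder AR(2) example; the model in the post uses 30 lags.
forecast = ar_forecast([0.01, -0.02], const=0.0005, coefs=[-0.05, -0.03])
print(forecast)
```

Only the sign of the forecast matters for the trading decision, so small errors in its magnitude do no harm unless they flip the sign.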

Speaking of which, here are the coefficients from the Matlab “printout.”


There are a couple of nuances here. First, these parameter values do not derive from an ordinary least squares (OLS) regression. Instead, they are produced by maximum likelihood estimation, assuming the underlying distribution is a t-distribution (not a Gaussian distribution).

The use of a t-distribution, the idea of which I got to some extent from Nassim Taleb’s new text-in-progress mentioned two posts ago, is motivated by the unusual distribution of residuals of an OLS regression of lagged daily returns.

The proof is in the pudding here, too, since the above coefficients work better than ones developed on the (manifestly incorrect) assumption that the underlying error distribution is Gaussian.
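To make the difference between the two error assumptions concrete, here is a sketch of the objective a maximum likelihood fit with t-distributed errors would minimize, next to the Gaussian log density. This is not Matlab's estimation routine; the parameterization (degrees of freedom `nu`, a `scale` parameter, zero-mean errors) and all function names are assumptions made for illustration, and an actual fit would hand `neg_loglik_t` to a numerical optimizer.

```python
import math


def student_t_logpdf(x, nu, scale):
    """Log density of a zero-mean Student-t with nu degrees of freedom and given scale."""
    z = x / scale
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi) - math.log(scale)
            - (nu + 1) / 2 * math.log1p(z * z / nu))


def gaussian_logpdf(x, sigma):
    """Log density of a zero-mean Gaussian with standard deviation sigma."""
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * (x / sigma) ** 2


def ar_residuals(returns, const, coefs):
    """Residuals r_t - (const + sum_k coefs[k-1] * r_{t-k}) for every t with p lags available."""
    p = len(coefs)
    out = []
    for t in range(p, len(returns)):
        fitted = const + sum(coefs[k] * returns[t - 1 - k] for k in range(p))
        out.append(returns[t] - fitted)
    return out


def neg_loglik_t(returns, const, coefs, nu, scale):
    """Negative log-likelihood of the AR residuals under Student-t errors."""
    return -sum(student_t_logpdf(e, nu, scale)
                for e in ar_residuals(returns, const, coefs))


# A residual six scale units out is far less "surprising" under a t with 4 d.o.f.
print(student_t_logpdf(6.0, 4, 1.0), gaussian_logpdf(6.0, 1.0))
```

Because the t density penalizes extreme residuals much less than the Gaussian does, a handful of crash days pulls the fitted coefficients around less.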

Here is a graph of the 30-day moving averages of the proportion of signs of daily returns correctly predicted by this model.


Overall, about 53 percent of the signs of the daily returns in this out-of-sample period are predicted correctly.
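The sign-accuracy figures can be computed along these lines. This is a sketch under one assumption the post does not settle: a return of exactly zero is counted as "not positive". The names `sign_hits` and `rolling_mean` are illustrative.

```python
def sign_hits(predicted, actual):
    """1 where forecast and realized return fall on the same side of zero, else 0."""
    return [1 if (p > 0) == (a > 0) else 0 for p, a in zip(predicted, actual)]


def rolling_mean(values, window=30):
    """Trailing moving average; the first output covers the first full window."""
    if window <= 0:
        raise ValueError("window must be positive")
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]   # drop the value that left the window
        if i >= window - 1:
            out.append(running / window)
    return out


# Made-up example with a 2-day window instead of 30.
hits = sign_hits([0.01, -0.01, 0.02, 0.01], [0.02, 0.01, -0.01, 0.03])
print(hits, rolling_mean(hits, window=2))
```

The overall hit rate is then `sum(hits) / len(hits)`, and `rolling_mean(hits, 30)` gives the 30-day series of the kind plotted above.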

If you look at this graph, too, it’s clear there are some differences in performance over this period. Thus, the accuracy of the model took a dive in 2009, in the depths of the Great Recession. And, model performance achieved significantly higher success proportions in 2012 and early 2013, perhaps related to markets getting used to money being poured in by the Fed’s policies of quantitative easing.

Why This AR Model is Such a Big Deal

I find it surprising that a set of fixed coefficients applied to the past 30 values of the SPY daily returns continues to predict effectively, months and years after the end of the in-sample values.

And, I might add, it’s not clear that updating the AR model always improves the outcomes, although I can do more work on this and also on the optimal sample period generally.

Can this be a matter of pure chance? This has to be considered, but I don’t think so. Monte Carlo simulations of randomized trading indicate that there is a 95 percent chance or better that returns of $2097 in this period are not due to chance. In other words, if I decide to trade on a day based on a flip of a fair coin, heads I buy, tails I sell at the end of the day, it’s highly unlikely I will generate cumulative returns of $2097, given the SPY returns over this period.
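The post does not spell out how the simulation was set up, so the following is only one plausible reading of the coin-flip description: on each day a fair coin decides whether the account is in SPY, and the share of random runs that end at or above a target value is counted. The function names, `n_runs`, and `seed` are assumptions.

```python
import random


def coin_flip_final_value(actual, rng, start_value=1000.0):
    """Final account value when a fair coin decides, day by day, whether to hold SPY."""
    value = start_value
    for realized in actual:
        if rng.random() < 0.5:        # heads: in the market for this day
            value *= 1.0 + realized
    return value


def share_of_random_runs_reaching(actual, target_value, n_runs=10000, seed=0):
    """Fraction of coin-flip runs whose final value is at least target_value."""
    rng = random.Random(seed)
    wins = sum(coin_flip_final_value(actual, rng) >= target_value
               for _ in range(n_runs))
    return wins / n_runs


# Toy example: five +1% days, so only an all-heads run reaches $1050.
print(share_of_random_runs_reaching([0.01] * 5, 1050.0))
```

With the actual out-of-sample SPY returns as `actual` and 2097 as the target, a share below 0.05 would correspond to the 95 percent figure quoted above.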

The performance of this trading model holds up fairly well through December of last year, but degrades some in the first days of 2014.

I think this is a feather in the cap of forecasting, so to speak. Also, it seems to me that economists promoting ideas of market efficiency and rational expectations need to take these findings into account. Everything is extant. I have provided the coefficients. You can get the SPY daily return values from Yahoo Finance. You can calculate everything yourself to check. I’ve done this several times, slightly differently each time. This time I used Matlab, and its arima estimation procedures work well.

I’m not quite sure what to make of all this, but I think it’s important. Naturally, I am extending these results in my personal model-building, and I can report that extensions are possible. At the same time, no extension of this model I have seen achieves more than about 60 percent accuracy in predicting the direction of change or sign of the daily returns, so you are going to lose money sometimes applying these models. Day-trading is a risky business.

4 thoughts on “Predicting the S&P 500 or the SPY Exchange-Traded Fund”

  1. It seems to me that in a rising market these returns are possible, but that you may want to test an indicator that suggests going short at certain times. It would be a counterbalance to a long/hold/sell-only strategy in a declining market.

  2. One comment is that you have a lot of lags (implying system memory going back over a month with daily data excluding weekends), and many of those lags have very low t-stats (relating to significance). In fact, most of them are marginally significant (and imply real-world things about the memory of past behavior in the stock market that may be hard to justify). If a 7-lag model performs about as well as a 30-lag model, you should ALWAYS use the 7-lag model as it will be more generally robust. Exposing more variables to a model invariably improves back-fit but far more often than not harms predictive accuracy and model robustness (generalizability under broader data conditions) by introduction of parametric co-linearity – a property that is not widely understood.

    [And as an aside - even with proper out-of-sample analysis you are still exposed to the risk inherent to the problem of induction so frequently invoked by Dr. Taleb. A proper out-of-sample analysis can tell you about your "current model error", but not necessarily about "future model error". Understanding the difference and the inherent risks in employing predictive models is VERY important before putting your money on the line]

    I would be very interested to see this exercise repeated with several lag models (maybe 7-, a 14-, and a 28-lag variants) with a strict “out of sample” policy (the ML estimation leverages only the first half of the data, the prediction measurement uses only the second).

    This approach also assumes stationarity of the coefficients (the lag coefficients are fixed over all time). You can probably do even better still with an adaptive model if behavior varies slowly over time (the Kalman filter is one possible approach, although it relies very heavily on estimation of observational volatility as well as assumptions about the “degree of non-stationarity” inherent to the system behavior, and so exercise great caution with regard to all of the statements about using historical volatility, normal assumptions, rare events, etc.). An adaptive AR model has the advantage of incorporating more recent data progressively as time proceeds. Even if the stock market is model-able as an AR(n) process, it’s surely not stationary (and in fact is often used in foundational statistics classes as an example of a non-stationary process). Over the long haul an adaptive approach will almost surely perform better. If you want to approximately simulate this, generate the lag coefficients using only, say, the two preceding years and progressively update them every N prediction steps (maybe once a month for simplicity). You would of course maintain the “blind holdout” policy in so doing.
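The rolling re-estimation suggested in this comment amounts to a schedule of training and prediction windows. A minimal sketch follows, assuming roughly 252 trading days per year (so 504 for two years) and 21 per month; those defaults and the name `walk_forward_windows` are illustrative, and the model fitting itself is left out.

```python
def walk_forward_windows(n_obs, train_len=504, step=21):
    """Yield (train_start, train_end, test_start, test_end) index ranges, end-exclusive.

    Each model is fit on the train_len observations immediately before its
    test block and then used, unchanged, for the next `step` predictions,
    so no prediction ever sees data from its own block or later.
    """
    if train_len <= 0 or step <= 0:
        raise ValueError("train_len and step must be positive")
    test_start = train_len
    while test_start < n_obs:
        test_end = min(test_start + step, n_obs)
        yield (test_start - train_len, test_start, test_start, test_end)
        test_start = test_end


# Small example: 10 observations, 4-day training window, refit every 3 days.
for window in walk_forward_windows(10, train_len=4, step=3):
    print(window)
```

Each tuple says which slice to estimate coefficients on and which slice to forecast with them, which keeps the “blind holdout” policy intact by construction.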
