The NASDAQ 100 Daily Returns and Laplace Distributed Errors

I once ran into Norman Mailer at the Museum of Modern Art in Manhattan. We were both looking at Picasso’s “Blue Boy” and, recognizing him, I started up some kind of conversation, and Mailer was quite civil about the whole thing.

I mention this because I always associate Mailer with his collection Advertisements for Myself.

And that segues – loosely – into my wish to let you know that, in fact, I developed a generalization of the law of demand for the situation in which a commodity is sold at a schedule of rates and fees, instead of a uniform price. That was in 1987, when I was still a struggling academic and beginning a career in business consulting.

OK, and that relates to a point I want to suggest here. And that is that minor players can have big ideas.

So I recognize an element of “hubris” in suggesting that the error process of S&P 500 daily returns – up to certain transformations – is described by a Laplace distribution.

What about other stock market indexes, then? This morning, I woke up and wondered whether the same thing is true for, say, the NASDAQ 100.

So I downloaded daily closing prices for the NASDAQ 100 from Yahoo Finance dating back to October 1, 1985. Then, I took the natural log of each of these closing prices. After that, I took trading day by trading day differences. So the series I am analyzing comes from the first differences of the natural log of the NASDAQ 100 daily closing prices.
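The transformation just described can be sketched in a few lines. This is only an illustration: the function name `log_returns` and the example prices are made up, and nothing here downloads data from Yahoo Finance.

```python
import numpy as np

def log_returns(closes):
    """First differences of the natural log of a series of closing prices."""
    closes = np.asarray(closes, dtype=float)
    return np.diff(np.log(closes))

# Made-up closing prices, for illustration only.
example_closes = [100.0, 101.0, 99.5, 102.0]
example_returns = log_returns(example_closes)
```

A series of n closing prices yields n - 1 log differences, and their sum equals the log of the last price divided by the first.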

Note that this series of first differences is sometimes cast into a histogram by itself – and this also frequently is a “pointy peaked” relatively symmetric distribution. You could motivate this graph with the idea that stock prices are a random walk. So if you take first differences, you get the random component that generates the random walk.

I am troubled, however, by the fact that this component has considerable structure in and of itself. So I undertake further analysis.

For example, the autocorrelation function of these first differences of the log of NASDAQ 100 daily closing prices looks like this.

Now if you calculate bivariate regressions on these first differences and their lagged values, many of them produce coefficient estimates with t-statistics that exceed the magic value of 2.

Just selecting these significant regressors from the first 47 lags, I get this regression equation.

Now this regression is estimated over all 7200 observations from October 1, 1985 to almost right now.
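For readers who want to try the lag-selection step, here is a rough sketch under stated assumptions. The helper names (`lag_t_stat`, `significant_lags`, `fit_ar_on_lags`) are hypothetical, the t-statistics are the plain OLS ones, and the cutoff of 2 and the 47-lag window are simply taken from the text. It is not a reconstruction of the exact software used for the post.

```python
import numpy as np

def lag_t_stat(r, lag):
    """t-statistic of the slope in an OLS regression of r[t] on r[t-lag]."""
    y, x = r[lag:], r[:-lag]
    xc, yc = x - x.mean(), y - y.mean()
    slope = (xc @ yc) / (xc @ xc)
    resid = yc - slope * xc
    s2 = (resid @ resid) / (len(y) - 2)      # residual variance
    return slope / np.sqrt(s2 / (xc @ xc))

def significant_lags(r, max_lag=47, cutoff=2.0):
    """Lags whose bivariate regression has |t| above the cutoff."""
    r = np.asarray(r, dtype=float)
    return [k for k in range(1, max_lag + 1) if abs(lag_t_stat(r, k)) > cutoff]

def fit_ar_on_lags(r, lags):
    """OLS of r[t] on a constant and the chosen lags; returns (coefs, residuals)."""
    r = np.asarray(r, dtype=float)
    p = max(lags)
    columns = [np.ones(len(r) - p)] + [r[p - k:len(r) - k] for k in lags]
    X = np.column_stack(columns)
    y = r[p:]
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs, y - X @ coefs
```

The residuals returned by `fit_ar_on_lags` are the series whose distribution is examined next. Screening 47 lags one at a time will flag a few by chance alone, so the selected set is best treated as exploratory.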

Graphing the residuals, I get the familiar pointy-peaked distribution that we saw with the S&P 500.

Here is a fit of the Laplace distribution to this curve (Again using EasyFit).

Here are the metrics for this fit and fits to a number of other probability distributions from this program.

I have never seen as clear a linkage of returns from stock indexes and the Laplace distribution (maybe with a slight asymmetry – there are also asymmetric Laplace distributions).

One thing is for sure – the distribution above for the NASDAQ 100 data and the earlier distribution developed for the S&P 500 are not close to being normally distributed. Thus, in the table above, the normal distribution is number 12 on the list of possible candidates identified by EasyFit.
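A Laplace-versus-normal comparison can also be done without EasyFit. The sketch below uses the standard maximum-likelihood estimates for the Laplace (the median for location, the mean absolute deviation from the median for scale) and compares log-likelihoods. Note that this is a different yardstick from the goodness-of-fit statistics EasyFit reports, and the function names are illustrative.

```python
import numpy as np

def laplace_mle(x):
    """Maximum-likelihood location and scale of a Laplace distribution."""
    x = np.asarray(x, dtype=float)
    loc = np.median(x)
    scale = np.mean(np.abs(x - loc))
    return loc, scale

def laplace_loglik(x, loc, scale):
    """Log-likelihood of the data under a Laplace(loc, scale) distribution."""
    x = np.asarray(x, dtype=float)
    return np.sum(-np.log(2.0 * scale) - np.abs(x - loc) / scale)

def normal_loglik(x):
    """Log-likelihood of the data under a normal fitted by maximum likelihood."""
    x = np.asarray(x, dtype=float)
    var = x.var()
    return -0.5 * len(x) * (np.log(2.0 * np.pi * var) + 1.0)
```

Applied to regression residuals, a Laplace log-likelihood well above the normal one points the same way as the ranking in the table. Both models have two fitted parameters, so the raw log-likelihoods are directly comparable.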

Note that “Error,” as listed in the above table, is not the error function related to the normal distribution. Instead, it is another exponential distribution with an absolute value in the exponent, like the Laplace distribution. In fact, it looks like a transformation of the Laplace, but I need to do further investigation. In any case, it’s listed as number 2, even though the metrics show the same numbers.
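One candidate explanation, offered as an assumption rather than something checked against the EasyFit documentation, is that “Error” refers to the exponential power (generalized error) family. In a common parameterization, with location \mu, scale \alpha, and shape \beta, the density is:

```latex
\[
f(x \mid \mu, \alpha, \beta)
  = \frac{\beta}{2\alpha\,\Gamma(1/\beta)}
    \exp\!\left[-\left(\frac{|x-\mu|}{\alpha}\right)^{\beta}\right]
\]
% Special cases:
\[
\beta = 1:\quad f(x) = \frac{1}{2\alpha}\exp\!\left(-\frac{|x-\mu|}{\alpha}\right)
  \quad\text{(Laplace)}
\]
\[
\beta = 2:\quad f(x) = \frac{1}{\alpha\sqrt{\pi}}\exp\!\left(-\frac{(x-\mu)^{2}}{\alpha^{2}}\right)
  \quad\text{(normal with variance } \alpha^{2}/2\text{)}
\]
```

If that reading is right, a fitted shape of 1 would make the “Error” fit coincide with the Laplace fit, which would be consistent with the two showing the same metrics.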

The plot thickens.

Obviously, the next step is to investigate individual stocks with respect to Laplacian errors in this type of transformation.

Also, some people will be interested in whether the autoregressive relationship listed above makes money under the right trading rules. I will report further on that.

Anyway, thanks for your attention. If you have gotten this far – you believe numbers have power. Or you maybe are interested in finance and realize that indirect approaches may be the best shot at getting to something fundamental.

Daily Updates on Whether Key Financial Series Are Going Into Bubble Mode

Financial and asset bubbles are controversial, amazingly enough, in standard economics, where a bubble is defined as a divergence in a market from fundamental value. The problem, of course, is what is fundamental value. Maybe investors in the dot.com frenzy of the late 1990’s believed all the hype about never-ending and accelerating growth in IT, as a result of the Internet.

So we have this chart for the ETF SPY which tracks the S&P500. Now, there are similarities between the upswing of the two previous peaks – which both led to busts – and the current surge in the index.

Where is this going to end?

Well, I’ve followed the research of Didier Sornette and his co-researchers, and, of course, Sornette’s group has an answer to this question, which is “probably not well.” Currently, Professor Sornette occupies the Chair of Entrepreneurial Risk at the Swiss Federal Institute of Technology in Zurich.

There is an excellent website maintained by ETH Zurich for the theory and empirical analysis of financial bubbles.

Sornette and his group view bubbles from a more mathematical perspective, finding similarities in bubbles of durations from months to years in the concept of “faster than exponential growth.” At some point, that is, asset prices embark on this type of trajectory. Because of various feedback mechanisms in financial markets, as well as just herding behavior, asset prices in bubble mode oscillate around an accelerating trajectory which – at some point that Sornette claims can be identified mathematically – becomes unsupportable. At such a moment, there is a critical point where the probability of a collapse or reversal of the process becomes significantly greater.
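The functional form usually associated with this description is the log-periodic power law (LPPL), in which the expected log price is A + B(tc - t)^m + C(tc - t)^m cos(ω ln(tc - t) - φ), with tc the critical time. The sketch below only evaluates that curve. The parameter values a reader plugs in are illustrative, and fitting them to data, which is the hard part, is not attempted.

```python
import numpy as np

def lppl_log_price(t, tc, m, omega, A, B, C, phi):
    """Expected log price at times t < tc under the log-periodic power law."""
    dt = tc - np.asarray(t, dtype=float)     # time remaining to the critical point
    if np.any(dt <= 0):
        raise ValueError("all times must lie before the critical time tc")
    power = dt ** m
    return A + B * power + C * power * np.cos(omega * np.log(dt) - phi)
```

With B negative and m between 0 and 1, the smooth part of the curve rises at an accelerating rate as t approaches tc, which is the “faster than exponential” growth in the log price. The cosine term adds the oscillations that speed up near the critical time.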

This group is on the path of developing a new science of asset bubbles, if you will.

And, by this logic, there are positive and negative bubbles.

The sharp drop in stock prices in 2008, for example, represents a negative stock market bubble movement, and also is governed or described, by this theory, by an underlying differential equation. This differential equation leads to critical points, where the probability of reversal of the downward price movement is significantly greater.

I have decided I am going to compute the full price equation suggested by Sornette and others to see what prediction for a critical point emerges for the S&P 500 or SPY.

But actually, this would be for my own satisfaction, since Sornette’s group already is doing this in the Financial Crisis Observatory.

I hope I am not violating Swiss copyright rules by showing the following image of the current Financial Crisis Observatory page.

As you notice there are World Markets, Commodities, US Sectors, US Large Cap categories and little red and green boxes scattered across the page, by date.

The red boxes indicate computations by the ETH Zurich group that indicate the financial series in question is going into bubble mode. This is meant as a probabilistic evaluation and is accompanied by metrics which indicate the likelihood of a critical point. These computations are revised daily, according to the site.

For example, there is a red box associated with the S&P 500 in late May. If you click on this red box, it produces the following chart.

The implication is that the highest red spike in the chart at the end of December 2013 is associated with a reversal in the index, and also that one would be well-advised to watch for another similar spike coming up.

Negative bubbles, as I mention, also are in the lexicon. One of the green boxes for gold, for example, produces the following chart.

This is fascinating stuff, and although Professor Sornette has gotten some media coverage over the years, even giving a TED talk recently, the economics profession generally seems to have given him almost no attention.

I plan a post on this approach with a worked example. It certainly is much more robust than some other officially sanctioned approaches.

Trend Following in the Stock Market

Noah Smith highlights some amazing research on investor attitudes and behavior in Does trend-chasing explain financial markets?

He cites 2012 research by Greenwood and Shleifer where these researchers consider correlations between investor expectations, as measured by actual investor surveys, and subsequent investor behavior.

A key graphic is the following:

This graph shows, rather amazingly, as Smith points out, that when people say they expect stocks to do well, they actually put money into stocks. How do you find out what investor expectations are? You ask them. Then it’s interesting that it’s possible to show that, for the most part, they follow up attitudes with action.

This discussion caught my eye since Sornette and others attribute the emergence of bubbles to momentum investing or trend-following behavior. Sometimes Sornette reduces this to “herding” or mimicry. I think there are simulation models, combining trend investors with others following a market strategy based on “fundamentals”, which exhibit cumulating and collapsing bubbles.

More on that later, when I track all that down.

For the moment, some research put out by AQR Capital Management in Greenwich CT makes big claims for an investment strategy based on trend following –

The most basic trend-following strategy is time series momentum – going long markets with recent positive returns and shorting those with recent negative returns. Time series momentum has been profitable on average since 1985 for nearly all equity index futures, fixed income futures, commodity futures, and currency forwards. The strategy explains the strong performance of Managed Futures funds from the late 1980s, when fund returns and index data first becomes available.

This paragraph references research by Moskowitz, Ooi, and Pedersen published in the Journal of Financial Economics – an article called Time Series Momentum.
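The rule in the quoted passage, long after positive recent returns and short after negative ones, can be sketched for a single market as follows. This is a bare-bones illustration: the function name and the default 12-period lookback are assumptions, and the volatility scaling and transaction costs that a real implementation would handle are left out.

```python
import numpy as np

def tsmom_returns(returns, lookback=12):
    """Per-period returns of a sign-based time series momentum rule.

    The position for period t is +1 if the sum of the previous `lookback`
    returns is positive, -1 if it is negative, and 0 if it is exactly zero.
    """
    r = np.asarray(returns, dtype=float)
    csum = np.concatenate([[0.0], np.cumsum(r)])
    trailing = csum[lookback:-1] - csum[:-lookback - 1]   # sum of r[t-lookback:t]
    position = np.sign(trailing)
    return position * r[lookback:]
```

The position for each period depends only on returns that came before it, so there is no look-ahead in the signal.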

But more spectacularly, this AQR white paper presents this table of results for a trend-following investment strategy decade-by-decade.

There are caveats to this rather earth-shaking finding, but what it really amounts to for many investors is a recommendation to look into managed futures.

Along those lines there is this video interview, conducted in 2013, with Brian Hurst, one of the authors of the AQR white paper. He reports that recently trend-following investing has run up against “choppy” markets, but holds out hope for the longer term –

At the same time, caveat emptor. Bloomberg reported late last year that a lot of investors plunging into managed futures after the Great Recession of 2008-2009 have been disappointed, in many cases, because of the high, unregulated fees and commissions involved in this type of alternative investment.

Looking ahead, I’m almost sure I want to explore forecasting in the medical field this coming week. Menzie Chinn at Econbrowser, for example, highlights forecasts that suggest states opting out of expanded Medicaid are flirting with higher death rates. This sets off a flurry of comments, highlighting the importance and controversy attached to various forecasts in the field of medical practice.

There’s a lot more – from bizarre and sad mortality trends among Russian men since the collapse of the Soviet Union, now stabilizing to an extent, to systems which forecast epidemics, to, again, cost and utilization forecasts.

Today, however, I want to wind up this phase of posts on forecasting the stock and related financial asset markets.

Market Expectations in the Cross Section of Present Values

That’s the title of Bryan Kelly and Seth Pruitt’s article in the Journal of Finance, downloadable from the Social Science Research Network (SSRN).

The following chart from this paper shows in-sample (IS) and out-of-sample (OOS) performance of Kelly and Pruitt’s new partial least squares (PLS) predictor, and IS and OOS forecasts from another model based on the aggregate book-to-market ratio.

The Kelly-Pruitt PLS predictor is much better, both in-sample and out-of-sample, than the more traditional regression model based on aggregate book-to-market ratios.

What Kelly and Pruitt do is use what I would call cross-sectional time series data to estimate aggregate market returns.

Basically, they construct a single factor which they use to predict aggregate market returns from cross-sections of portfolio-level book-to-market ratios.

So,

To harness disaggregated information we represent the cross section of asset-specific book-to-market ratios as a dynamic latent factor model. We relate these disaggregated value ratios to aggregate expected market returns and cash flow growth. Our model highlights the idea that the same dynamic state variables driving aggregate expectations also govern the dynamics of the entire panel of asset-specific valuation ratios. This representation allows us to exploit rich cross-sectional information to extract precise estimates of market expectations.

This cross-sectional data presents a “many predictors” type of estimation problem, and the authors write that,

Our solution is to use partial least squares (PLS, Wold (1975)), which is a simple regression-based procedure designed to parsimoniously forecast a single time series using a large panel of predictors. We use it to construct a univariate forecaster for market returns (or dividend growth) that is a linear combination of assets’ valuation ratios. The weight of each asset in this linear combination is based on the covariance of its value ratio with the forecast target.

I think it is important to add that the authors extensively explore PLS as a procedure which can be considered to be built from a series of cross-cutting regressions, as it were (See their white paper on three-pass regression filter).
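To make the “series of regressions” idea concrete, here is a rough single-factor sketch in three passes. It assumes the rows of `X` are already aligned so that row t holds the predictors used to forecast `y[t]`, it returns in-sample fitted values only, and the function name is hypothetical. It is meant as an illustration of the idea, not a reproduction of Kelly and Pruitt’s estimator or their out-of-sample procedure.

```python
import numpy as np

def one_factor_pls_forecast(X, y):
    """In-sample fitted values from a single-factor, three-pass PLS sketch.

    X : (T, N) array of predictors (e.g. portfolio valuation ratios).
    y : (T,) array of the forecast target aligned with the rows of X.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Pass 1: time-series slope of each predictor on the target.
    phi = (Xc.T @ yc) / (yc @ yc)
    # Pass 2: period-by-period cross-sectional slope of the predictors on phi.
    phic = phi - phi.mean()
    factor = (X - X.mean(axis=1, keepdims=True)) @ phic / (phic @ phic)
    # Pass 3: regression of the target on the estimated factor.
    fc = factor - factor.mean()
    beta = (fc @ yc) / (fc @ fc)
    return y.mean() + beta * fc
```

The first-pass slopes are proportional to the covariance of each predictor with the target, which matches the weighting described in the quoted passage. The estimated factor is then a single time series, so the final step is an ordinary univariate predictive regression.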

But, it must be added, this PLS procedure can be summarized in a single matrix formula, which is

Readers wanting definitions of these matrices should consult the Journal of Finance article and/or the white paper mentioned above.

The Kelly-Pruitt analysis works where other methods essentially fail – in OOS prediction,

Using data from 1930-2010, PLS forecasts based on the cross section of portfolio-level book-to-market ratios achieve an out-of-sample predictive R2 as high as 13.1% for annual market returns and 0.9% for monthly returns (in-sample R2 of 18.1% and 2.4%, respectively). Since we construct a single factor from the cross section, our results can be directly compared with univariate forecasts from the many alternative predictors that have been considered in the literature. In contrast to our results, previously studied predictors typically perform well in-sample but become insignificant out-of-sample, often performing worse than forecasts based on the historical mean return …

So, the bottom line is that aggregate stock market returns are predictable from a common-sense perspective, without recourse to abstruse error measures. And I believe Amit Goyal, whose earlier article with Welch contests market predictability, now agrees (personal communication) that this application of a PLS estimator breaks new ground out-of-sample – even though its complexity asks quite a bit from the data.

Note, though, how volatile aggregate realized returns for the US stock market are, and how forecast errors of the Kelly-Pruitt analysis become huge during the 2008-2009 recession and some previous recessions – indicated by the shaded lines in the above figure.

Still, something is better than nothing, and I look for improvements to this approach – which already has been applied to international stocks by Kelly and Pruitt and to other slices of portfolio data.

Predicting the Market Over Short Time Horizons

Google “average time a stock is held.” You will come up with figures that typically run around 20 seconds. High frequency trades (HFT) dominate trading volume on the US exchanges.

All of which suggests the focus on the predictability of stock returns needs to shift toward intervals lasting seconds or minutes, rather than daily, monthly, or longer trading periods.

So, it’s logical that Michael Rechenthin, a newly minted Iowa Ph.D., and Nick Street, a Professor of Management, are getting media face time from research which purportedly demonstrates the existence of predictable short-term trends in the market (see Using conditional probability to identify trends in intra-day high-frequency equity pricing).

Here’s the abstract –

By examining the conditional probabilities of price movements in a popular US stock over different high-frequency intra-day timespans, varying levels of trend predictability are identified. This study demonstrates the existence of predictable short-term trends in the market; understanding the probability of price movement can be useful to high-frequency traders. Price movement was examined in trade-by-trade (tick) data along with temporal timespans between 1 s to 30 min for 52 one-week periods for one highly-traded stock. We hypothesize that much of the initial predictability of trade-by-trade (tick) data is due to traditional market dynamics, or the bouncing of the price between the stock’s bid and ask. Only after timespans of between 5 to 10 s does this cease to explain the predictability; after this timespan, two consecutive movements in the same direction occur with higher probability than that of movements in the opposite direction. This pattern holds up to a one-minute interval, after which the strength of the pattern weakens.

The study examined price movements of the exchange traded fund SPY, during 2005, finding that

Of course, the challenges of generalization in this world of seconds and minutes are tremendous. Perhaps, for example, the patterns the authors identify are confined to the year of the study. Without any theoretical basis, brute force generalization means riffling through additional years of 31.5 million seconds each.
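The kind of conditional probability the abstract refers to can be tallied very simply. The sketch below estimates how often a price move is followed by a move in the same direction. Dropping intervals with no price change is a simplification of my own, and the function name is illustrative, so this should not be read as the authors’ procedure.

```python
import numpy as np

def continuation_probability(prices):
    """Share of non-zero price moves that go in the same direction as the
    previous non-zero move."""
    moves = np.sign(np.diff(np.asarray(prices, dtype=float)))
    moves = moves[moves != 0]          # drop intervals with no price change
    if len(moves) < 2:
        raise ValueError("need at least two non-zero price moves")
    return np.mean(moves[1:] == moves[:-1])
```

A value near zero is what pure bouncing between bid and ask would produce, and a value above one half indicates trending. Sampling the same price series at wider intervals, for example `prices[::k]`, gives a crude way to compare timespans.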

Then, there are the milliseconds, and the recent blockbuster written by Michael Lewis – Flash Boys: A Wall Street Revolt.

I’m on track for reading this book for a bookclub to which I belong.

As I understand it, Lewis, who is one of my favorite financial writers, has uncovered a story whereby high frequency traders, operating with optical fiber connections to the New York Stock Exchange, sometimes being geographically as proximate as possible, can exploit more conventional trading – basically buying a stock after you have put in a buy order, but before your transaction closes, thus raising your price if you made a market order.

The LA Times has a nice review of the book and ran the above photo of Lewis.

Stock Market Predictability – Controversy

In the previous post, I drew from papers by Christopher Neely, who is Vice President of the Federal Reserve Bank of St. Louis, David Rapach at St. Louis University, and Guofu Zhou at Washington University in St. Louis.

These authors contribute two papers on the predictability of equity returns.

The earlier one – Forecasting the Equity Risk Premium: The Role of Technical Indicators – is coming out in Management Science. The survey article – Forecasting Stock Returns – is a chapter in the recent volume 2 of the Handbook of Economic Forecasting.

I go through this rather laborious set of citations because it turns out that there is an underlying paper which provides the data for the research of these authors, but which comes to precisely the opposite conclusion –

The goal of our own article is to comprehensively re-examine the empirical evidence as of early 2006, evaluating each variable using the same methods (mostly, but not only, in linear models), time-periods, and estimation frequencies. The evidence suggests that most models are unstable or even spurious. Most models are no longer significant even in-sample (IS), and the few models that still are usually fail simple regression diagnostics. Most models have performed poorly for over 30 years IS. For many models, any earlier apparent statistical significance was often based exclusively on years up to and especially on the years of the Oil Shock of 1973–1975. Most models have poor out-of-sample (OOS) performance, but not in a way that merely suggests lower power than IS tests. They predict poorly late in the sample, not early in the sample. (For many variables, we have difficulty finding robust statistical significance even when they are examined only during their most favorable contiguous OOS sub-period.) Finally, the OOS performance is not only a useful model diagnostic for the IS regressions but also interesting in itself for an investor who had sought to use these models for market-timing. Our evidence suggests that the models would not have helped such an investor. Therefore, although it is possible to search for, to occasionally stumble upon, and then to defend some seemingly statistically significant models, we interpret our results to suggest that a healthy skepticism is appropriate when it comes to predicting the equity premium, at least as of early 2006. The models do not seem robust.

This is from Ivo Welch and Amit Goyal’s 2008 article A Comprehensive Look at The Empirical Performance of Equity Premium Prediction in the Review of Financial Studies which apparently won an award from that journal as the best paper for the year.

And, very importantly, the data for this whole discussion is available, with updates, from Amit Goyal’s site now at the University of Lausanne.

Where This Is Going

Currently, for me, this seems like a genuine controversy in the forecasting literature. And, as an aside, in writing this blog I’ve entertained the notion that maybe I am on the edge of a new form of or focus in journalism – namely stories about forecasting controversies. It’s kind of wonkish, but the issues can be really, really important.

I also have a “hands-on” philosophy when it comes to this sort of information. I would much rather explore actual data and run my own estimates than pick through theoretical arguments.

So anyway, given that Goyal generously provides updated versions of the data series he and Welch originally used in their Review of Financial Studies article, there should be some opportunity to check this whole matter. After all, the estimation issues are not very difficult, insofar as the first level of argument relates primarily to the efficacy of simple bivariate regressions.

By the way, it’s really cool data.

Here is the book-to-market ratio, dating back to 1926.

But beyond these simple regressions that form a large part of the argument, there is another claim made by Neely, Rapach, and Zhou which I take very seriously. And this is that – while a “kitchen sink” model with all, say, fourteen so-called macroeconomic variables does not outperform the benchmark, a principal components regression does.

This sounds really plausible.
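For anyone who wants to experiment with the idea, a principal components regression can be sketched as below. The predictors are standardized, the leading components are extracted with a singular value decomposition, and the target is regressed on those component scores. The function name and the default of three components are illustrative choices, not taken from the papers.

```python
import numpy as np

def pcr_fit_predict(X_train, y_train, x_new, n_components=3):
    """Fit a principal components regression and predict for one new row."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    mu = X_train.mean(axis=0)
    sd = X_train.std(axis=0, ddof=1)
    Z = (X_train - mu) / sd                       # standardized predictors
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_components].T
    scores = Z @ V
    design = np.column_stack([np.ones(len(scores)), scores])
    coefs, *_ = np.linalg.lstsq(design, y_train, rcond=None)
    z_new = (np.asarray(x_new, dtype=float) - mu) / sd
    return coefs[0] + (z_new @ V) @ coefs[1:]
```

Using as many components as there are predictors reproduces the “kitchen sink” regression. Using only a few forces the forecast to rely on the main common movements in the predictors, which is the usual argument for why it might hold up better out of sample.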

Anyway, if readers have flagged updates to this controversy about the predictability of stock market returns, let me know. In addition to grubbing around with the data, I am searching for additional analysis of this point.

Evidence of Stock Market Predictability

In business forecast applications, I often have been asked, “why don’t you forecast the stock market?” It’s almost a variant of “if you’re so smart, why aren’t you rich?” I usually respond with something about stock prices being largely random walks.

But, stock market predictability is really the nut kernel of forecasting, isn’t it?

Earlier this year, I looked at the S&P 500 index and the SPY ETF numbers, and found I could beat a buy and hold strategy with a regression forecasting model. This was an autoregressive model with lots of lagged values of daily S&P returns. In some variants, it included lagged values of the Chicago Board Options Exchange (CBOE) VIX volatility index returns. My portfolio gains were compiled over an out-of-sample (OS) period. This means, of course, that I estimated the predictive regression on historical data that preceded and did not include the OS or test data.

Well, today I’m here to report to you that it looks like it is officially possible to achieve some predictability of stock market returns in out-of-sample data.

One authoritative source is Forecasting Stock Returns, an outstanding review by Rapach and Zhou in the recent, second volume of the Handbook of Economic Forecasting.

The story is fascinating.

For one thing, most of the successful models achieve their best performance – in terms of beating market averages or other common benchmarks – during recessions.

And it appears that technical market indicators, such as the oscillators, momentum, and volume metrics so common in stock trading sites, have predictive value. So do a range of macroeconomic indicators.

But these two classes of predictors – technical market and macroeconomic indicators – are roughly complementary in their performance through the business cycle. As Christopher Neely et al. detail in Forecasting the Equity Risk Premium: The Role of Technical Indicators,

Macroeconomic variables typically fail to detect the decline in the actual equity risk premium early in recessions, but generally do detect the increase in the actual equity risk premium late in recessions. Technical indicators exhibit the opposite pattern: they pick up the decline in the actual premium early in recessions, but fail to match the unusually high premium late in recessions.

Stock Market Predictors – Macroeconomic and Technical Indicators

Rapach and Zhou highlight fourteen macroeconomic predictors popular in the finance literature.

1. Log dividend-price ratio (DP): log of a 12-month moving sum of dividends paid on the S&P 500 index minus the log of stock prices (S&P 500 index).

2. Log dividend yield (DY): log of a 12-month moving sum of dividends minus the log of lagged stock prices.

3. Log earnings-price ratio (EP): log of a 12-month moving sum of earnings on the S&P 500 index minus the log of stock prices.

4. Log dividend-payout ratio (DE): log of a 12-month moving sum of dividends minus the log of a 12-month moving sum of earnings.

5. Stock variance (SVAR): monthly sum of squared daily returns on the S&P 500 index.

6. Book-to-market ratio (BM): book-to-market value ratio for the DJIA.

7. Net equity expansion (NTIS): ratio of a 12-month moving sum of net equity issues by NYSE-listed stocks to the total end-of-year market capitalization of NYSE stocks.

8. Treasury bill rate (TBL): interest rate on a three-month Treasury bill (secondary market).

9. Long-term yield (LTY): long-term government bond yield.

10. Long-term return (LTR): return on long-term government bonds.

11. Term spread (TMS): long-term yield minus the Treasury bill rate.

12. Default yield spread (DFY): difference between BAA- and AAA-rated corporate bond yields.

13. Default return spread (DFR): long-term corporate bond return minus the long-term government bond return.

14. Inflation (INFL): calculated from the CPI (all urban consumers).

In addition, there are technical indicators, which are generally moving average, momentum, or volume-based.

The moving average indicators typically provide a buy or sell signal based on comparing two moving averages – a short and a long period MA.

Momentum based rules are based on the time trajectory of prices. A current stock price higher than its level some number of periods ago indicates “positive” momentum and expected excess returns, and generates a buy signal.

Momentum rules can be combined with information about the volume of stock purchases, such as Granville’s on-balance volume.
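The moving average and momentum rules just described can be written as simple 0/1 signals. In this sketch, 1 stands for buy and 0 for sell. The window lengths are placeholders, and the function names are mine rather than anything from the papers discussed here.

```python
import numpy as np

def moving_average_signal(prices, short=2, long=12):
    """1 where the short moving average is above the long one, else 0.

    One value per date from the first date with `long` prices available.
    Assumes short < long.
    """
    p = np.asarray(prices, dtype=float)
    csum = np.concatenate([[0.0], np.cumsum(p)])
    ma_short = (csum[short:] - csum[:-short]) / short
    ma_long = (csum[long:] - csum[:-long]) / long
    # Align both averages on the date where each window ends.
    return (ma_short[long - short:] > ma_long).astype(int)

def momentum_signal(prices, lookback=9):
    """1 where the current price is above its level `lookback` periods ago."""
    p = np.asarray(prices, dtype=float)
    return (p[lookback:] > p[:-lookback]).astype(int)
```

Either signal can then play the role of the lagged predictor in the same kind of bivariate regression used for the macroeconomic variables.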

Each of these predictors can be mapped onto equity premium excess returns – measured by the rate of return on the S&P 500 index net of return on a risk-free asset. This mapping is a simple bi-variate regression with equity returns from time t on the left side of the equation and the economic predictor lagged by one time period on the right side of the equation. Monthly data are used from 1927 to 2008. The out-of-sample (OS) period is extensive, dating from the 1950’s, and includes most of the post-war recessions.

The following table shows what the authors call out-of-sample (OS) R2 for the 14 so-called macroeconomic variables, based on a table in the Handbook of Economic Forecasting chapter. The OS R2 is equal to 1 minus a ratio. This ratio has the mean square forecast error (MSFE) of the predictor forecast in the numerator and the MSFE of the forecast based on historic average equity returns in the denominator. So if the economic indicator functions to improve the OS forecast of equity returns, the OS R2 is positive. If, on the other hand, the historic average trumps the economic indicator forecast, the OS R2 is negative.


Overall, most of the macro predictors in this list don’t make it. Thus, 12 of the 14 OS R2 statistics are negative in the second column of the Table, indicating that the predictive regression forecast has a higher MSFE than the historical average.
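The OS R2 defined above, one minus the ratio of the predictor’s MSFE to the MSFE of the historical average, can be computed with an expanding estimation window as in the following sketch. The function name and the `first_forecast` argument are illustrative, and details such as the exact start of the evaluation period are assumptions rather than the authors’ settings.

```python
import numpy as np

def oos_r2(y, x, first_forecast):
    """Out-of-sample R2 of a one-predictor forecast versus the historical mean.

    y[t] is the excess return in period t and x[t] the predictor observed in
    period t, so y[t] is forecast from x[t-1]. Forecasts are made for
    t = first_forecast, ..., len(y) - 1 using only data through t - 1.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    err_model, err_mean = [], []
    for t in range(first_forecast, len(y)):
        xs, ys = x[:t - 1], y[1:t]            # pairs (x[s-1], y[s]), s = 1..t-1
        xc = xs - xs.mean()
        slope = (xc @ (ys - ys.mean())) / (xc @ xc)
        intercept = ys.mean() - slope * xs.mean()
        err_model.append(y[t] - (intercept + slope * x[t - 1]))
        err_mean.append(y[t] - y[:t].mean())  # historical-average benchmark
    msfe_model = np.mean(np.square(err_model))
    msfe_mean = np.mean(np.square(err_mean))
    return 1.0 - msfe_model / msfe_mean
```

A positive value means the predictive regression beat the historical average over the evaluation period, and a negative value means it did worse.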

For the two predictors with a positive out-of-sample R2, the p-values reported in the brackets are greater than 0.10, so that these predictors do not display statistically significant out-of-sample performance at conventional levels.

Thus, the first two columns in this table, under “Overall”, support a skeptical view of the predictability of equity returns.

However, during recessions, the situation is different.

For several of the predictors, the R2 OS statistics move from being negative (and typically below -1%) during expansions to 1% or above during recessions. Furthermore, some of these R2 OS statistics are significant at conventional levels during recessions according to the p-values, despite the decreased number of available observations.

Now imposing restrictions on the regression coefficients substantially improves this forecast performance, as the lower panel (not shown) in this table shows.

Rapach and Zhou were coauthors of the study with Neely, published earlier as a working paper with the St. Louis Federal Reserve.

This working paper is where we get the interesting report about how technical factors add to the predictability of equity returns.

This table has the same headings for the columns as Table 3 above.

It shows out-of-sample forecasting results for several technical indicators, using basically the same dataset, for the overall OS period, for expansions, and recessions in this period dating from the 1950’s to 2008.

In fact, these technical indicators generally seem to do better than the 14 macroeconomic indicators.

Low OS R2

Even when these models perform their best, their reduction in mean square forecast error (MSFE) is only slight relative to the MSFE of the benchmark historic average return estimate.

This improved performance, however, can still achieve portfolio gains for investors, based on various trading rules, and, as both papers point out, investors can use the information in these forecasts to balance their portfolios, even when the underlying forecast equations are not statistically significant by conventional standards. Interesting argument, and I need to review it further to fully understand it.

In any case, my experience with an autoregressive model for the S&P 500 is that trading rules can be devised which produce portfolio gains over a buy and hold strategy, even when the R2 is on the order of 1 or a few percent. All you have to do is correctly predict the sign of the return on the following trading day, for instance, and doing this a little more than 50 percent of the time produces profits.
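A sign-based rule of this kind is easy to tally. The sketch below assumes one particular rule, namely hold the index on days when the forecast return is positive and stay out otherwise, and works with log returns so that they add up over time. The rule and the function name are illustrative assumptions, not a description of the trading rules behind my own results.

```python
import numpy as np

def sign_rule_summary(forecasts, realized):
    """Hit rate and cumulative log returns for a long-or-flat sign rule.

    Returns (hit_rate, strategy_total, buy_and_hold_total).
    """
    f = np.asarray(forecasts, dtype=float)
    r = np.asarray(realized, dtype=float)
    hit_rate = np.mean(np.sign(f) == np.sign(r))
    strategy = np.where(f > 0, r, 0.0)     # in the market only on forecast up days
    return hit_rate, strategy.sum(), r.sum()
```

Comparing the second and third numbers over an out-of-sample period shows whether the rule beat buy and hold before trading costs, which this sketch ignores.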

Rapach and Zhou, in fact, develop insights into how predictability of stock returns can be consistent with rational expectations – providing the relevant improvements in predictability are bounded to be low enough.

Some Thoughts

But, for the time being, I have one question.

That is why econometricians of the caliber of Rapach, Zhou, and Neeley persist in relying on tests of statistical significance which are predicated, in a strict sense, on the normality of the residuals of these financial return regressions.

I’ve looked at this some, and it seems the t-statistic is somewhat robust to violations of normality of the underlying error distribution of the regression. However, residuals of a regression on equity rates of return can be very non-normal with fat tails and generally some skewness. I keep wondering whether anyone has really looked at how this translates into tests of statistical significance, or whether what we see on this topic is mostly arm-waving.
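One way to look at this directly is a small Monte Carlo experiment. The sketch below is my own setup, not anything from the papers: it regresses fat-tailed Student t noise on an unrelated regressor many times and counts how often the usual t-test rejects a true zero slope at the nominal 5 percent level. The sample size, the degrees of freedom, and the approximate critical value of 1.984 are all assumptions, and the errors are symmetric, so skewness is not addressed.

```python
import numpy as np

def rejection_rate(n_obs=100, n_sims=2000, df=3, crit=1.984, seed=0):
    """Share of simulations in which a true zero slope is rejected.

    crit is the approximate two-sided 5% critical value for n_obs - 2 = 98
    degrees of freedom; change it if n_obs changes.
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        x = rng.standard_normal(n_obs)
        y = rng.standard_t(df, size=n_obs)          # fat-tailed errors, true slope is zero
        X = np.column_stack([np.ones(n_obs), x])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sigma2 = resid @ resid / (n_obs - 2)        # usual OLS error variance estimate
        se_slope = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
        if abs(beta[1] / se_slope) > crit:
            rejections += 1
    return rejections / n_sims

rate = rejection_rate()
print(f"empirical rejection rate at nominal 5%: {rate:.3f}")
```

If the t-statistic is robust in this setting, the empirical rate should stay close to 0.05. Swapping in a skewed error distribution or a smaller sample is the natural next experiment.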

For my money, OS predictive performance is the key criterion.

The Worst Bear Market in History – Guest Post

This is a fascinating case study of financial aberration, authored by Bryan Taylor, Ph.D., Chief Economist, Global Financial Data.

**********************************************************

Which country has the dubious distinction of suffering the worst bear market in history?

To answer this question, we ignore countries where the government closed down the stock exchange, leaving investors with nothing, as occurred in Russia in 1917 or Eastern European countries after World War II. We focus on stock markets that continued to operate during their equity-destroying disaster.

There is a lot of competition in this category.  Almost every major country has had a bear market in which share prices have dropped over 80%, and some countries have had drops of over 90%. The Dow Jones Industrial Average dropped 89% between 1929 and 1932, the Greek stock market fell 92.5% between 1999 and 2012, and adjusted for inflation, Germany’s stock market fell over 97% between 1918 and 1922.

The only consolation to investors is that the maximum loss on their investment is 100%, and one country almost achieved that dubious distinction. Cyprus holds the record for the worst bear market of all time in which investors have lost over 99% of their investment! Remember, this loss isn’t for one stock, but for all the shares listed on the stock exchange.

The Cyprus Stock Exchange All Share Index hit a high of 11443 on November 29, 1999, fell to 938 by October 25, 2004, a 91.8% drop.  The index then rallied back to 5518 by October 31, 2007 before dropping to 691 on March 6, 2009.  Another rally ensued to October 20, 2009 when the index hit 2100, but collapsed from there to 91 on October 24, 2013.  The chart below makes any roller-coaster ride look boring by comparison.

The fall from 11443 to 91 means that someone who invested at the top in 1999 would have lost 99.2% of their investment by 2013.  And remember, this is for ALL the shares listed on the Cyprus Stock Exchange.  By definition, some companies underperform the average and have done even worse, losing their shareholders everything.
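The percentages quoted here can be checked from the index levels alone. The short sketch below uses only the levels given in the text; the function name and leg labels are mine.

```python
def drawdown_pct(peak, trough):
    """Percentage decline from a peak index level to a later trough."""
    return 100.0 * (1.0 - trough / peak)

# Index levels as quoted in the text
legs = [
    ("1999 high to 2004 low", 11443, 938),
    ("2007 high to 2009 low", 5518, 691),
    ("2009 high to 2013 low", 2100, 91),
    ("1999 high to 2013 low", 11443, 91),
]
for label, peak, trough in legs:
    print(f"{label}: {drawdown_pct(peak, trough):.1f}% decline")
```

The first and last legs reproduce the 91.8% and 99.2% figures cited above.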

For the people in Cyprus, this achievement only adds insult to injury.  One year ago, in March 2013, Cyprus became the fifth Euro country to have its financial system rescued by a bail-out.  At its height, the banking system’s assets were nine times the island’s GDP. As was the case in Iceland, that situation was unsustainable.

Since Germany and other paymasters for Ireland, Portugal, Spain and Greece were tired of pouring money down the bail-out drain, they demanded not only the usual austerity and reforms to put the country on the right track, but they also imposed demands on the depositors of the banks that had created the crisis, creating a “bail-in”.

As a result of the bail-in, debt holders and uninsured depositors had to absorb bank losses. Although some deposits were converted into equity, given the decline in the stock market, this provided little consolation. Banks were closed for two weeks and capital controls were imposed upon Cyprus.  Not only did depositors with balances beyond the insured limit lose money, but all depositors were restricted from withdrawing their funds. The impact on the economy has been devastating. GDP has declined by 12%, and unemployment has gone from 4% to 17%.

On the positive side, when Cyprus finally does bounce back, large profits could be made by investors and speculators.  The Cyprus SE All-Share Index is up 50% so far in 2014, and could move up further. Of course, there is no guarantee that the October 2013 low will be the final low in the island’s fourteen-year bear market.  To coin a phrase, Cyprus is a nice place to visit, but you wouldn’t want to invest there.

Three Pass Regression Filter – New Data Reduction Method

Malcolm Gladwell’s 10,000 hour rule (for cognitive mastery) is sort of an inspiration for me. I picked forecasting as my field for “cognitive mastery,” as dubious as that might be. When I am directly engaged in an assignment, at some point or other, I feel the need for immersion in the data and in estimations of all types. This blog, on the other hand, represents an effort to survey and, to some extent, get control of new “tools” – at least in a first pass. Then, when I have problems at hand, I can try some of these new techniques.

Ok, so these remarks preface what you might call the humility of my approach to new methods currently being innovated. I am not putting myself on a level with the innovators, for example. At the same time, it’s important to retain perspective and not drop a critical stance.

The Working Paper and Article in the Journal of Finance

Probably one of the most widely-cited recent working papers is Kelly and Pruitt’s three pass regression filter (3PRF). The authors, shown above, are with the University of Chicago, Booth School of Business and the Federal Reserve Board of Governors, respectively, and judging from the extensive revisions to the 2011 version, they had a bit of trouble getting this one out of the skunk works.

Recently, however, Kelly and Pruitt published an important article in the prestigious Journal of Finance called Market Expectations in the Cross-Section of Present Values. This article applies a version of the three pass regression filter to show that returns and cash flow growth for the aggregate U.S. stock market are highly and robustly predictable.

I learned of a published application of the 3PRF from Francis X. Diebold’s blog, No Hesitations, where Diebold – one of the most published authorities on forecasting – writes:

Recent interesting work, moreover, extends PLS in powerful ways, as with the Kelly-Pruitt three-pass regression filter and its amazing apparent success in predicting aggregate equity returns.

What is the 3PRF?

The working paper from the Booth School of Business cited at a couple of points above describes what might be cast as a generalization of partial least squares (PLS). Certainly, the focus in the 3PRF and PLS is on using latent variables to predict some target.

I’m not sure, though, whether 3PRF is, in fact, more of a heuristic, rather than an algorithm.

What I mean is that the three pass regression filter involves a procedure, described below.

Here’s the basic idea –

Suppose you have a large number of potential regressors xi ∈ X, i = 1,..,N. In fact, it may be impossible to calculate an OLS regression, since N may exceed T, the number of observations or time periods.
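The reason OLS breaks down when N > T is that the matrix X'X, which OLS has to invert, cannot have rank above T. A quick numerical check with random data (the dimensions are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 40, 50                      # fewer observations than regressors
X = rng.standard_normal((T, N))

gram = X.T @ X                     # the N x N matrix OLS would need to invert
rank = np.linalg.matrix_rank(gram)
print(f"X'X is {N} x {N} but has rank {rank}, so it is singular")
```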

Furthermore, you have proxies zj ∈ Z, j = 1,..,L – where L is significantly less than the number of observations T. These proxies could be the first several principal components of the data matrix, or underlying drivers which theory proposes for the situation. The authors even suggest an automatic procedure for generating proxies in the paper.

And, finally, there is the target variable yt which is a column vector with T observations.

Latent factors in a matrix F drive both the proxies in Z and the predictors in X. Based on macroeconomic research into dynamic factors, there might be only a few of these latent factors – just as typically only a few principal components account for the bulk of variation in a data matrix.

Now here is a key point – as Kelly and Pruitt present the 3PRF, it is a leading indicator approach when applied to forecasting macroeconomic variables such as GDP, inflation, or the like. Thus, the time index for yt ranges from 2,3,…T+1, while the time indices of all X and Z variables and the factors range from 1,2,..T. This means really that all the x and z variables are potentially leading indicators, since they map conditions from an earlier time onto values of a target variable at a subsequent time.

What Table 1 above tells us to do is –

1. Run an ordinary least squares (OLS) time series regression of each xi in X on the zj in Z, where t ranges from 1 to T and there are N variables in X and L << T variables in Z. So, in the example discussed below, we concoct a spreadsheet example with 3 variables in Z, or three proxies, and 10 predictor variables xi in X (I could have used 50, but I wanted to see whether the method worked with lower dimensionality). The example assumes 40 periods, so t = 1,…,40. There will be 10 different sets of coefficients of the zj, one for each predictor, as a result of estimating these regressions, with 10 matched constant terms.
2. OK, then we take this stack of estimates of coefficients of the zj and map them onto the cross-sectional slices of X for t = 1,..,T. This means that, at each period t, the values of the cross-section, xi,t, are taken as the dependent variable, and the sets of coefficients estimated in the previous step, plus a constant, become the predictors.
3. Finally, we extract the estimate of the factor loadings which results, and use these in a regression with the target variable as the dependent variable.
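The three steps above can be sketched with NumPy as follows. This is my own minimal reading of the procedure, not the authors' Matlab code: the function names are hypothetical, the data are random placeholders, and y is assumed to be already aligned so that row t of y holds the target for the period after row t of X and Z. The second-pass slopes are what the steps above call the factor loadings.

```python
import numpy as np

def ols_slopes(y, regressors):
    """OLS of y on a constant plus regressors; returns the slopes only."""
    design = np.column_stack([np.ones(len(regressors)), regressors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]                      # drop the constant term

def three_pass_filter(X, Z, y):
    """X is T x N (predictors), Z is T x L (proxies), y has length T.

    Returns first-pass slopes (N x L), second-pass slopes (T x L),
    and the third-pass fitted values of the target.
    """
    T, N = X.shape
    # Pass 1: one time series regression per predictor on the proxies
    phi = np.vstack([ols_slopes(X[:, i], Z) for i in range(N)])
    # Pass 2: one cross-sectional regression per period on the pass 1 slopes
    F = np.vstack([ols_slopes(X[t, :], phi) for t in range(T)])
    # Pass 3: one time series regression of the target on the pass 2 slopes
    design = np.column_stack([np.ones(T), F])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return phi, F, design @ coef

# Placeholder data with the dimensions of the spreadsheet example
rng = np.random.default_rng(0)
T, N, L = 40, 10, 3
factors = rng.random((T, 2))
X = factors @ rng.normal(size=(2, N)) + rng.standard_normal((T, N))
Z = factors @ rng.normal(size=(2, L)) + rng.standard_normal((T, L))
y = factors @ np.array([0.5, -1.0])

phi, F, fitted = three_pass_filter(X, Z, y)
print(phi.shape, F.shape, fitted.shape)
```

Note that the constant from each regression is estimated but not carried forward, which matches the treatment of constants described for the spreadsheet version.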

This is tricky, and I have questions about the symbolism in Kelly and Pruitt’s papers, but the procedure they describe does work. There is some Matlab code here alongside the reference to this paper in Professor Kelly’s research.

At the same time, all this can be short-circuited (if you have adequate data without a lot of missing values, apparently) by a single humungous formula –

Here, the source is the 2012 paper.
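The formula itself has not survived in this text. As a hedged reconstruction, substituting the three OLS passes into one another gives fitted values of the form ŷ = ιȳ + J_T X J_N X' J_T Z (Z' J_T X J_N X' J_T X J_N X' J_T Z)^(-1) Z' J_T X J_N X' J_T y, where J_T and J_N are demeaning matrices (identity minus a matrix of averages). I believe this matches the closed form in the paper, but it should be checked against the paper itself. The sketch below evaluates that expression; the function names are mine.

```python
import numpy as np

def demean_matrix(n):
    """J_n = I - (1/n) * ones: multiplying a vector by J_n subtracts its mean."""
    return np.eye(n) - np.ones((n, n)) / n

def tprf_closed_form(X, Z, y):
    """One-step fitted values; X is T x N, Z is T x L, y has length T."""
    T, N = X.shape
    J_T, J_N = demean_matrix(T), demean_matrix(N)
    W = J_T @ X @ J_N @ X.T @ J_T        # T x T and symmetric
    A = W @ Z                            # equals J_T X J_N X' J_T Z
    # A'A equals the long middle term and A'y equals Z' J_T X J_N X' J_T y,
    # because J_T is symmetric and idempotent.
    return y.mean() + A @ np.linalg.solve(A.T @ A, A.T @ y)

# Placeholder data, for illustration only
rng = np.random.default_rng(1)
X = rng.standard_normal((40, 10))
Z = rng.standard_normal((40, 3))
y = rng.standard_normal(40)
print(tprf_closed_form(X, Z, y)[:5])
```

Written this way, the forecast is just the projection of the target on the L columns of A, which is why it is quick to compute even when N is large.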

Spreadsheets help me understand the structure of the underlying data and the order of calculation, even if, for the most part, I work with toy examples.

So recently, I’ve been working through the 3PRF with a small spreadsheet.

Generating the factors: I generated the factors as two columns of random variables (=rand()) in Excel. I gave the factors different magnitudes by multiplying by different constants.

Generating the proxies Z and predictors X. Kelly and Pruitt call for the predictors to be variance standardized, so I generated 40 observations on ten sets of xi by selecting ten different coefficients to multiply into the two factors, and in each case I added a normal error term with mean zero and standard deviation 1. In Excel, this is the formula =norminv(rand(),0,1).

Basically, I did the same drill for the three zj — I created 40 observations for z1, z2, and z3 by multiplying three different sets of coefficients into the two factors and added a normal error term with zero mean and variance equal to 1.

Then, finally, I created yt by multiplying randomly selected coefficients times the factors.
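For readers who prefer code to spreadsheets, the same data-generating recipe can be sketched in NumPy. The seed, the scale constants on the factors, and the range the coefficients are drawn from are arbitrary assumptions of mine, since the spreadsheet values are not listed here.

```python
import numpy as np

rng = np.random.default_rng(42)     # fixed seed so the toy data are reproducible
T, N, L, K = 40, 10, 3, 2           # periods, predictors, proxies, latent factors

# Two uniform(0, 1) factors, like =rand(), scaled to different magnitudes
factors = rng.random((T, K)) * np.array([1.0, 5.0])

# Predictors and proxies: factor combinations plus N(0, 1) noise,
# the counterpart of =norminv(rand(),0,1)
x_coefs = rng.uniform(-2.0, 2.0, size=(K, N))
z_coefs = rng.uniform(-2.0, 2.0, size=(K, L))
X = factors @ x_coefs + rng.standard_normal((T, N))
Z = factors @ z_coefs + rng.standard_normal((T, L))

# Target: randomly selected coefficients times the factors, with no added noise
y_coefs = rng.uniform(-2.0, 2.0, size=K)
y = factors @ y_coefs

# Optional: variance-standardize the predictors column by column
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
print(X.shape, Z.shape, y.shape)
```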

After generating the data, the first pass regression is easy. You just develop a regression with each predictor xi as the dependent variable and the three proxies as the independent variables, case-by-case, across the time series for each. This gives you a bunch of regression coefficients which, in turn, become the explanatory variables in the cross-sectional regressions of the second step.

The regression coefficients I calculated for the three proxies, including a constant term, were as follows – where the 1st row indicates the regression for x1 and so forth.

This second step is a little tricky, but you just take all the values of the predictor variables for a particular period and designate these as the dependent variables, with the constant and coefficients estimated in the previous step as the independent variables. Note, the number of predictors pairs up exactly with the number of rows in the above coefficient matrix.

This then gives you the factor loadings for the third step, where you can actually predict yt (really yt+1 in the 3PRF setup). The only wrinkle is you don’t use the constant terms estimated in the second step, on the grounds that these reflect “idiosyncratic” effects, according to the 2011 revision of the paper.

Note the authors describe this as a time series approach, but do not indicate how to get around some of the classic pitfalls of regression in a time series context. Obviously, first differencing might be necessary for nonstationary time series like GDP, and other data massaging might be in order.

Bottom line – this worked well in my first implementation.

To forecast, I just used the last regression for yt+1 and then added ten more cases, calculating new values for the target variable with the new values of the factors. I used the new values of the predictors to update the second step estimate of factor loadings, and applied the last third pass regression to these values.

Here are the forecast errors for these ten out-of-sample cases.

Not bad for a first implementation.

Why Is Three Pass Regression Important?

3PRF is a fairly “clean” solution to an important problem, relating to the issue of “many predictors” in macroeconomics and other business research.

Noting that if the predictors number near or more than the number of observations, the standard ordinary least squares (OLS) forecaster is known to be poorly behaved or nonexistent, the authors write,

How, then, does one effectively use vast predictive information? A solution well known in the economics literature views the data as generated from a model in which latent factors drive the systematic variation of both the forecast target, y, and the matrix of predictors, X. In this setting, the best prediction of y is infeasible since the factors are unobserved. As a result, a factor estimation step is required. The literature’s benchmark method extracts factors that are significant drivers of variation in X and then uses these to forecast y. Our procedure springs from the idea that the factors that are relevant to y may be a strict subset of all the factors driving X. Our method, called the three-pass regression filter (3PRF), selectively identifies only the subset of factors that influence the forecast target while discarding factors that are irrelevant for the target but that may be pervasive among predictors. The 3PRF has the advantage of being expressed in closed form and virtually instantaneous to compute.

So, there are several advantages, such as (1) the solution can be expressed in closed form (in fact as one complicated but easily computable matrix expression), and (2) there is no need to employ maximum likelihood estimation.

Furthermore, 3PRF may outperform other approaches, such as principal components regression or partial least squares.

The paper illustrates the forecasting performance of 3PRF with real-world examples (as well as simulations). The first relates to forecasts of macroeconomic variables using data such as from the Mark Watson database mentioned previously in this blog. The second application relates to predicting asset prices, based on a factor model that ties individual assets’ price-dividend ratios to aggregate stock market fluctuations in order to uncover investors’ discount rates and dividend growth expectations.

Simulating the SPDR SPY Index

Here is a simulation of the SPDR SPY exchange traded fund index, using an autoregressive model estimated with maximum likelihood methods, assuming the underlying distribution is not normal, but is instead a Student t distribution.

The underlying model is of the form

$$\mathrm{SPYRR}_t = a_0 + a_1\,\mathrm{SPYRR}_{t-1} + \dots + a_{30}\,\mathrm{SPYRR}_{t-30}$$

where SPYRR is the daily return (trading day to trading day) of the SPY, based on closing prices.

This is a linear model, and an earlier post lists its exact parameters or, in other words, the coefficients attached to each of the lagged terms, as well as the value of the constant term.

This model is estimated on a training sample of daily returns from 1993 to 2008, and is applied to out-of-sample data from 2008 to the present. It predicts about 53 percent of the signs of the next-day returns correctly. The model generates more profits in the 2008 to the present period than a Buy & Hold strategy.

The simulation listed above uses the model equation and parameters, generating a series of 4000 values recursively, adding in randomized error terms from the fit of the equation to the training or estimation data.

This is work-in-progress. Currently, I am thinking about how to properly incorporate volatility. Obviously, any number of realizations are possible. The chart shows one of them, which has an uncanny resemblance to the actual historical series, due to the fact that volatility is created over certain parts of the simulation, in this case by chance.

To review, I set in motion the following process:

1. Predict xt = f(xt-1,..,xt-30) based on the 30 coefficients and a constant term from the autoregressive model, applied to the 30 preceding values of xt generated by this process (the process is initialized with the first 30 actual values of the test data).
2. Randomly select a residual for this xt based on the empirical distribution of errors from the fit of the predictive relationship to the training set.
3. Iterate.
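The process above can be sketched as a short recursive loop. The coefficients, the residual pool, and the starting values below are random placeholders, not the parameters of the model from the earlier post, and turning the simulated returns into a price path by compounding simple returns is my assumption.

```python
import numpy as np

def simulate_ar(const, coefs, residual_pool, initial_values, n_steps, seed=None):
    """Simulate x_t = const + sum_k coefs[k-1] * x_{t-k} + e_t recursively,
    drawing each e_t at random, with replacement, from residual_pool."""
    rng = np.random.default_rng(seed)
    coefs = np.asarray(coefs, dtype=float)
    residual_pool = np.asarray(residual_pool, dtype=float)
    p = len(coefs)
    history = [float(v) for v in initial_values][-p:]
    if len(history) < p:
        raise ValueError("need at least as many initial values as lags")
    simulated = []
    for _ in range(n_steps):
        lags = np.array(history[-p:][::-1])      # most recent value first
        x_next = const + coefs @ lags + rng.choice(residual_pool)
        history.append(x_next)
        simulated.append(x_next)
    return np.array(simulated)

# Placeholder inputs, for illustration only
rng = np.random.default_rng(7)
hypothetical_coefs = rng.normal(0.0, 0.02, size=30)     # stand-in for a1..a30
residual_pool = rng.laplace(0.0, 0.008, size=2000)      # stand-in for training residuals
initial_values = rng.laplace(0.0, 0.008, size=30)       # stand-in for the first 30 returns

returns = simulate_ar(0.0003, hypothetical_coefs, residual_pool,
                      initial_values, n_steps=4000, seed=11)
prices = 100.0 * np.cumprod(1.0 + returns)              # compound returns into a price path
print(prices[-1])
```

Drawing residuals from the empirical pool rather than from a fitted normal keeps the peaked, fat-tailed shape of the errors in the simulated series. Each seed gives a different realization.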

The error distribution looks like this.

This is obviously not a normal distribution, since “too many” predictive errors are concentrated around the zero error line.

For puzzles and problems, this is a fertile area for research, and you can make money. But obviously, be careful.

In any case, I think this research, in an ultimate analysis, converges to the work being done by Didier Sornette and his co-researchers and co-authors. Sornette et al develop an approach through differential equations, focusing on critical points where a phase shift occurs in trading with a rapid collapse of an asset bubble.

This approach comes at similar, semi-periodic, logarithmically increasing values through linear autoregressive equations, which, as is well known, have complex dynamics when analyzed as difference equations.

The prejudice in economics and econometrics that “you can’t predict the stock market” is an impediment to integrating these methods.

While my research on modeling stock prices is a by-product of my general interest in forecasting and quantitative techniques, I may have an advantage because I will try stuff that more seasoned financial analysts may avoid, because they have been told it does not work.

So I maintain it is possible, at least in the era of quantitative easing (QE), to profit from autoregressive models of daily returns on a major index like the SPY. The models are, admittedly, weak predictors, but they interact with the weird error structure of SPY daily returns in interesting ways. And, furthermore, it is possible for anyone to verify my claims simply by calculating the predictions for the test period from 2008 to the present and then looking at what a Buy & Hold Strategy would have done over the same period.

In this post, I reverse the process. I take one of my autoregressive models and generate, by simulation, time series that look like historical SPY daily values.

On Sornette, about whom I think we will be hearing more, since currently the US stock market seems to be in correction mode, see – Turbulent times ahead: Q&A with economist Didier Sornette. Also check http://www.er.ethz.ch/presentations/index.