# An Update on Bitcoin

Fairly humdrum days of reading articles on testing for unit roots in time series led to the discovery of an extraordinary new forecasting approach – using the future to predict the present.

Since virtually the only empirical application of the new technique is predicting bubbles in Bitcoin values, I include some of the recent news about Bitcoins at the end of the post.

Noncausal Autoregressive Models

I think you have to describe the forecasting approach recently considered by Lanne and Saikkonen, as well as Hencic, Gouriéroux and others, as “exciting,” even “sexy” in a Saturday Night Live sort of way.

Here is a brief description from a 2015 article in Econometrics of Risk called “Noncausal Autoregressive Model in Application to Bitcoin/USD Exchange Rates.”

I’ve always been a little behind the curve on lag operators, but basically $\Phi(L)$ is a polynomial in the standard lag operator, which reaches back to past periods, while $\Psi(L^{-1})$ is a second polynomial in the inverse of the lag operator, which reaches forward to future time periods.

To give an example, consider,

$$y_t = k_1 y_{t-1} + s_1 y_{t+1} + e_t$$

where subscripts t indicate time period.

In other words, the current value of the variable y is related to its immediately past value, and also to its future value, with an error term e being included.

This is what I mean by the future being used to predict the present.
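To make the idea concrete, here is a minimal simulation sketch in Python. It assumes the multiplicative form $(1 - \phi L)(1 - \psi L^{-1}) y_t = e_t$ with standard Cauchy errors, and the names `simulate_mixed_ar`, `phi`, and `psi` are illustrative choices of mine, not taken from the article. Expanding the product gives $(1 + \phi\psi) y_t = \phi y_{t-1} + \psi y_{t+1} + e_t$, which has the same shape as the example equation once you divide through by $(1 + \phi\psi)$.

```python
import math
import random


def simulate_mixed_ar(n, phi, psi, seed=0):
    """Simulate (1 - phi*L)(1 - psi*L^-1) y_t = e_t with Cauchy errors.

    phi weights the past (causal part) and psi weights the future
    (noncausal part); both are assumed to lie strictly inside (-1, 1).
    The sample is truncated: values beyond either end are treated as zero.
    """
    rng = random.Random(seed)
    # Standard Cauchy draws: fat-tailed, with infinite variance.
    e = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]

    # Noncausal component, built backward in time: u_t = psi*u_{t+1} + e_t.
    u = [0.0] * n
    u[-1] = e[-1]
    for t in range(n - 2, -1, -1):
        u[t] = psi * u[t + 1] + e[t]

    # Causal component, built forward in time: y_t = phi*y_{t-1} + u_t.
    y = [0.0] * n
    y[0] = u[0]
    for t in range(1, n):
        y[t] = phi * y[t - 1] + u[t]
    return y, e


y, e = simulate_mixed_ar(n=500, phi=0.3, psi=0.7)
```

Plotting `y` from a run like this typically shows long quiet stretches interrupted by spikes that build up before they collapse, which is the bubble-like behavior that makes the noncausal form attractive.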

Ordinarily in forecasting, one would consider such models rather fruitless. After all, you are trying to forecast y for period t+1, so how can you include this variable in the drivers for the forecasting setup?

But the surprising thing is that it is possible to estimate a relationship like this on historic data, and then take the estimated parameters and develop simulations which lead to predictions at the event horizon, of, say, the next period’s value of y.

This is explained in the paragraph following the one cited above –

In other words, because $e_t$ in equation (1) can have infinite variance, it is definitely not normally distributed; that is, it does not follow a Gaussian probability distribution.

This is fascinating, since many financial time series are associated with nonGaussian error generating processes – distributions with fat tails that often are leptokurtic.

I recommend the Hencic and Gouriéroux article as a good read, as well as interesting analytics.

The authors proposed that a stationary time series is overlaid by explosive speculative periods, and that something can be abstracted in common from the structure of these speculative excesses.

Mt. Gox, of course, mentioned in this article, was raided in 2013 by Japanese authorities, after losses of more than \$465 million from Bitcoin holders.

Anyway, the bottom line is that I really, really like a forecast methodology based on recognition that data come from nonGaussian processes, and am intrigued by the fact that the ability to forecast with noncausal AR models depends on the error process being nonGaussian.

# Recent Events in the US Stock Market

The recent drop in US stocks is dramatic, as the steep falloff of the SPY exchange traded fund (ETF) on Monday, August 24th – almost the most recent action in the chart – shows.

At the same time, this is by no means the steepest drop in closing prices, as the following chart of daily returns highlights.

TV commentators and others point to China and the prospective liftoff of US short term interest rates, with the Federal Reserve finally raising rates off the zero bound in – it was thought – September.

I have been impressed by the accuracy of Michael Pettis’ predictions in his China Financial Markets. Pettis has warned about a debt bubble in China for two years and consistently makes other correct calls. I have some first-hand experience doing business in China, and plan a longer post on the collapse of Chinese stock markets and the economic slowdown there.

You can imagine, if you will, a sort of global input-output table with a corresponding table of import/export flows. China has gotten a lot bigger since 2008-2009, absorbing significant amounts of the global output of iron and steel, oil, and other commodities.

Also, in 2008-2009 and in the earlier recession of 2001, China led the way to greater spending, buoying the global economy which, otherwise, was in sad shape. That’s not going to happen this time, if a real recession takes hold.

All very scary, but while the latest stuff took place, this is what I was doing.

In other words, I was the father of the groom at a splendid wedding for my younger son at the Pearl Buck estate just outside Philadelphia.

I also apologize for having the tools to predict the current downturn, at least after developments later last week, and not signaling readers.

But frankly, I’m not sure the extreme value prediction algorithms (EVPA) reliably predict major turning points. In fact, there seem to be outside influences at key junctures. However, once a correction is underway, predictability returns. Thus, the algorithms do more than simply forecast the growth in stock prices. The EVPA also works to predict the extent of downturns.

Here’s a tip. Start watching two ratios: the difference between a trading day’s opening price and the previous day’s high, divided by the previous day’s high, and the difference between the opening price and the previous day’s low, divided by the previous day’s low. These are very significant predictors of the change in daily highs and lows, and they have significance for changes in closing prices, if you bring some data analytics to bear.
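As a minimal sketch of those two ratios in Python (the function name `opening_ratios` and the prices are hypothetical, chosen only for illustration):

```python
def opening_ratios(open_today, prev_high, prev_low):
    """Return the opening price's gap to the previous high and low, as fractions."""
    to_high = (open_today - prev_high) / prev_high
    to_low = (open_today - prev_low) / prev_low
    return to_high, to_low


# Hypothetical prices, for illustration only.
to_high, to_low = opening_ratios(open_today=198.0, prev_high=200.0, prev_low=196.0)
print(f"open vs previous high: {to_high:+.4f}")
print(f"open vs previous low:  {to_low:+.4f}")
```

A negative first ratio means the day opened below the prior day’s high; a positive second ratio means it opened above the prior day’s low.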

# Monday Morning Stock Forecasts May 18 – Highs and Lows for SPY, QQQ, GE, and MSFT

Look at it this way. There are lots of business and finance blogs, but how many provide real-time forecasts, along with updates on how prior predictions performed?

Here on BusinessForecastBlog – we roll out forecasts of the highs and lows of a growing list of securities for the coming week on Monday morning, along with an update on past performance.

It’s a good discipline, if you think you have discovered a pattern which captures some part of the variation in future values of a variable. Do the backtesting, but also predict in real-time. It’s very unforgiving.

Here is today’s forecast, along with a recap for last week.

There is an inevitable tendency to narrate these in “Nightly Business Report” fashion.

So the highs are going higher this week, and so are the lows, except perhaps for a slight drop in Microsoft’s low – almost within the margin of statistical noise. Not only that, but predicted increases in the high for QQQ are fairly substantial.

Last week’s forecasts were solid, in terms of forecast error, except Microsoft’s high came in above what was forecast. Still, a -2.6 percent error is within the range of variation in the backtests for this security. Recall, too, that in the previous week, the forecast error for the high of MSFT was only 0.04 percent, almost spot on.

Since the market moved sideways for many securities, No Change forecasts were a strong competitor to the NPV (new proximity variable) forecasts. In fact, there was a 50:50 split. In two of the four cases, the NPV forecasts performed better; in the other two, No Change forecasts had lower errors.

Direction of change predictions also came in about 50:50. They were correct for QQQ and SPY, and wrong for the two stocks.
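The three comparisons used above (percent error, performance against No Change, and direction of change) can be sketched in a few lines of Python. This is only an illustration: `score_forecast` is a hypothetical helper, the numbers are made up, and the sign convention (forecast minus actual, so a high that comes in above the forecast gives a negative error) is inferred from the Microsoft example.

```python
def score_forecast(prev_actual, forecast, actual):
    """Compare a forecast of a weekly high with the no-change benchmark."""
    pct_error = 100.0 * (forecast - actual) / actual
    # The no-change forecast simply repeats last week's actual value.
    no_change_pct_error = 100.0 * (prev_actual - actual) / actual
    beats_no_change = abs(pct_error) < abs(no_change_pct_error)
    # Direction is right when forecast and actual moved the same way
    # relative to last week's value.
    direction_correct = (forecast - prev_actual) * (actual - prev_actual) > 0
    return pct_error, beats_no_change, direction_correct


# Made-up numbers, for illustration only.
pct_error, beats, direction_ok = score_forecast(
    prev_actual=100.0, forecast=101.5, actual=102.0
)
print(f"percent error: {pct_error:.2f}")
print(f"beats no change: {beats}, direction correct: {direction_ok}")
```

Note that a No Change forecast never scores on direction, since it predicts no movement at all.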

Where is the Market Going?

This tool – forecasts based on the NPV algorithms – provides longer-term looks into the future, probably effectively up to one month ahead.

So in two weeks, I’m going to add that forecast to the mix. I think it may be important, incidentally, to conform to the standard practice of taking stock at the beginning of the month, rather than, say, simply going out four weeks from today.

To preview the power of this monthly NPV model, here are the backtests for the months before and during the 2008 financial crisis.

This is a remarkable performance, really. Once the crash really gets underway in late Summer-Fall 2008, the NPV forecast drops in a straight-line descent, as do the actual monthly highs. There are some turning points in common, too, between the two series. And generally, even at the start of the process, the monthly NPV model provides good guidance as to the direction and magnitude of changes.

Over the next two weeks, I’m collecting high frequency data to see whether I can improve these forecasts with supplemental information – such as interest rate spreads and other variables available on a weekly or monthly basis.

In closing, let me plug Barry Eichengreen’s article in Project Syndicate, An Economics to Fit the Facts.

Eichengreen writes,

While older members of the economics establishment continue to debate the merits of competing analytical frameworks, younger economists are bringing to bear important new evidence about how the economy operates.

It’s all about dealing with the wealth of data that is being collected everywhere now, and much less about theoretical disputes involving formal models.

Finally, it’s always necessary to insert a disclaimer, whenever one provides real-time, actionable forecasts. This stuff is for informational and scientific purposes only. It is not intended to provide recommendations for specific stock trading, and what you do on that score is strictly your own business and responsibility.

# How Did This Week’s Forecasts of QQQ, SPY, GE, and MSFT High Prices Do?

The following Table provides an update for this week’s forecasts of weekly highs for the securities currently being followed – QQQ, SPY, GE, and MSFT. Price forecasts and actual numbers are in US dollars.

This batch of forecasts performed extremely well in terms of the absolute size of forecast errors. In addition, the forecasts beat a “no change” forecast in three out of four predictions (the exception being SPY) and correctly called the change in direction of the high for QQQ.

It would be nice to be able to forecast the high prices for five-day-forward periods with the accuracy seen in the Microsoft (MSFT) forecast.

As all you market mavens know, US stock markets experienced a lot of declines in prices this week, so the highs for the week occurred Monday.

I’ve had several questions about the future direction of the market. Are declines going to be in the picture for the coming week, and even longer, for example?

I’ve been studying the capabilities of these algorithms to predict turning points in indexes and prices of individual securities. The answer is going to be probabilistic, and so is complicated. Sometimes the algorithm seems to provide pretty unambiguous signals as to turning points. In other instances, the tea leaves are harder to read, but, arguably, a signal does exist for most major turning points with the indexes I have focused on – SPY, QQQ, and the S&P 500.

So, the next question is – has the market hit a high for a week or a few weeks, or even perhaps a major turnaround?

Deploying these algorithms, coded in Visual Basic and C#, to attack this question is a little like moving a siege engine to the castle wall. A major undertaking.

I want to get there, but don’t want to be a “Chicken Little” saying “the sky is falling,” “the sky is falling.”

Stock Market Predictability

This little Monday morning exercise, which will be continued for the next several weeks, is providing evidence for the predictability of aspects of stock prices on a short term basis.

Once the basic facts are out there for everyone to see, a lot of questions arise. So what about new information? Surely yesterday’s open, high, low, and closing prices, along with similar information for previous days, do not encode an event like 9/11, or the revelation of massive accounting fraud with a stock issuing concern.

But apart from such surprises, I’m leaning to the notion that a lot more information about the general economy, company prospects and performance, and so forth is subtly embedded in the flow of price data.

I talked recently with an analyst who is applying methods from Kelly and Pruitt’s Market Expectations in the Cross Section of Present Values for wealth management clients. I hope to soon provide an “in-depth” on this type of applied stock market forecasting model, which focuses, incidentally, on stock market returns and dividends.

There is also some compelling research on the performance of momentum trading strategies which seems to indicate a higher level of predictability in stock prices than is commonly thought to exist.

Incidentally, in posting this slightly before the bell today, Friday, I am engaging in intra-day forecasting – betting that prices for these securities will stay below their earlier highs.

# Forecasts of High Prices for Week May 4-8 – QQQ, SPY, GE, and MSFT

Here are forecasts of high prices for key securities for this week, May 4-8, along with updates to check the accuracy of previous forecasts. So far, there is a new security each week. This week it is Microsoft (MSFT).

These forecasts from the new proximity variable (NPV) algorithms compete with the “no change” forecast – supposedly the optimal predictions for a random walk.

The NPV forecasts in the Table are more accurate than no change forecasts at 2:1 odds. That is, if you take into account the highs of the previous weeks for each security – actual high numbers not shown in the Table – the NPV forecasts are more accurate 4 out of 6 times.

This performance corresponds roughly with the improvements of the NPV approach over the no change forecasts in backtests back to 2003.

The advantages of the NPV approach extend beyond raw accuracy, measured here in simple percent terms, since the “no change” forecast is uninformative about the direction of change. The NPV forecasts, on the other hand, generally get the direction of change right. In the Table above, again considering data from weeks preceding those shown, the direction of change of the high forecasts is spot on every time. Backtests suggest the NPV algorithm will correctly predict the direction of change of the high price about 75 percent of the time for this five day interval.

It will be interesting to watch QQQ in this batch of forecasts. This ETF is forecast to decline week-over-week in terms of the high price.

Next week I plan to expand the forecast table to include forecasts of the low prices.

There is a lot of information here. Much of the finance literature focuses on the rates of returns based on closing prices, or adjusted closing prices. Perhaps analysts figure that attempting to predict “extreme values” is not a promising idea. Nothing could be further from the truth.

This week I plan a post showing how to identify turning points in the movement of major indices with the NPV algorithms. The concept is simple. I forecast the high and low over coming periods, like a day, five days, ten trading days and so forth. For these “nested forecast periods” the high for the week ahead must be greater than or equal to the high for tomorrow or shorter periods. This means when the price of the SPY or QQQ heads south, the predictions of the high of these ETFs sort of freeze at a constant value. The predictions for the low, however, plummet.
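The nesting property is easy to see on realized prices. The sketch below is a simplification that uses a single made-up daily price series rather than separate daily highs and lows, and `forward_extremes` is a hypothetical helper, not part of the NPV algorithms. It shows that in a steady decline the high over every horizon stays pinned at the starting value while the low keeps dropping as the horizon lengthens.

```python
def forward_extremes(prices, start, horizons):
    """High and low of `prices` over each horizon beginning at `start`."""
    out = {}
    for h in horizons:
        window = prices[start:start + h]
        out[h] = (max(window), min(window))
    return out


# Hypothetical steadily falling daily prices, for illustration only.
falling = [100.0 - 0.8 * d for d in range(10)]
for h, (hi, lo) in forward_extremes(falling, 0, [1, 5, 10]).items():
    print(f"{h:>2}-day horizon: high={hi:.1f} low={lo:.1f}")
```

Because a longer window always contains the shorter ones, its high can never be lower, and its low can never be higher, than theirs.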

Really pretty straightforward.

I’ve appreciated and benefitted from your questions, comments, and suggestions. Keep them coming.

# Stock Market Predictability

The research findings in recent posts here suggest that, in broad outline, the stock market is predictable.

This is one of the most intensively researched areas of financial econometrics.

There certainly is no shortage of studies claiming to forecast stock prices. See for example, Atsalakis, G., and K. Valavanis. “Surveying stock market forecasting techniques-part i: Conventional methods.” Journal of Computational Optimization in Economics and Finance 2.1 (2010): 45-92.

But the field is dominated by decades-long controversy over the efficient market hypothesis (EMH).

I’ve been reading Lim and Brooks’ outstanding survey article – The Evolution of Stock Market Efficiency Over Time: A Survey of the Empirical Literature.

They highlight two types of studies focusing on the validity of a weak form of the EMH which asserts that security prices fully reflect all information contained in the past price history of the market…

The first strand of studies, which is the focus of our survey, tests the predictability of security returns on the basis of past price changes. More specifically, previous studies in this sub-category employ a wide array of statistical tests to detect different types of deviations from a random walk in financial time series, such as linear serial correlations, unit root, low-dimensional chaos, nonlinear serial dependence and long memory. The second group of studies examines the profitability of trading strategies based on past returns, such as technical trading rules (see the survey paper by Park and Irwin, 2007), momentum and contrarian strategies (see references cited in Chou et al., 2007).

Another line, related to this second branch of research tests.. return predictability using other variables such as the dividend–price ratio, earnings–price ratio, book-to-market ratio and various measures of the interest rates.

Lim and Brooks note the tests for the semi-strong-form and strong-form EMH are renamed as event studies and tests for private information, respectively.

So bottom line – maybe your forecasting model predicts stock prices or rates of return over certain periods, but the real issue is whether it makes money. As Granger wrote much earlier, mere forecastability is not enough.

I certainly respect this criterion, and recognize it is challenging. It may be possible to trade on the models of high and low stock prices over periods such as I have been discussing, but I can also show you situations in which the irreducibly stochastic elements in the predictions can lead to losses. And avoiding these losses puts you into the field of higher frequency trading, where “all bets are off,” since there is so much that is not known about how that really works, particularly for individual investors.

My primary purpose, however, in pursuing these types of models was originally not so much trading (although that is seductive), but exploring new ways of forecasting turning points in economic time series. Confronted with the dismal record of macroeconomic forecasters, for example, one can see that predicting turning points is a truly fundamental problem. And this is true, I hardly need to add, for practical business forecasts. Your sales may do well – and exponential smoothing models may suffice – until the next phase of the business cycle, and so forth.

So I am amazed by the robustness of the turning point predictions from the longer (30 trading days, 40 days, etc.) groupings.

I just have never myself developed or probably even seen an example of predicting turning points as clearly as the one I presented in the previous post relating to the Hong Kong Hang Seng Index.

A Simple Example of Stock Market Predictability

Again, without claims as to whether it will help you make money, I want to close this post today with comments about another area of stock price predictability – perhaps even simpler and more basic than relationships regarding the high and low stock price over various periods.

This is an exercise you can try for yourself in a few minutes, and which leads to remarkable predictive relationships which I do not find easy to identify or track in the existing literature regarding stock market predictability.

First, download the Yahoo Finance historical data for SPY, the ETF mirroring the S&P 500. This gives you a spreadsheet with approximately 5530 trading day values for the open, high, low, close, volume, and adjusted close. Sort from oldest to most recent. Then calculate trading-day over trading-day growth rates, for the opening prices and then the closing prices. Then, set up a data structure associating the opening price growth for day t with the closing price growth for day t-1. In other words, lag the growth in the closing prices.

Then, calculate the OLS regression of the growth in opening prices on the lagged growth in closing prices.

You should get something like,

This is, of course, an Excel package regression output. It indicates that X Variable 1, which is the lagged growth in the closing prices, is highly significant as an explanatory variable, although the intercept or constant is not.

This equation explains about 21 percent of the variation in the growth data for the opening prices.

It also successfully predicts the direction of change of the opening price about 65 percent of the time, or considerably better than chance.
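If you would rather not use a spreadsheet, the same exercise can be sketched in plain Python. This is a minimal illustration, not the Excel computation itself: it assumes `opens` and `closes` are lists of prices already sorted from oldest to most recent, and the function names are my own. The direction hit rate here is computed in-sample, from the fitted values.

```python
def growth_rates(prices):
    """Day-over-day growth rates: p[t] / p[t-1] - 1."""
    return [prices[t] / prices[t - 1] - 1.0 for t in range(1, len(prices))]


def ols_simple(x, y):
    """One-regressor OLS; returns (intercept, slope, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = (sxy * sxy) / (sxx * syy)
    return intercept, slope, r_squared


def open_on_lagged_close(opens, closes):
    """Regress open growth on day t against close growth on day t-1."""
    g_open = growth_rates(opens)
    g_close = growth_rates(closes)
    y = g_open[1:]      # open growth, starting one day later
    x = g_close[:-1]    # close growth, lagged one day
    intercept, slope, r2 = ols_simple(x, y)
    fitted = [intercept + slope * v for v in x]
    # Share of days on which the fitted and actual growth have the same sign.
    hits = sum((f > 0) == (a > 0) for f, a in zip(fitted, y))
    return intercept, slope, r2, hits / len(y)


# Toy input, for illustration only: each open equals the prior day's close,
# so the fit is perfect by construction. Real data will be much noisier.
closes = [100.0, 101.0, 99.0, 102.0, 103.0, 101.0, 104.0]
opens = [99.5] + closes[:-1]
print(open_on_lagged_close(opens, closes))
```

To reproduce the regression above, replace the toy lists with the open and close columns from the downloaded SPY history.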

Not only that, but the two and three-period growth in the closing prices are successful predictors of the two and three-period growth in the opening prices.

And it probably is possible to improve the predictive performance of these equations by autocorrelation adjustments.

Why present the above example? Well, because I want to establish credibility on the point that there are clearly predictable aspects of stock prices, and ones you perhaps have not heard of heretofore.

The finance literature on stock market prediction and properties of stock market returns, not to mention volatility, is one of the most beautiful and complex technical literatures I know of.

But, still, I think new and important relationships can be discovered.

Whether this leads to profit-making is another question. And really, the standards have risen significantly in recent times, with program and high frequency trading possibly snatching profit opportunities from traders at the last microsecond.

So I think the more important point, from a policy standpoint if nothing else, may be whether it is possible to predict turning points – to predict broader movements of stock prices within which high frequency trading may be pushing the boundary.

# Analysis of Highs and Lows of the Hong Kong Hang Seng Index, 1987 to the Present

I have discovered a fundamental feature of stock market prices, relating to prediction of the highs and lows in daily, weekly, and monthly periods, and in other more arbitrary groupings of trading days in consecutive blocks.

What I have found is a degree of predictability previously unimagined with respect to forecasts of the high and low for a range of trading periods, extending from daily to 60 days so far.

Currently, I am writing up this research for journal submission, but I am documenting essential features of my findings on this blog.

A few days ago, I posted about the predictability of daily highs and lows for the SPY exchange traded fund. Subsequent posts highlight the generality of the result for the SPY, and more recently, for stocks such as common stock of the Ford Motor Company.

These posts present various graphs illustrating how well the prediction models for the high and low in periods capture the direction of change of the actual highs and lows. Generally, the models are right about 70 to 80 percent of the time, which is incredible.

Furthermore, since one of my long-standing concerns has been to get better forward perspective on turning points, I am particularly interested in the evidence that these models also do fairly well at predicting turning points.

Finally, it is easy to show that these predictive models for the highs and lows of stocks and stock indices over various periods are not simply creations of modern program trading. The same regularities can be identified in earlier periods before easy access to computational power, in the 1980’s and early 1990’s, for example.

Hong Kong’s Hang Seng Index

Today, I want to reach out and look at international data and present findings for Hong Kong’s Hang Seng Index. I suspect Chinese investors will be interested in these results. Perhaps, releasing this information to such an active community of traders will test my hypothesis that these are self-fulfilling predictions, to a degree, and knowledge of their existence intensifies their predictive power.

A few facts about the Hang Seng Index – The Hang Seng Index (HSI) is a free-float adjusted, capitalization-weighted index of approximately 40 of the larger companies on the Hong Kong exchange. First published in 1969, the HSI, according to Investopedia, covers approximately 65% of the total market capitalization of the Hong Kong Stock Exchange. It is currently maintained by HSI Services Limited, a wholly owned subsidiary of Hang Seng Bank – the largest bank registered and listed in Hong Kong in terms of market capitalization.

For data, I download daily open, high, low, close and other metrics from Yahoo Finance. This data begins with the last day in 1986, continuing to the present.

The Hang Seng is a volatile index, as the following chart illustrates.

Now there are peculiarities about the data on HSI from Yahoo. Trading volumes are zero until 2001, for example, after which time large positive values are to be found in the volume column. Initially, I assume HSI was a pure index and later came to be actually traded in some fashion.

Nevertheless, the same type of predictive models can be developed for the Hang Seng Index, as can be estimated for the SPY and the US stocks.

Again, the key variables in these predictive relationships are the proximity of the period opening price to the previous period high and the previous period low. I estimate regressions with variables constructed from these explanatory variables, mapping them onto growth in period-by-period highs with ordinary least squares (OLS). I find similar relationships for the Hang Seng in, say, a 30 day periodization as I estimate for the SPY ETF. At the same time there are differences, one of the most notable being significantly less first order autocorrelation in the Hang Seng regression.

Essentially, higher growth rates for the period-over-previous-period high are predicted whenever the opening price of the current period is greater than the high of the previous period. There are other cases, however, and ultimately the rule is quantitative, taking into account the size of the growth rates for the high as well as these inequality relationships.
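For readers who want to experiment, here is one way the inputs to such a regression might be assembled in Python. It is a sketch under stated assumptions: the exact variables in my regressions are not spelled out in this post, so the two proximity ratios below, the function name `period_features`, and the choice to drop an incomplete final block are illustrative guesses only.

```python
def period_features(opens, highs, lows, block=30):
    """Group daily bars into consecutive blocks and build regression inputs.

    Returns one row per block (after the first) with:
      open_vs_prev_high, open_vs_prev_low  -> explanatory variables
      high_growth                          -> target variable
    An incomplete block at the end of the data is dropped.
    """
    periods = []
    for start in range(0, len(opens) - block + 1, block):
        end = start + block
        periods.append({
            "open": opens[start],              # first open of the block
            "high": max(highs[start:end]),     # highest high in the block
            "low": min(lows[start:end]),       # lowest low in the block
        })

    rows = []
    for prev, cur in zip(periods, periods[1:]):
        rows.append({
            "open_vs_prev_high": cur["open"] / prev["high"] - 1.0,
            "open_vs_prev_low": cur["open"] / prev["low"] - 1.0,
            "high_growth": cur["high"] / prev["high"] - 1.0,
        })
    return rows


# Tiny made-up example with 2-day blocks, for illustration only.
rows = period_features(
    opens=[10.0, 11.0, 12.0, 13.0],
    highs=[11.0, 12.0, 13.0, 14.0],
    lows=[9.0, 10.0, 11.0, 12.0],
    block=2,
)
print(rows)
```

A positive `open_vs_prev_high` corresponds to the case described above, where the current period opens above the previous period’s high.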

Findings

Here is another one of those charts showing the “hit-rate” for predictions of the sign of period-by-period growth rates for the high. In this case, the chart refers to daily trading data. The chart graphs 30 day moving averages of the proportions of time in which the predictive model forecasts the correct sign of the change or growth in the target or dependent variable – the growth rate of daily highs (for consecutive trading days). Note that for recent years, the “hit rate” of the predictive model approaches 90 percent, and these are all out-of-sample predictions.
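The moving-average hit rate behind a chart like this can be computed with a short Python function. This is a minimal sketch: `rolling_hit_rate` is a hypothetical name, and treating zero growth as “not up” is an assumption of mine.

```python
def rolling_hit_rate(predicted, actual, window=30):
    """Moving average of sign agreement between predicted and actual growth."""
    # A hit is scored when both series are positive, or both are not positive.
    hits = [1.0 if (p > 0) == (a > 0) else 0.0 for p, a in zip(predicted, actual)]
    return [sum(hits[i - window:i]) / window for i in range(window, len(hits) + 1)]


# Made-up growth rates with a 2-day window, for illustration only.
print(rolling_hit_rate([0.01, -0.02, 0.01, 0.03], [0.02, 0.01, 0.005, -0.01], window=2))
```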

The relationship for the Hang Seng Index, thus, is powerful. Similarly impressive relationships can be derived to predict the daily lows and their direction of change.

But the result I really like with this data is developed by grouping the daily trading data into 30 day intervals.

If you do this, you develop a tool which apparently is quite capable of predicting turning points in the Hang Seng.

Thus, between April 2005 and August 2012, a 30-day predictive model captures many of the key features of inflection and turning in the Hang Seng High for comparable periods.

Note that the predictive model makes these forecasts of the high for a period out-of-sample. All the relationships are estimated over historical data which do not include the high (or low) being predicted for the coming 30 day period. Only the opening price for the Hang Seng for that period is necessary.
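The out-of-sample discipline described here amounts to a walk-forward loop: refit on everything before period t, then forecast period t. A minimal Python sketch follows, using a single explanatory variable for brevity; the name `walk_forward` and the `min_train` parameter are illustrative, and my actual models use more than one regressor.

```python
def walk_forward(x, y, min_train=20):
    """Forecast y[t] from x[t] using a line fitted only on data before t.

    Assumes the training x values are not all identical, otherwise the
    slope is undefined.
    """
    forecasts = []
    for t in range(min_train, len(y)):
        xs, ys = x[:t], y[:t]                # history strictly before t
        mx, my = sum(xs) / t, sum(ys) / t
        sxx = sum((a - mx) ** 2 for a in xs)
        sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
        slope = sxy / sxx
        intercept = my - slope * mx
        forecasts.append(intercept + slope * x[t])
    return forecasts
```

Here `x[t]` stands for something known at the start of period t, such as a ratio built from that period’s opening price, so no information from the period being forecast leaks into the estimate.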

Concluding Thoughts

I do not present the regression results here, but am pleased to share further information with readers who respond in the Comments section of this blog (title “Request for High/Low Model Information”) or who send requests to the following mail address: Clive Jones, PO Box 1009, Boulder, CO 80306 USA.

Top image from Ancient Chinese Fashion

# Further Research into Predicting Daily and Other Period High and Low Stock Prices

The Internet is an amazing scientific tool. Communication of results is much faster, although, of course, it potentially comes with dreck and misinformation. At the same time, pressures within the academy and Big Science seem to translate into a shocking amount of bogus research being touted. So maybe this free-for-all on the Web is where it’s at, if you are trying to get up to speed on new findings.

So this post today seeks to nail down some further and key points about predicting the high and low of stocks over various periods – conventionally, daily, weekly, and monthly periods, but also, as I have discovered, highs and lows over consecutive blocks of trading days ranging from 1 to 60 days, and probably more.

My recent posts focus on the SPY exchange traded fund, which tracks the S&P 500.

Yesterday, I formulated my general findings as follows:

For every period from daily periods to 60 day periods I have investigated, the high and low prices are “relatively” predictable and the direction of change from period to period is predictable, in backcasting analysis, about 70-80 percent of the time, on average.

In this post, let me show you the same basic relationship for a common stock – Ford Motor stock (F). I also consider data from the 1970’s, as well as recent data, to underline that modern program or high-speed computer-based algorithms have nothing to do with the underlying pattern.

I also show that the predictive model for the high in a period successfully captures turning points in the stock price in the 1970’s and more recently for 2008-2009.

Approach

Yahoo Finance, my free source of daily trading data, has history for Ford Motor stock dating back to June 1, 1972, charted as follows.

Now, the predictive models for the daily high and low stock price are formulated, as before, keying off the opening price in each trading day. One of the key relationships is the proximity of the daily opening price to the previous period high. The other key relationship is the proximity of the daily opening price to the previous period low. Ordinary least squares (OLS) regression models can be developed which do a good job of predicting the direction of change of the daily high and low, based on knowledge of the opening price for the day.

Predicting the Direction of Change of the High

As before, these models make correct predictions regarding the directions of change of the high and low about 70 percent of the time.

Here are 30 period moving averages for the 1970’s, showing the proportions of time the predictive model for the daily high is right about the direction of change.

So the underlying relationship definitely holds in this age in which computer modeling of trading was in its infancy.

Here is a similar chart for the first decade of this century.

So whether we are considering the 1970’s or the last ten years, these predictive models do well in forecasting the direction of change of the high in daily (and it turns out other) periods.

Predicting Turning Points

We can make the same type of comparison – between the 1970’s and more recent years – for the capability of the predictive models to forecast turning points in the stock high (or low).

To do this usually requires aggregating the stock data. In the charts below, I aggregate to 7 trading day periods – not quite the same as weekly periods, since weekly segmentation can be short a day and so forth.

So the high which the predictive model focuses on is the high for the coming seven trading days, given the current day opening price.

Here are two charts, one for dates in the 1970’s and the other for a period in the recession of 2008-2009. For each chart I estimate OLS regressions with data predating each forecast of the high, based on blocks of 7 trading days.

These predictions of the high crisply capture most of the important turning and inflection point features.

The application of similar predictive models for the 2008-2009 period is a little choppier, but does nail many of the important swings in the direction of change of the high of Ford Motor stock.

Concluding Thoughts

Well, this relationship between the opening prices and previous period highs and lows is highly predictive of the direction of change of the highs and lows in the current period – which can be a span of time from a day to 60 days in my findings.

These predictive models work for the S&P 500 and for individual stocks, like Ford Motor (and I might add Exxon and Microsoft).

They work in recent time periods and way back in the 1970’s.

And there’s more – for example, one could argue these patterns in the high and low prices are fractal, in the sense they represent “self similarity” at all (really many or a range of) time scales.

This is literally a new and fundamental regularity in stock prices.

Why does this work?

Well, the predictive models are closely related to very simple momentum trading strategies. But I think there is a lot of research to be done here. If you want further detail on any of this, please put your request in the Comments with the heading “Request for High/Low Model Information.”

Top picture from Strategic Monk.

# Forecasting the S&P 500 – Short and Long Time Horizons

Friends and acquaintances know that I believe I have discovered amazing, deep, and apparently simple predictability in aspects of the daily, weekly, monthly movement of stock prices.

People say – “don’t blog about it, keep it to yourself, and use it to make a million dollars.” That does sound attractive, but I guess I am a data scientist, rather than a stock trader. Not only that, but the pattern looks to be self-fulfilling. Generally, the result of traders learning about this pattern should be to reinforce, rather than erase, it. There seems to be no other explanation consistent with its long historical vintage or the breadth of its presence. And that is big news to those of us who like to linger in the forecasting zoo.

I am going to share my discovery with you, at least in part, in this blog post.

But first, let me state some ground rules and describe the general tenor of my analysis. I am using OLS regression in spreadsheets at first, to explore the data. I am only interested, really, in models which have significant out-of-sample prediction capabilities. This means I estimate the regression model over a set of historical data and then use that model to predict – in this case the high and low of the SPY exchange traded fund. The predictions (or “retrodictions” or “backcasts”) are for observations on the high and low stock prices for various periods not included in the data used to estimate the model.
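To make the out-of-sample idea concrete, here is a minimal sketch of an expanding-window backcast, written in Python with NumPy rather than in a spreadsheet. It assumes a single explanatory variable for brevity; the function name and the `min_train` parameter are hypothetical.

```python
import numpy as np


def expanding_backcast(x, y, min_train=20):
    """One-step-ahead out-of-sample predictions from an expanding-window OLS.

    For each t >= min_train, fit y = a + b*x on observations before t only,
    then predict y[t] from x[t]. Earlier entries are left as NaN.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    preds = np.full(len(y), np.nan)
    for t in range(min_train, len(y)):
        design = np.column_stack([np.ones(t), x[:t]])
        coef, *_ = np.linalg.lstsq(design, y[:t], rcond=None)
        preds[t] = coef[0] + coef[1] * x[t]
    return preds
```

The point of the loop is that no observation is ever predicted by a model that saw it during estimation.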

Now let’s look at the sort of data I use. The following table is from Yahoo Finance for the SPY. The site allows you to download this data into a spreadsheet, although you have to invert the order of the dating with a sort on the date. Note that all data is for trading days, and when I speak of N-day periods in the following, I mean periods of N trading days.

OK, now let me state my major result.

For every period from daily periods to 60 day periods I have investigated, the high and low prices are “relatively” predictable and the direction of change from period to period is predictable, in backcasting analysis, about 70-80 percent of the time, on average.

To give an example of a backcasting analysis, consider this chart from the period of free-fall in markets during 2008-2009, the Great Recession.

Now note that the indicated lines for the forecasts are not, strictly-speaking, 40-day-ahead forecasts. The forecasts are for the level of the high and low prices of the SPY which will be attained in each period of 40 trading days.

But the point is these rather time-indeterminate forecasts, when graphed alongside the actual highs and lows for the 40 trading day periods in question, are relatively predictive.

More to the point, the forecasts suffice to signal a key turning point in the SPY. Of course, it is simple to relate the high and low of the SPY for a period to relevant measures of the average or closing stock prices.

So seasoned forecasters and students of the markets and economics should know by this example that we are in terra incognita. Forecasting turning points out-of-sample is literally the toughest thing to do in forecasting, and certainly with respect to the US stock market.

Many times technical analysts claim to predict turning points, but their results may seem more artistic, involving subtle interpretations of peaks and shoulders, as well as levels of support.

Now I don’t want to dismiss technical analysis, since, indeed, I believe my findings may prove out certain types of typical results in technical analysis. Or at least I can see a way to establish that claim, if things work out empirically.

Forecast of SPY High And Low for the Next Period of 40 Trading Days

What about the coming period of 40 trading days, starting from this morning’s (January 22, 2015) opening price for the SPY – \$203.99?

Well, subject to qualifications I will state further on here, my estimates suggest the high for the period will be in the range of \$215 and the period low will be around \$194. Cents attached to these forecasts would be, of course, largely spurious precision.

In my opinion, these predictions are solid enough to suggest that no stock market crash is in the cards over the next 40 trading days, nor will there be a huge correction. Things look to trade within a range not too distant from the current situation, with some likelihood of higher highs.

It sounds a little like weather forecasting.

The Basic Model

Here is the actual regression output for predicting the 40 trading day high of the SPY.

This is simpler than many of the models I have developed, since it relies on only one explanatory variable, designated X Variable 1 in the Excel regression output. This explanatory variable is the ratio of the current opening price to the high for the previous 40 trading day period, all minus 1.

Let’s call this -1+ O/PH. Instances of -1+ O/PH are generated for data bunched by 40 trading day periods, and put into the regression against the growth in consecutive highs for these 40 day periods.

So what happens is this, apparently.

Everything depends on the opening price. If the high for the previous period equals the opening price, the predicted high for the next 40 day period will be the same as the high for the previous 40 day period.

If the previous high is less than the opening price, the prediction is that the next period high will be higher. Otherwise, the prediction is that the next period high will be lower.

This then looks like a trading rule which even the numerically challenged could follow.
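For anyone who prefers code to spreadsheets, the one-variable regression can be sketched as follows. This is an illustration under stated assumptions, not the actual Excel workbook: I assume "growth in consecutive highs" means the ratio of the current period high to the previous period high, minus 1, and the function names are hypothetical.

```python
import numpy as np


def fit_high_model(opens, highs):
    """Regress growth in consecutive period highs on -1 + O/PH.

    opens[i] is the opening price of period i and highs[i] is its high.
    Returns (intercept, slope).
    """
    opens = np.asarray(opens, dtype=float)
    highs = np.asarray(highs, dtype=float)
    x = opens[1:] / highs[:-1] - 1.0   # -1 + O/PH
    y = highs[1:] / highs[:-1] - 1.0   # assumed definition of growth in the high
    design = np.column_stack([np.ones(len(x)), x])
    (intercept, slope), *_ = np.linalg.lstsq(design, y, rcond=None)
    return intercept, slope


def predict_next_high(open_price, previous_high, intercept, slope):
    """Predicted high for the coming period, given its opening price."""
    growth = intercept + slope * (open_price / previous_high - 1.0)
    return previous_high * (1.0 + growth)
```

With an intercept near zero and a positive slope, `predict_next_high` reproduces the rule just described: an open above the previous high predicts a higher high, and an open below it predicts a lower one.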

And this sort of relationship is not something that has just emerged with quants and high frequency trading. On the contrary, it is possible to find the same type of rule operating with, say, Exxon’s stock (XOM) in the 1970’s and 1980’s.

But, before jumping to test this out completely, understand that the above regression is, in terms of most of my analysis, partial, missing at least one other important explanatory variable.

Previous posts, which employ similar forecasting models for daily, weekly, and monthly trading periods, show that these models can predict the direction of change of the period highs with about 70 to 80 percent accuracy (See, for example, here).

Provisos and Qualifications

In deploying OLS regression analysis, in Excel spreadsheets no less, I am aware there are many refinements which, logically, might be developed and which may improve forecast accuracy.

One thing I want to stress is that residuals of the OLS regressions on the growth in the period highs generally are not normally distributed. The distribution tends to be very peaked, reminiscent of discussions earlier in this blog of the Laplace distribution for Microsoft stock prices.

There also is first order serial correlation in many of these regressions. And, my software indicates that there could be autocorrelations extending deep into the historical record.

Finally, the regression coefficients may vary over the historical record.
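The first two of these provisos can be checked with a few lines of code. The sketch below is an assumption about how one might do it, not the output of my software: it computes excess kurtosis (0 for a normal distribution, 3 for a Laplace), the first-order autocorrelation of the residuals, and the Durbin-Watson statistic. The function name is hypothetical.

```python
import numpy as np


def residual_diagnostics(residuals):
    """Return (excess kurtosis, lag-1 autocorrelation, Durbin-Watson statistic)."""
    e = np.asarray(residuals, dtype=float)
    e = e - e.mean()                       # OLS residuals with an intercept already average zero
    var = np.mean(e ** 2)
    excess_kurtosis = np.mean(e ** 4) / var ** 2 - 3.0
    lag1_autocorr = np.sum(e[1:] * e[:-1]) / np.sum(e ** 2)
    durbin_watson = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
    return excess_kurtosis, lag1_autocorr, durbin_watson
```

A Durbin-Watson value well below 2 points to positive first-order serial correlation, and a large positive excess kurtosis is the "very peaked" shape mentioned above.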

Bottom Line

I like Rob Hyndman’s often drawn distinction between modeling and reality. Somewhere Hyndman suggests that no model is right.

But this class of models has an extremely logical motivation, and is, as I say, relatively predictive – predictive enough to be useful in a number of contexts.

Momentum traders for years apparently have looked at the opening price and compared it with the highs (and lows) for previous periods, extending 60 days or more into history, and decided whether to trade. If the opening price is greater than the past high, the next high is anticipated to be even higher. On this basis, stock may be purchased. That action tends to reinforce the relationship. So, in some sense, this is a self-fulfilling relationship.

To recapitulate – I can show you iron-clad, incontrovertible evidence that some fairly simple models built on daily trading data produce workable forecasts of the high and low for stock indexes and stocks. These forecasts are available for a variety of time periods, and, apparently, in backcasts can indicate turning points in the market.

As I say, feel free to request further documentation. I am preparing a write-up for a journal, and I think I can find a way to send out versions of this.

You can contact me confidentially via the Comments box below. Leave your email or phone number. Title the Comment “Request for High/Low Model Information” and the webmeister will forward it to me without having your request listed in the side panel of the blog.

# Forecasting the Downswing in Markets

I got a chance to work with the problem of forecasting during a business downturn at Microsoft from 2007 to 2010.

Usually, a recession is not good for a forecasting team. There is a tendency to shoot the messenger bearing the bad news. Cost cutting often falls on marketing first, which often is where forecasting is housed.

But Microsoft in 2007 was a company which, based on past experience, looked on recessions with a certain aplomb. Company revenues continued to climb during the recession of 2001 and also during the previous recession in the early 1990’s, when company revenues were smaller.

But the plunge in markets in late 2008 was scary. Microsoft’s executive team wanted answers. Since there were few forthcoming from the usual market research vendors – vendors seemed sort of “paralyzed” in bringing out updates – management looked within the organization.

I was part of a team that got this assignment.

We developed a model to forecast global software sales across more than 80 national and regional markets. Forecasts, at one point, were utilized in deliberations of the finance directors, developing budgets for FY2010. Our model, by several performance comparisons, did as well as or better than what was available in the belated efforts of the market research vendors.

This was a formative experience for me, because a lot of what I did, as the primary statistical or econometric modeler, was seat-of-the-pants. But I tried a lot of things.

That’s one reason why this blog explores method and technique – an area of forecasting that, currently, is exploding.

Importance of the Problem

Forecasting the downswing in markets can be vitally important for an organization, or an investor, but the first requirement is to keep your wits. All too often there are across-the-board cuts.

A targeted approach can be better. All market corrections, inflections, and business downturns come to an end. Growth resumes somewhere, and then picks up generally. Companies that cut to the bone are poorly prepared for the future and can pay heavily in terms of loss of market share. Also, re-assembling the talent pool currently serving the organization can be very expensive.

But how do you set reasonable targets, in essence – make intelligent decisions about cutbacks?

I think there are many more answers than are easily available in the management literature at present.

But one thing you need to do is get a handle on the overall swing of markets. How long will the downturn continue, for example?

For someone concerned with stocks, how long and how far will the correction go? Obviously, perspective on this can inform shorting the market, which, my research suggests, is an important source of profits for successful investors.

A New Approach – Deploying high frequency data

Based on recent explorations, I’m optimistic it will be possible to get several weeks lead-time on releases of key US quarterly macroeconomic metrics in the next downturn.

My last post, for example, has this graph.

Note how the orange line hugs the blue line during the descent 2008-2009.

This orange line is the out-of-sample forecast of quarterly nominal GDP growth based on the quarter previous GDP and suitable lagged values of the monthly Chicago Fed National Activity Index. The blue line, of course, is actual GDP growth.

The official name for this is nowcasting, and MIDAS (Mixed Data Sampling) techniques are widely discussed approaches to this problem.

But because I was only mapping monthly and not, say, daily values onto quarterly values, I was able to simply specify the previous quarter’s value and fifteen lagged values of the CFNAI in a straightforward regression.
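Here is a minimal sketch of how such a regression might be set up. The alignment is an assumption made for illustration: I take the monthly series to start in the first month of the first quarter, and I use the fifteen monthly values ending with the last month of the target quarter. The function name is hypothetical.

```python
import numpy as np


def build_design(quarterly, monthly, n_lags=15):
    """Rows: [1, previous quarter's value, n_lags most recent monthly values].

    Assumes monthly[3*q : 3*q + 3] are the three months of quarter q.
    Returns the design matrix and the matching quarterly targets.
    """
    quarterly = np.asarray(quarterly, dtype=float)
    monthly = np.asarray(monthly, dtype=float)
    rows, targets = [], []
    for q in range(1, len(quarterly)):
        end = 3 * q + 3                          # one past the last month of quarter q
        if end - n_lags < 0 or end > len(monthly):
            continue                             # not enough monthly history for this quarter
        recent = monthly[end - n_lags:end][::-1]  # most recent month first
        rows.append(np.concatenate([[1.0, quarterly[q - 1]], recent]))
        targets.append(quarterly[q])
    return np.array(rows), np.array(targets)
```

The resulting matrix has seventeen columns, so an ordinary least squares fit (for example with `np.linalg.lstsq`) needs comfortably more than seventeen quarters of data.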

And in reviewing literature on MIDAS and mixing data frequencies, it is clear to me that, often, it is not necessary to calibrate polynomial lag expressions to encapsulate all the higher frequency data, as in the classic MIDAS approach.

Instead, one can deploy all the “many predictors” techniques developed over the past decade or so, starting with the work of Stock and Watson and factor analysis. These methods also can bring “ragged edge” data into play, or data with different release dates, if not different fundamental frequencies.

So, for example, you could specify daily data against quarterly data, involving perhaps several financial variables with deep lags – maybe totaling more explanatory variables than observations on the quarterly or lower frequency target variable – and wrap the whole estimation up in a bundle with ridge regression or the LASSO. You are really only interested in the result, the prediction of the next value for the quarterly metric, rather than unbiased estimates of the coefficients of explanatory variables.
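A ridge regression of this kind can be sketched in a few lines of NumPy. This is a minimal illustration, assuming standardized predictors and an unpenalized intercept; the function names and the `penalty` parameter are hypothetical, and in practice the penalty would be chosen by out-of-sample testing.

```python
import numpy as np


def ridge_fit(X, y, penalty=1.0):
    """Ridge coefficients on standardized predictors; intercept left unpenalized."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    mean, scale = X.mean(axis=0), X.std(axis=0)
    scale[scale == 0] = 1.0                 # guard against constant columns
    Z = (X - mean) / scale
    y_mean = y.mean()
    # Dual form: solves an n-by-n system, which is convenient when
    # predictors outnumber observations.
    alpha = np.linalg.solve(Z @ Z.T + penalty * np.eye(len(y)), y - y_mean)
    beta = Z.T @ alpha
    return {"mean": mean, "scale": scale, "intercept": y_mean, "beta": beta}


def ridge_predict(model, X_new):
    """Predict the target for new rows of predictors."""
    Z = (np.asarray(X_new, dtype=float) - model["mean"]) / model["scale"]
    return model["intercept"] + Z @ model["beta"]
```

The dual form gives the same coefficients as the usual ridge formula, but it only requires inverting a matrix whose size is the number of quarterly observations.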

Or you could run a principal component analysis of the data on explanatory variables, including a rag-tag collection of daily, weekly, and monthly metrics, as well as one or more lagged values of the lower frequency target variable (quarterly GDP growth in the graph above).

Dynamic principal components also are a possibility, if anyone can figure out the estimation algorithms to move into a predictive mode.
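The ordinary (static) principal components step, at least, is easy to sketch. The following is an illustration only, assuming the predictors have already been aligned to the quarterly frequency and arranged as columns of a matrix; the function name is hypothetical.

```python
import numpy as np


def principal_components(X, n_components=3):
    """Return (scores, loadings, share of variance) for the leading components."""
    X = np.asarray(X, dtype=float)
    mean, scale = X.mean(axis=0), X.std(axis=0)
    scale[scale == 0] = 1.0                       # guard against constant columns
    Z = (X - mean) / scale                        # standardize each predictor
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return scores, loadings, explained
```

The columns of `scores` can then stand in for the original predictors in a small regression on the quarterly target.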

Being able to put together predictor variables of all different frequencies and reporting periods is really exciting. Maybe in some way this is really what Big Data means in predictive analytics. But, of course, progress in this area is wholly empirical, it not being clear what higher frequency series can successfully map onto the big news indices, until the analysis is performed. And I think it is important to stress the importance of out-of-sample testing of the models, perhaps using cross-validation to estimate parameters if there is simply not enough data.

One thing I believe is for sure, however, and that is we will not be in the dark for so long during the next major downturn. It will be possible to deploy all sorts of higher frequency data to chart the trajectory of the downturn, probably allowing a call on the turning point sooner than if we waited for the “big number” to come out officially.

Top picture courtesy of the Bridgespan Group