
Time-Varying Coefficients and the Risk Environment for Investing

My research provides strong support for variation of key forecasting parameters over time, probably reflecting the underlying risk environment facing investors. This type of variation is suggested by Lo (2005).

So I find evidence for time varying coefficients for “proximity variables” that predict the high or low of a stock in a period, based on the spread between the opening price and the high or low price of the previous period.

Figure 1 charts the coefficients associated with explanatory variables that I call OPHt and OPLt. These coefficients are estimated in rolling regressions, each using five years of history of trading day data for the S&P 500 stock index. The chart is generated with more than 3000 separate regressions.

Here OPHt is the difference between the opening price and the high of the previous period, scaled by the high of the previous period. Similarly, OPLt is the difference between the opening price and the low of the previous period, scaled by the low of the previous period. Such rolling regressions sometimes are called “adaptive regressions.”
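A minimal sketch of these definitions follows. It assumes the dependent variable is the growth in the period high and, for brevity, uses a single regressor per regression; the post does not spell out the exact specification. The function names and the `window` argument are illustrative and are not taken from the original spreadsheets.

```python
def proximity_variables(opens, highs, lows):
    """OPH and OPL for periods 1..n-1, each scaled by the prior period's value."""
    n = len(opens)
    oph = [(opens[t] - highs[t - 1]) / highs[t - 1] for t in range(1, n)]
    opl = [(opens[t] - lows[t - 1]) / lows[t - 1] for t in range(1, n)]
    return oph, opl


def ols_slope(x, y):
    """Least-squares slope of y on x (single regressor with intercept)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def rolling_slopes(x, y, window):
    """Re-estimate the slope over each trailing window of observations."""
    return [ols_slope(x[i - window:i], y[i - window:i])
            for i in range(window, len(x) + 1)]
```

With five years of daily data per regression, `window` would be on the order of 1,260 trading days; that figure is an approximation, not a number from the post.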

Figure 1 Evidence for Time Varying Coefficients – Estimated Coefficients of OPHt and OPLt Over Study Sample

TvaryCoeff

Note the abrupt changes in the values of the coefficients of OPHt and OPLt in 2008.

These plausibly reflect stock market volatility in the Great Recession. After 2010 the value of both coefficients tends to move back to levels seen at the beginning of the study period.

This suggests trajectories influenced by the general climate of risk for investors and their risk preferences.

I am increasingly convinced the influence of these so-called proximity variables is based on heuristics such as “buy when the opening price is greater than the previous period high” or “sell, if the opening price is lower than the previous period low.”

Recall, for example, that the coefficient of OPHt measures the influence of the spread between the opening price and the previous period high on the growth in the daily high price.

The trajectory, shown in the narrow, black line, trends up in the approach to 2007. This may reflect investors’ greater inclination to buy the underlying stocks, when the opening price is above the previous period high. But then the market experiences the crisis of 2008, and investors abruptly back off from their eagerness to respond to this “buy” signal. With the onset of the Great Recession, investors become increasingly averse to acting on such “buy” signals, only starting to recover their nerve after 2013.

A parallel interpretation of the trajectory of the coefficient of OPLt can be developed based on developments 2008-2009.

Time variation of these coefficients also has implications for out-of-sample forecast errors.

Thus, late 2008, when values of the coefficients of both OPHt and OPLt make almost vertical movements in opposite directions, is the period of maximum out-of-sample forecast errors. Forecast errors for daily highs, for example, reach a maximum of 8 percent in October 2008. This can be compared with typical errors of less than 0.4 percent for out-of-sample forecasts of daily highs with the proximity variable regressions.

Heuristics

Finally, I recall a German forecasting expert discussing heuristics with an example from baseball. I will try to find his name and give him proper credit. But the idea is that an outfielder trying to catch a fly ball does not run calculations involving mass, angle, velocity, acceleration, windspeed, and so forth. Instead, basically, an outfielder runs toward the fly ball, keeping it at a constant angle in his vision, so that it falls into his glove at the last second. If the ball starts descending in his vision as he approaches it, it may fall on the ground before him. If it starts to float higher in his perspective as he runs to get under it, it may soar over him, landing further back in the field.

I wonder whether similar arguments can be advanced for the strategy of buying or selling based on spreads between the opening price in a period and the high and low prices in the previous period.

How Did My Forecast of the SPY High and Low Issued January 22 Do?

A couple of months ago, I applied the stock market forecasting approach based on what I call “proximity variables” to forward-looking forecasts – as opposed to “backcasts” testing against history.

I’m surprised now that I look back at this, because I offered a forecast for 40 trading days (a little foolhardy?).

In any case, I offered forecasts for the high and low of the exchange traded fund SPY, as follows:

What about the coming period of 40 trading days, starting from this morning’s (January 22, 2015) opening price for the SPY – $203.99?

Well, subject to qualifications I will state further on here, my estimates suggest the high for the period will be in the range of $215 and the period low will be around $194. Cents attached to these forecasts would be, of course, largely spurious precision.

In my opinion, these predictions are solid enough to suggest that no stock market crash is in the cards over the next 40 trading days, nor will there be a huge correction. Things look to trade within a range not too distant from the current situation, with some likelihood of higher highs.

It sounds a little like weather forecasting.

Well, 27 trading days have transpired since January 22, 2015 – more than half the proposed 40 associated with the forecast.

How did I do?

Here is a screenshot of the Yahoo Finance table showing opening, high, low, and closing prices since January 22, 2015.

SPYJan22etpassim

The bottom line – so far, so good. Neither the high nor low of any trading day has breached my proposed forecasts of $194 for the low and $215 for the high.

Now, I am pleased – a win just out of the gates with the new modeling approach.

However, I would caution readers seeking to use this for investment purposes. This approach recommends shorter term forecasts to focus in on the remaining days of the original forecast period. So, while I am encouraged the $215 high has not been breached, despite the hoopla about recent gains in the market, I don’t recommend taking $215 as an actual forecast at this point for the remaining 13 trading days – two or three weeks. Better forecasts are available from the model now.

“What are they?”

Well, there are a lot of moving parts in the computer programs to make these types of updates.

Still, it is interesting and relevant to forecasting practice – just how well do the models perform in real time?

So I am planning a new feature, a periodic update of stock market forecasts, with a look at how well these did. Give me a few days to get this up and running.

Analysis of Highs and Lows of the Hong Kong Hang Seng Index, 1987 to the Present

I have discovered a fundamental feature of stock market prices, relating to prediction of the highs and lows in daily, weekly, and monthly periods, and in other more arbitrary groupings of consecutive trading days.

What I have found is a degree of predictability previously unimagined with respect to forecasts of the high and low for a range of trading periods, extending from daily to 60 days so far.

Currently, I am writing up this research for journal submission, but I am documenting essential features of my findings on this blog.

A few days ago, I posted about the predictability of daily highs and lows for the SPY exchange traded fund. Subsequent posts highlight the generality of the result for the SPY, and more recently, for stocks such as common stock of the Ford Motor Company.

These posts present various graphs illustrating how well the prediction models for the high and low in periods capture the direction of change of the actual highs and lows. Generally, the models are right about 70 to 80 percent of the time, which is incredible.

Furthermore, since one of my long-standing concerns has been to get better forward perspective on turning points, I am particularly interested in the evidence that these models also do fairly well at predicting turning points.

Finally, it is easy to show that these predictive models for the highs and lows of stocks and stock indices over various periods are not simply creations of modern program trading. The same regularities can be identified in earlier periods before easy access to computational power, in the 1980s and early 1990s, for example.

Hong Kong’s Hang Seng Index

Today, I want to reach out and look at international data and present findings for Hong Kong’s Hang Seng Index. I suspect Chinese investors will be interested in these results. Perhaps, releasing this information to such an active community of traders will test my hypothesis that these are self-fulfilling predictions, to a degree, and knowledge of their existence intensifies their predictive power.

A few facts about the Hang Seng Index – The Hang Seng Index (HSI) is a free-float adjusted, capitalization-weighted index of approximately 40 of the larger companies on the Hong Kong exchange. First published in 1969, the HSI, according to Investopedia, covers approximately 65% of the total market capitalization of the Hong Kong Stock Exchange. It is currently maintained by HSI Services Limited, a wholly owned subsidiary of Hang Seng Bank – the largest bank registered and listed in Hong Kong in terms of market capitalization.

For data, I download daily open, high, low, close and other metrics from Yahoo Finance. This data begins with the last day in 1986, continuing to the present.

The Hang Seng is a volatile index, as the following chart illustrates.

HSI

Now there are peculiarities about the data on HSI from Yahoo. Trading volumes are zero until 2001, for example, after which time large positive values are to be found in the volume column. I assume HSI was initially a pure index and later came to be actually traded in some fashion.

Nevertheless, the same type of predictive models can be developed for the Hang Seng Index, as can be estimated for the SPY and the US stocks.

Again, the key variables in these predictive relationships are the proximity of the period opening price to the previous period high and the previous period low. I estimate regressions with variables constructed from these explanatory variables, mapping them onto growth in period-by-period highs with ordinary least squares (OLS). I find similar relationships for the Hang Seng in, say, a 30 day periodization to those I estimate for the SPY ETF. At the same time there are differences, one of the most notable being the markedly lower first order autocorrelation in the Hang Seng regression.

Essentially, higher growth rates for the period-over-previous-period high are predicted whenever the opening price of the current period is greater than the high of the previous period. There are other cases, however, and ultimately the rule is quantitative, taking into account the size of the growth rates for the high as well as these inequality relationships.

Findings

Here is another one of those charts showing the “hit-rate” for predictions of the sign of period-by-period growth rates for the high. In this case, the chart refers to daily trading data. The chart graphs 30 day moving averages of the proportion of the time the predictive model forecasts the correct sign of the change or growth in the target or dependent variable – the growth rate of daily highs (for consecutive trading days). Note that for recent years, the “hit rate” of the predictive model approaches 90 percent, and these are all out-of-sample predictions.

 HSIproportions
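The hit-rate calculation behind a chart of this kind can be sketched as follows. This is an illustration under simple assumptions, not the original computation: a "hit" is counted when the predicted and actual growth rates have the same sign (two zeros also count as a hit), and the function names are hypothetical.

```python
def sign(value):
    """Return 1, 0, or -1 according to the sign of value."""
    return (value > 0) - (value < 0)


def hits(predicted, actual):
    """1 where the predicted and actual growth rates share a sign, else 0."""
    return [1 if sign(p) == sign(a) else 0 for p, a in zip(predicted, actual)]


def moving_average(values, window):
    """Trailing moving average; one value per complete window."""
    return [sum(values[i - window:i]) / window
            for i in range(window, len(values) + 1)]


# Toy out-of-sample predictions versus realized growth in the daily high.
predicted = [0.01, -0.02, 0.03, -0.01]
actual = [0.02, -0.01, -0.01, -0.03]
hit_series = hits(predicted, actual)
hit_rate = moving_average(hit_series, window=2)
```

For the chart described here, `window` would be 30 trading days.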

The relationship for the Hang Seng Index, thus, is powerful. Similarly impressive relationships can be derived to predict the daily lows and their direction of change.

But the result I really like with this data is developed with grouping the daily trading data by 30 day intervals.

HSItp

If you do this, you develop a tool which apparently is quite capable of predicting turning points in the Hang Seng.

Thus, between April 2005 and August 2012, a 30-day predictive model captures many of the key features of inflection and turning in the Hang Seng High for comparable periods.

Note that the predictive model makes these forecasts of the high for a period out-of-sample. All the relationships are estimated over historical data which do not include the high (or low) being predicted for the coming 30 day period. Only the opening price for the Hang Seng for that period is necessary.
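Grouping daily data into consecutive blocks of N trading days can be sketched like this. The tuple layout and the choice to drop an incomplete trailing block are assumptions made for the example; the post does not say how a partial final period is handled.

```python
def group_into_periods(bars, n):
    """Collapse daily (open, high, low, close) bars into n-day periods.

    bars must be in chronological order. Each period takes the open of its
    first day, the highest high, the lowest low, and the close of its last
    day. A trailing block shorter than n days is dropped.
    """
    periods = []
    for start in range(0, len(bars) - n + 1, n):
        block = bars[start:start + n]
        periods.append({
            "open": block[0][0],
            "high": max(bar[1] for bar in block),
            "low": min(bar[2] for bar in block),
            "close": block[-1][3],
        })
    return periods


# Toy example: five daily bars grouped into 2-day periods.
daily = [
    (100.0, 102.0, 99.0, 101.0),
    (101.0, 104.0, 100.0, 103.0),
    (103.0, 105.0, 101.0, 102.0),
    (102.0, 103.0, 98.0, 99.0),
    (99.0, 100.0, 97.0, 98.0),
]
two_day = group_into_periods(daily, 2)
```

Only the period's opening price and the previous period's high and low enter the forecast, which is why the prediction can be made at the open of each new block.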

Concluding Thoughts

I do not present the regression results here, but am pleased to share further information with readers who respond in the Comments section of this blog (title “Request for High/Low Model Information”) or who send requests to the following mail address: Clive Jones, PO Box 1009, Boulder, CO 80306 USA.


Forecasting the S&P 500 – Short and Long Time Horizons

Friends and acquaintances know that I believe I have discovered amazing, deep, and apparently simple predictability in aspects of the daily, weekly, monthly movement of stock prices.

People say – “don’t blog about it, keep it to yourself, and use it to make a million dollars.” That does sound attractive, but I guess I am a data scientist, rather than stock trader. Not only that, but the pattern looks to be self-fulfilling. Generally, the result of traders learning about this pattern should be to reinforce, rather than erase, it. There seems to be no other explanation consistent with its long historical vintage, nor the broadness of its presence. And that is big news to those of us who like to linger in the forecasting zoo.

I am going to share my discovery with you, at least in part, in this blog post.

But first, let me state some ground rules and describe the general tenor of my analysis. I am using OLS regression in spreadsheets at first, to explore the data. I am only interested, really, in models which have significant out-of-sample prediction capabilities. This means I estimate the regression model over a set of historical data and then use that model to predict – in this case the high and low of the SPY exchange traded fund. The predictions (or “retrodictions” or “backcasts”) are for observations on the high and low stock prices for various periods not included in the data used to estimate the model.

Now let’s look at the sort of data I use. The following table is from Yahoo Finance for the SPY. The site allows you to download this data into a spreadsheet, although you have to invert the order of the dating with a sort on the date. Note that all data is for trading days, and when I speak of N-day periods in the following, I mean periods of N trading days.

Yahoo

OK, now let me state my major result.

For every period from daily periods to 60 day periods I have investigated, the high and low prices are “relatively” predictable and the direction of change from period to period is predictable, in backcasting analysis, about 70-80 percent of the time, on average.

To give an example of a backcasting analysis, consider this chart from the period of free-fall in markets during 2008-2009, the Great Recession (click to enlarge).

40dayforecast

Now note that the indicated lines for the forecasts are not, strictly-speaking, 40-day-ahead forecasts. The forecasts are for the level of the high and low prices of the SPY which will be attained in each period of 40 trading days.

But the point is these rather time-indeterminate forecasts, when graphed alongside the actual highs and lows for the 40 trading day periods in question, are relatively predictive.

More to the point, the forecasts suffice to signal a key turning point in the SPY. Of course, it is simple to relate the high and low of the SPY for a period to relevant measures of the average or closing stock prices.

So seasoned forecasters and students of the markets and economics should know by this example that we are in terra incognita. Forecasting turning points out-of-sample is literally the toughest thing to do in forecasting, and certainly with respect to the US stock market.

Many times technical analysts claim to predict turning points, but their results may seem more artistic, involving subtle interpretations of peaks and shoulders, as well as levels of support.

Now I don’t want to dismiss technical analysis, since, indeed, I believe my findings may prove out certain types of typical results in technical analysis. Or at least I can see a way to establish that claim, if things work out empirically.

Forecast of SPY High And Low for the Next Period of 40 Trading Days

What about the coming period of 40 trading days, starting from this morning’s (January 22, 2015) opening price for the SPY – $203.99?

Well, subject to qualifications I will state further on here, my estimates suggest the high for the period will be in the range of $215 and the period low will be around $194. Cents attached to these forecasts would be, of course, largely spurious precision.

In my opinion, these predictions are solid enough to suggest that no stock market crash is in the cards over the next 40 trading days, nor will there be a huge correction. Things look to trade within a range not too distant from the current situation, with some likelihood of higher highs.

It sounds a little like weather forecasting.

The Basic Model

Here is the actual regression output for predicting the 40 trading day high of the SPY.

40Highreg

This is simpler than many of the models I have developed, since it relies on only one explanatory variable, designated X Variable 1 in the Excel regression output. This explanatory variable is the ratio of the current opening price to the previous high for the 40 day trading period, all minus 1.

Let’s call this -1+ O/PH. Instances of -1+ O/PH are generated for data bunched by 40 trading day periods, and put into the regression against the growth in consecutive highs for these 40 day periods.

So what happens is this, apparently.

Everything depends on the opening price. If the high for the previous period equals the opening price, the predicted high for the next 40 day period will be the same as the high for the previous 40 day period.

If the previous high is less than the opening price, the prediction is that the next period high will be higher. Otherwise, the prediction is that the next period high will be lower.

This then looks like a trading rule which even the numerically challenged could follow.
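The rule can be written out as a short function. This is a sketch under stated assumptions: the intercept is taken to be zero, which is what the statement "equal open and previous high imply an unchanged high" requires, and the slope is a placeholder argument because the estimated coefficient appears only in the regression output image. The function name is hypothetical.

```python
def predict_next_high(open_price, prev_high, slope, intercept=0.0):
    """Predicted period high from the one-variable model.

    x is the explanatory variable -1 + O/PH. The regression gives the
    predicted growth in the high, which is applied to the previous high.
    """
    x = open_price / prev_high - 1.0
    predicted_growth = intercept + slope * x
    return prev_high * (1.0 + predicted_growth)


# slope=1.0 is an illustrative value only, not the estimated coefficient.
unchanged = predict_next_high(open_price=100.0, prev_high=100.0, slope=1.0)
higher = predict_next_high(open_price=105.0, prev_high=100.0, slope=1.0)
lower = predict_next_high(open_price=95.0, prev_high=100.0, slope=1.0)
```

With any positive slope and a zero intercept, the three cases reproduce the rule in the text: an open equal to the previous high leaves the high unchanged, an open above it predicts a higher high, and an open below it predicts a lower high.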

And this sort of relationship is not something that has just emerged with quants and high frequency trading. On the contrary, it is possible to find the same type of rule operating with, say, Exxon’s stock (XOM) in the 1970s and 1980s.

But, before jumping to test this out completely, understand that the above regression is, in terms of most of my analysis, partial, missing at least one other important explanatory variable.

Previous posts, which employ similar forecasting models for daily, weekly, and monthly trading periods, show that these models can predict the direction of change of the period highs with about 70 to 80 percent accuracy (See, for example, here).

Provisos and Qualifications

In deploying OLS regression analysis, in Excel spreadsheets no less, I am aware there are many refinements which, logically, might be developed and which may improve forecast accuracy.

One thing I want to stress is that residuals of the OLS regressions on the growth in the period highs generally are not normally distributed. The distribution tends to be very peaked, reminiscent of discussions earlier in this blog of the Laplace distribution for Microsoft stock prices.
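One simple way to quantify "very peaked" is the excess kurtosis of the residuals, which is 0 for a normal distribution and 3 for a Laplace distribution. The sketch below uses the plain moment-based estimator; it is offered as an illustration of the diagnostic and is not the check used in the original analysis.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2) - 3.0


# Toy residuals: mostly near zero with two large outliers (peaked, fat tails).
peaked = [0.0] * 8 + [5.0, -5.0]
# Toy residuals spread evenly (flatter than normal).
flat = [-2.0, -1.0, 0.0, 1.0, 2.0]
```

A clearly positive value on the regression residuals would be consistent with the peaked shape described here.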

There also is first order serial correlation in many of these regressions. And, my software indicates that there could be autocorrelations extending deep into the historical record.
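First order serial correlation in regression residuals is commonly checked with the lag-1 autocorrelation or the Durbin-Watson statistic. The sketch below shows both on toy residuals; the function names are illustrative, and nothing here reproduces the software output mentioned above.

```python
def lag1_autocorrelation(resid):
    """Sample autocorrelation of the residuals at lag 1."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [r - mean for r in resid]
    numerator = sum(dev[t] * dev[t - 1] for t in range(1, n))
    denominator = sum(d * d for d in dev)
    return numerator / denominator


def durbin_watson(resid):
    """Durbin-Watson statistic: near 2 means little first order correlation,
    below 2 suggests positive correlation, above 2 negative correlation."""
    numerator = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    denominator = sum(r * r for r in resid)
    return numerator / denominator


persistent = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]   # runs of same-signed errors
alternating = [1.0, -1.0, 1.0, -1.0]             # errors that flip sign
```

Residuals that come in runs of the same sign, as in `persistent`, give a positive lag-1 autocorrelation and a Durbin-Watson value below 2.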

Finally, the regression coefficients may vary over the historical record.

Bottom Line

I like Rob Hyndman’s often drawn distinction between modeling and reality. Somewhere Hyndman suggests that no model is right.

But this class of models has an extremely logical motivation, and is, as I say, relatively predictive – predictive enough to be useful in a number of contexts.

Momentum traders for years apparently have looked at the opening price and compared it with the highs (and lows) for previous periods – extending 60 days or more into history – and decided whether to trade. If the opening price is greater than the past high, the next high is anticipated to be even higher. On this basis, stock may be purchased. That action tends to reinforce the relationship. So, in some sense, this is a self-fulfilling relationship.

To recapitulate – I can show you iron-clad, incontrovertible evidence that some fairly simple models built on daily trading data produce workable forecasts of the high and low for stock indexes and stocks. These forecasts are available for a variety of time periods, and, apparently, in backcasts can indicate turning points in the market.

As I say, feel free to request further documentation. I am preparing a write-up for a journal, and I think I can find a way to send out versions of this.

You can contact me confidentially via the Comments box below. Leave your email or phone number. Title the Comment “Request for High/Low Model Information” and the webmeister will forward it to me without having your request listed in the side panel of the blog.

Predicting the High and Low of SPY – and a Generalization

Well, here are some results on forecasting the daily low prices of the SPY exchange traded fund (ETF), complementing the previous post.

This line of inquiry has exploded into something much bigger, as I will relate shortly, but first ….

Predicting the Daily Low

This graph gives a flavor of the accuracy of a very simple bivariate regression, estimated on the daily percent changes in the lows for SPY.

DailyLowPredict

The blue line is the predicted percent change. And the orange line shows the actual percent changes of the daily lows for this period in early 2008.

These are out-of-sample results, in the sense the predicted percent changes in the lows are not included in the regression data used to develop the forecast model.

And considering we are predicting one component of volatility itself, the results are not bad.

For this analysis, I develop dynamic or adaptive regressions that start in August 2005 and run up to the present. The models predict the direction of change in the daily lows, on average, about 85 percent of the time over nearly 15 years.

The following chart shows 30 day rolling averages of the proportion of time the models predict the correct sign of the percent change for this period.

RatiosLow

This performance is produced by a simple bivariate regression of the daily percent change in the lows on the percent change in the previous low compared with the current daily opening price. So, of course, to get the explanatory variable you divide the previous trading day value for the low by the current day opening price and subtract 1 – and you can convert to percentages for purposes of display.

The equation is

PERCENT CHANGE IN CURRENT DAILY LOW = -0.00448 -0.951689(PERCENT CHANGE IN THE PREVIOUS DAILY LOW IN COMPARISON WITH THE CURRENT OPENING PRICE).

If the previous low is greater than the current opening price, the coefficient on this variable creates a negative value which, added to the negative constant of the regression, predicts that the daily low will drop.
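As a worked illustration, the equation can be evaluated directly. The post does not state the units of the regression, so this sketch assumes both sides are decimal fractions rather than percentages; the function name and the example prices are made up for the illustration.

```python
INTERCEPT = -0.00448   # constant from the equation in the text
SLOPE = -0.951689      # coefficient from the equation in the text


def predicted_low_change(prev_low, open_price):
    """Predicted change in the daily low, as a decimal fraction (assumed units).

    The explanatory variable is the previous low divided by the current
    opening price, minus 1.
    """
    x = prev_low / open_price - 1.0
    return INTERCEPT + SLOPE * x


# Hypothetical prices: the market opens below the previous day's low.
change = predicted_low_change(prev_low=200.0, open_price=198.0)
implied_low = 200.0 * (1.0 + change)

# Opening exactly at the previous low leaves only the constant term.
at_prev_low = predicted_low_change(prev_low=200.0, open_price=200.0)
```

In the first case the explanatory variable is positive, the slope term is negative, and the predicted low falls below the previous low, which matches the reading given above.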

If you have any role in instructing students, let me suggest this example. The data is readily accessible from Yahoo Finance (under SPY) and once you invert the calendar order of the data, the relevant percent changes are easy to compute, and then to plug into regressions with the Microsoft Excel Trend(.) function.

Now the amazing thing is that similar relationships operate over various time scales, both for predicting the low and the high in a group of trading days. I’m working up the post showing this right now.

There is, in other words, a remarkable thread running through daily, weekly, and monthly settings.

In closing here – a thought.

Often, when a predictive relationship relating to stock prices is put out there, you get the feeling the underlying regularities will evaporate, as traders jump on the opportunity.

But these predictive relationships for the high and low of the SPY may be examples of self-fulfilling prophecies.

In other words, if a trader learns that the daily, weekly, or monthly high or low is related to (a) the opening price, and (b) the high or low for the preceding period, whatever it may be, their actions could very well strengthen the relationship. So, predicting an increase in the daily high, a trader very well could go long, by buying the SPY at opening. The stock price should thereby go higher. Similarly, if a trader acts on information regarding predictions of a dropping low, they may short the SPY, which again could have the effect of causing the low to ratchet down further.

It would be fascinating if we could somehow establish that this is actually going on and sustaining this type of relationship.