Category Archives: stock market forecasts

More on the “Efficiency” of US Stock Markets – Evidence from 1871 to 2003

In a pivotal article, Andrew Lo writes,

Many of the examples that behavioralists cite as violations of rationality that are inconsistent with market efficiency – loss aversion, overconfidence, overreaction, mental accounting, and other behavioral biases – are, in fact, consistent with an evolutionary model of individuals adapting to a changing environment via simple heuristics.

He also supplies an intriguing graph of the rolling first order autocorrelation of monthly returns of the S&P Composite Index from January 1871 to April 2003.

[Figure: Lo's rolling first-order autocorrelation of monthly S&P Composite returns]

Lo notes the Random Walk Hypothesis implies that returns are serially uncorrelated, so the serial correlation coefficient ought to be zero – or at least, converging to zero over time as markets move into equilibrium.

However, the above chart shows this does not happen, although there are points in time when the first order serial correlation coefficient is small in magnitude, or even zero.

My point is that the first order serial correlation in daily returns for the S&P 500 is large enough for long enough periods to generate profits above a Buy-and-Hold strategy – that is, if one can negotiate the tricky milliseconds of trading at the end of each trading day.
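A minimal sketch of the rolling autocorrelation computation behind a chart like Lo's, in Python with pandas. The synthetic return series here is only a stand-in; substitute actual monthly S&P Composite returns to reproduce the real exercise.

```python
# Rolling first-order autocorrelation of returns, in the spirit of
# Lo's chart. Synthetic monthly returns stand in for the S&P Composite.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
returns = pd.Series(rng.normal(0.005, 0.04, 600))  # ~50 years, monthly

window = 60  # five-year rolling window
rolling_ac1 = returns.rolling(window).apply(
    lambda r: r.autocorr(lag=1), raw=False
)

# Under the Random Walk Hypothesis this series should hover near zero;
# persistent departures from zero are what Lo's chart displays.
print(rolling_ac1.dropna().abs().max())
```

With real data, sustained excursions of this rolling coefficient away from zero are exactly the feature discussed above.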

The King Has No Clothes or Why There Is High Frequency Trading (HFT)

I often present at confabs where there are engineers with management or executive portfolios. You start the slides but, beforehand, prepare for the tough questions. Make sure the numbers in the tables add up, and that round-off errors or simple typos do not creep in to mess things up.

To carry this on a bit, I recall a Hewlett Packard VP whose preoccupation during meetings was fiddling with their calculator – which dates the story a little. In any case, the only thing that really interested them was pointing out mistakes in the arithmetic. The idea, apparently, is that if you cannot do addition, why should anyone believe your more complex claims?

I’m bending this around to the theory of efficient markets and rational expectations, by the way.

And I’m playing the role of the engineer.

Rational Expectations

The theory of rational expectations dates at least to the work of Muth in the 1960’s, and is coupled with “efficient markets.”

Lim and Brooks explain market efficiency in – The Evolution of Stock Market Efficiency Over Time: A Survey of the Empirical Literature

The term ‘market efficiency’, formalized in the seminal review of Fama (1970), is generally referred to as the informational efficiency of financial markets which emphasizes the role of information in setting prices… More specifically, the efficient markets hypothesis (EMH) defines an efficient market as one in which new information is quickly and correctly reflected in its current security price… the weak-form version… asserts that security prices fully reflect all information contained in the past price history of the market.

Lim and Brooks focus, among other things, on statistical tests for random walks in financial time series, noting this type of research is giving way to approaches highlighting adaptive expectations.

Proof US Stock Markets Are Not Efficient (or Maybe That HFT Saves the Concept)

I like to read mathematically grounded research, so I have looked at a lot of the papers purporting to show that the hypothesis that stock prices are random walks cannot be rejected statistically.

But really there is a simple constructive proof that this literature is almost certainly wrong.

STEP 1: Grab the data. Download daily adjusted closing prices for the S&P 500 from some free site (e.g., Yahoo Finance). I did this again recently, collecting data back to 1990. Adjusted closing prices, of course, are based on closing prices for the trading day, adjusted for dividends and stock splits. Oh yeah, you may have to re-sort the data from oldest to newest, since a lot of sites present the newest data on top.
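For readers who would rather script STEP 1 than spreadsheet it, here is a pandas sketch. The embedded CSV is a stand-in for the downloaded file; the column names follow Yahoo Finance's usual layout, and the newest-first ordering mimics what many sites serve.

```python
# STEP 1 in pandas: read adjusted closing prices and re-sort them
# oldest to newest. The in-memory CSV stands in for a real download.
import io
import pandas as pd

csv = io.StringIO(
    "Date,Adj Close\n"
    "1990-01-04,0.92\n"   # newest-first, as many sites deliver it
    "1990-01-03,0.98\n"
    "1990-01-02,1.00\n"
)
prices = pd.read_csv(csv, parse_dates=["Date"])
prices = prices.sort_values("Date").reset_index(drop=True)
print(prices)
```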

Here’s a graph of the data, which should be very familiar by now.

[Figure: S&P 500 daily adjusted closing prices, 1990 to present]

STEP 2: Create the relevant data structure. In the same spreadsheet, compute the trading-day-over-trading-day growth in the adjusted closing price (ACP). Then, side-by-side with this growth rate of the ACP, create another series which, except for the first value, maps the growth in ACP for the previous trading day onto the growth of the ACP for any particular day. That gives you two columns of new data.
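The same data structure can be sketched in pandas, with toy prices standing in for the downloaded series:

```python
# STEP 2: trading-day-over-trading-day growth in the adjusted closing
# price (ACP), plus a column holding the previous day's growth.
import pandas as pd

acp = pd.Series([100.0, 101.0, 99.0, 100.5, 102.0])  # toy ACP values
growth = acp.pct_change()        # growth rate of the ACP
lagged_growth = growth.shift(1)  # previous trading day's growth

frame = pd.DataFrame(
    {"growth": growth, "lagged_growth": lagged_growth}
).dropna()
print(frame)
```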

STEP 3: Run adaptive regressions. Most spreadsheet programs include an ordinary least squares (OLS) regression routine; certainly, Excel does. In any case, you want to set up a regression to predict the growth in the ACP, based on a one-trading-day lag in the growth of the ACP.

I did this, initially, to predict the growth in ACP for January 3, 2000, based on data extending back to January 3, 1990 – a total of 2528 trading days. Then, I estimated regressions going down for later dates with the same size time window of 2528 trading days.

The resulting “predictions” for the growth in ACP are out-of-sample, in the sense that each prediction stands outside the sample of historic data used to develop the regression parameters used to forecast it.
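The rolling estimation can be sketched as follows. Here np.polyfit stands in for the spreadsheet's OLS routine, the growth series is synthetic, and the window is shortened from the 2,528 trading days used above.

```python
# STEP 3: re-estimate a one-lag OLS regression over a rolling window
# and form a strictly out-of-sample forecast of the next day's growth.
import numpy as np

rng = np.random.default_rng(1)
growth = rng.normal(0, 0.01, 300)  # stand-in for ACP growth rates
window = 250                       # the post uses 2,528 trading days

preds, actuals = [], []
for t in range(window, len(growth) - 1):
    x = growth[t - window:t]          # lagged growth within the window
    y = growth[t - window + 1:t + 1]  # next-day growth within the window
    slope, intercept = np.polyfit(x, y, 1)
    preds.append(intercept + slope * growth[t])  # forecast for day t+1
    actuals.append(growth[t + 1])

preds, actuals = np.array(preds), np.array(actuals)
hit_rate = np.mean(np.sign(preds) == np.sign(actuals))
print(round(hit_rate, 3))
```

With pure noise the hit rate hovers near 50 percent; the point of the exercise above is that real S&P 500 data delivers about 53 percent.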

It needs to be said that these predictions for the growth of the adjusted closing price (ACP) are marginal, correctly predicting the sign of the growth in ACP only about 53 percent of the time.

An interesting question, though, is whether these just barely predictive forecasts can be deployed in a successful trading model. Would a trading algorithm based on this autoregressive relationship beat the proverbial “buy-and-hold?”

So, for example, suppose we imagine that we can trade at the close of each trading day, at prices close enough to the actual closing prices.

Then, you get something like this, if you invest $100,000 at the beginning of 2000, and trade through last week. If the predicted growth in the ACP is positive, you buy at the previous day’s close. If not, you sell at the previous day’s close. For the Buy-and-Hold portfolio, you just invest the $100,000 on January 3, 2000, and travel to Tahiti for 15 years or so.
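A stripped-down sketch of that comparison, with synthetic returns and forecasts; going to cash (rather than short) when the forecast is negative is one reading of "sell at the previous day's close."

```python
# Toy backtest: hold the index when the predicted next-day growth is
# positive, hold cash otherwise, versus buy-and-hold.
import numpy as np

rng = np.random.default_rng(2)
daily_returns = rng.normal(0.0003, 0.012, 1000)        # stand-in for S&P 500
predicted = daily_returns + rng.normal(0, 0.03, 1000)  # noisy forecasts

capital = 100_000.0
in_market = predicted > 0  # buy at the prior close if forecast positive
strategy_growth = np.where(in_market, daily_returns, 0.0)

strategy_value = capital * np.prod(1 + strategy_growth)
buy_and_hold = capital * np.prod(1 + daily_returns)
print(round(strategy_value), round(buy_and_hold))
```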

[Figure: cumulative value of Buy-and-Hold versus the AR trading strategy, $100,000 base]

So, as should be no surprise, the Buy-and-Hold strategy results in replicating the S&P 500 Index on a $100,000 base.

The trading strategy based on the simple first order autoregressive model, on the other hand, achieves more than twice these cumulative earnings.

Now I suppose you could say that all this was an accident, or that it was purely a matter of chance, distributed over more than 3,810 trading days. But I doubt it. After all, this trading interval 2000-2015 includes the worst economic crisis since before World War II.

Or you might claim that the profits from the simple AR trading strategy would be eaten up by transactions fees and taxes. On this point, there were 1,774 trades, for an average of $163 per trade. So, worst case, if trading costs $10 a transaction, and there is a tax rate of 40 percent, that leaves $156K over these 14-15 years in terms of take-away profit, or about $10,000 a year.
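The arithmetic, made explicit (assuming, as the rounding suggests, that the 40 percent tax applies to gross trading profit and the per-trade costs are then subtracted):

```python
# Back-of-envelope check of the ~$156K figure quoted above.
trades = 1774
avg_profit_per_trade = 163.0   # dollars
cost_per_trade = 10.0          # worst-case transaction fee
tax_rate = 0.40

gross = trades * avg_profit_per_trade      # ≈ $289,162
after_tax = gross * (1 - tax_rate)         # ≈ $173,497
net = after_tax - trades * cost_per_trade  # ≈ $155,757, i.e. ~$156K
print(round(net))
```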

Where This May Go Wrong

This does sound like a paean to stock market investing – even “day-trading.”

What could go wrong?

Well, I assume here, of course, that exchange traded funds (ETF’s) tracking the S&P 500 can be bought and sold with the same tactics, as outlined here.

Beyond that, I don’t have access to the data currently (although I will soon), but I suspect high frequency trading (HFT) may stand in the way of realizing this marvelous investing strategy.

So remember you have to trade some small instant before market closing to implement this trading strategy. But that means you get into the turf of the high frequency traders. And, as previous posts here observe, all kinds of unusual things can happen in a blink of an eye, faster than any human response time.

So – a conjecture. I think that the choicest situations from the standpoint of this more or less macro interday perspective, may be precisely the places where you see huge spikes in the volume of HFT. This is a proposition that can be tested.

I also think something like this has to be appealed to in order to save the efficient markets hypothesis, or rational expectations. But in this case, it is not the rational expectations of human subjects, but the presumed rationality of algorithms and robots, as it were, which may be driving the market, when push comes to shove.


Scalability of the Pvar Stock Market Forecasting Approach

Ok, I am documenting and extending a method of forecasting stock market prices based on what I call Pvar models. Here Pvar stands for “proximity variable” – or, more specifically, variables based on the spread or difference between the opening price of a stock, ETF, or index, and the high or low of the previous period. These periods can be days, groups of days, weeks, months, and so forth.

I share features of these models and some representative output on this blog.

And, of course, I continue to have wider interests in forecasting controversies, issues, methods, as well as the global economy.

But for now, I’ve got hold of something, and since I appreciate your visits and comments, let’s talk about “scalability.”

Forecast Error and Data Frequency

Years ago, when I first heard of the M-competition (probably later than for some), I was intrigued by reports of how forecast error blows up “three or four periods in the forecast horizon,” almost no matter what the data frequency. So, if you develop a forecast model with monthly data, forecast error starts to explode three or four months into the forecast horizon. If you use quarterly data, you can push the error boundary out three or four quarters, and so forth.

I have not seen mention of this result so much recently, so my memory may be playing tricks.

But the basic concept seems sound. There is irreducible noise in data and in modeling. So whatever data frequency you are analyzing, it makes sense that forecast errors will start to balloon more or less at the same point in the forecast horizon – in terms of intervals of the data frequency you are analyzing.

Well, this concept seems emergent in forecasts of stock market prices, when I apply the analysis based on these proximity variables.

Prediction of Highs and Lows of Microsoft (MSFT) Stock at Different Data Frequencies

What I have discovered is that in order to predict over longer forecast horizons, when it comes to stock prices, it is necessary to look back over longer historical periods.

Here are some examples of scalability in forecasts of the high and low of MSFT.

Forecasting 20 trading days ahead, you get this type of chart for recent 20-day-periods.

[Figure: MSFT high/low forecasts, 20-trading-day periods]

One of the important things to note is that these are out-of-sample forecasts, and that, generally, they bracket the actual closing prices for these 20-trading-day periods.

Here is a comparable chart for 10 trading days.

[Figure: MSFT high/low forecasts, 10-trading-day periods]

Same data, forecasts also are out-of-sample, and, of course, there are more closing prices to chart, too.

Finally, here is a very busy chart with forecasts by trading day.

[Figure: MSFT high/low forecasts by trading day]

Now there are several key points to take away from these charts.

First, the predictions of MSFT high and low prices for these periods are developed by similar forecast models, at least with regard to the specification of explanatory variables. Also, the Pvar method works for specific stocks, as well as for stock market indexes and ETF’s that might track them.

However, and this is another key point, the definitions of these variables shift with the periods being considered.

So the high for MSFT by trading day is certainly different from the MSFT high over groups of 20 trading days, and so forth.

In any case, there is remarkable scalability with Pvar models, all of which suggests they capture some of the interplay between long and shorter term trading.

While I am handing out conjectures, here is another one.

I think it will be possible to conduct a “causal analysis” to show that the Pvar variables reflect or capture trader actions, and that these actions tend to drive the market.

Pvar Models for Forecasting Stock Prices

When I began this blog three years ago, I wanted to deepen my understanding of technique – especially stuff growing up alongside Big Data and machine learning.

I also was encouraged by Malcolm Gladwell’s 10,000 hour idea – finding it credible from past study of mathematical topics. So maybe my performance as a forecaster would improve by studying everything about the subject.

Little did I suspect I would myself stumble on a major forecasting discovery.

But, as I am wont to quote these days, even a blind pig uncovers a truffle from time to time.

Forecasting Stock Prices

My discovery pertains to forecasting stock prices.

Basically, I have stumbled on a method of developing much more accurate forecasts of high and low stock prices, given the opening price in a period. These periods can be days, groups of days, weeks, months, and, based on what I present here – quarters.

Additionally, I have discovered a way to translate these results into much more accurate forecasts of closing prices over long forecast horizons.

I would share the full details, except I need some official acknowledgement for my work (in process) and, of course, my procedures lead to profits, so I hope to recover some of what I have invested in this research.

Having struggled through a maze of ways of doing this, however, I feel comfortable sharing a key feature of my approach – which is that it is based on the spreads between opening prices and the high and low of previous periods. Hence, I call these “Pvar models” for proximity variable models.

There is really nothing in the literature like this, so far as I am able to determine – although the discussion of 52 week high investing captures some of the spirit.

S&P 500 Quarterly Forecasts

Let’s look at an example – forecasting quarterly closing prices for the S&P 500, shown in this chart.

[Figure: S&P 500 quarterly closing prices]

We are all familiar with this series. And I think most of us are worried that after the current runup, there may be another major correction.

In any case, this graph compares out-of-sample forecasts of ARIMA(1,1,0) and Pvar models. The ARIMA forecasts are estimated by the off-the-shelf automatic forecast program Forecast Pro. The Pvar models are estimated by ordinary least squares (OLS) regression, using Matlab and Excel spreadsheets.

[Figure: out-of-sample Pvar versus ARIMA(1,1,0) forecasts of the S&P 500]

The solid red line shows the movement of the S&P 500 from 2005 to just recently. Of course, the big dip in 2008 stands out.

The blue line charts out-of-sample forecasts of the Pvar model, which are, from visual inspection, clearly superior to the ARIMA forecasts, in orange.

And note the meaning of “out-of-sample” here. Parameters of the Pvar and ARIMA models are estimated over historic data which do not include the prices in the period being forecast. So the results are strictly comparable with applying these models today and checking their performance over the next three months.

The following bar chart shows the forecast errors of the Pvar and ARIMA forecasts.

[Figure: forecast errors of the Pvar and ARIMA models]

Thus, the Pvar model forecasts are not always more accurate than ARIMA forecasts, but clearly do significantly better at major turning points, like the 2008 recession.

The mean absolute percent errors (MAPE) for the two approaches are 7.6 and 10.2 percent, respectively.

This comparison is intriguing, since Forecast Pro automatically selected an ARIMA(1,1,0) model in each instance of its application to this series. That specification involves autoregressions on differences of the time series – to some extent challenging, right there, the received wisdom that stock prices are random walks. But Pvar poses an even more significant challenge to versions of the efficient market hypothesis, since Pvar models pull variables from the time series to predict the time series – something you are really not supposed to be able to do, if markets are, as it were, “efficient.” Furthermore, this price predictability is persistent, and not just a fluke of some special period of market history.

I will have further comments on the scalability of this approach soon. Stay tuned.

Forecasting Google’s Stock Price (GOOG) On 20-Trading-Day Horizons

Google’s stock price (GOOG) is relatively volatile, as the following chart shows.

[Figure: GOOG closing prices]

So it’s interesting that a stock market forecasting algorithm can produce the following 20 Trading-Day-Ahead forecasts for GOOG, for the recent period.

[Figure: 20-trading-day-ahead forecasts for GOOG]

The forecasts in the above chart, as are those mentioned subsequently, are out-of-sample predictions. That is, the parameters of the forecast model – which I call the PVar model – are estimated over one set of historic prices. Then, the forecasts from PVar are generated with values for the explanatory variables that are “outside” or not the same as this historic data.

How good are these forecasts and how are they developed?

Well, generally forecasting algorithms are compared with benchmarks, such as an autoregressive model or a “no-change” forecast.

So I constructed an autoregressive (AR) model for the Google closing prices, sampled at 20 day frequencies. This model has ten lagged versions of the closing price series, so I do not just rely here on first order autocorrelations.
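A sketch of such a benchmark in Python; the price series is synthetic, and np.linalg.lstsq stands in for whatever estimation routine is actually used.

```python
# Benchmark AR model: ten lags of closing prices sampled every 20
# trading days, estimated by OLS.
import numpy as np

rng = np.random.default_rng(3)
daily_close = 500 + np.cumsum(rng.normal(0.2, 5.0, 4000))  # stand-in prices
sampled = daily_close[::20]   # closing price every 20th trading day
p = 10                        # ten lagged values

# Row t of the design matrix holds sampled[t-1] ... sampled[t-p].
X = np.column_stack(
    [sampled[p - k - 1:len(sampled) - k - 1] for k in range(p)]
)
X = np.column_stack([np.ones(len(X)), X])  # intercept column
y = sampled[p:]

coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
last_fit = X[-1] @ coefs      # fitted value for the last sampled point
print(X.shape, round(last_fit, 2))
```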

Here is a comparison of the 20 trading-day-ahead predictions of this AR model, the above “proximity variable” (PVar) model which I take credit for, and the actual closing prices.

[Figure: AR versus PVar 20-trading-day-ahead predictions and actual GOOG closing prices]

As you can see, the AR model is worse in comparison to the PVar model, although they share some values at the end of the forecast series.

The mean absolute percent error (MAPE) of the AR model for a period more extended than shown in the graph is 7.0 percent, compared with 5.1 percent for PVar. This comparison is calculated over data from 4/20/2011.

So how do I do it?

Well, since these models show so much promise, it makes sense to keep working on them, making improvements. However, previous posts here give broad hints, indeed pretty well laying out the framework, at least on an introductory basis.

Essentially, I move from predicting highs and lows to predicting closing prices.

To predict highs and lows, my post “further research” states

Now, the predictive models for the daily high and low stock price are formulated, as before, keying off the opening price in each trading day. One of the key relationships is the proximity of the daily opening price to the previous period high. The other key relationship is the proximity of the daily opening price to the previous period low. Ordinary least squares (OLS) regression models can be developed which do a good job of predicting the direction of change of the daily high and low, based on knowledge of the opening price for the day.
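The actual Pvar specification is not published here, but the description just quoted suggests something like the following sketch. The response variable and the 0.8 coefficient are entirely hypothetical; the point is only to illustrate the regression structure, not the real model.

```python
# Hypothetical proximity-variable regression: change in the daily high
# explained by the spread between today's open and yesterday's high.
import numpy as np

rng = np.random.default_rng(4)
n = 500
prev_high = 100 + np.cumsum(rng.normal(0, 1, n))   # yesterday's high
open_px = prev_high + rng.normal(-0.5, 1.0, n)     # today's open
proximity = (open_px - prev_high) / prev_high      # spread, as a rate

# Toy response: growth of today's high, loosely tied to the proximity
# variable (the 0.8 coefficient is invented for illustration).
high_growth = 0.8 * proximity + rng.normal(0, 0.005, n)

slope, intercept = np.polyfit(proximity, high_growth, 1)
pred = intercept + slope * proximity
sign_hit = np.mean(np.sign(pred) == np.sign(high_growth))
print(round(slope, 2), round(sign_hit, 2))
```

On data built this way, an OLS fit recovers the proximity relationship and predicts the direction of change of the high well above chance, which is the kind of behavior the quoted passage describes.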

Other posts present actual regression models, although these are definitely prototypes, based on what I know now.

Why Does This Work?

I’ll bet this works because investors often follow simple rules such as “buy when the opening price is sufficiently greater than the previous period high” or “sell, if the opening price is sufficiently lower than the previous period low.”

I have assembled evidence, based on time variation in the predictive coefficients of the PVar variables, which I probably will put out here sometime.

But the point is that momentum trading is a major part of stock market activity, not only in the United States, but globally. There’s even research claiming to show that momentum traders do better than others, although that’s controversial.

This means that the daily price record for a stock, the opening, high, low, and closing prices, encode information that investors are likely to draw upon over different investing horizons.

I’m pleased these insights open up many researchable questions. I predict all this will lead to wholly new generations of models in stock market analysis. And my guess, and so far it is largely just that, is that these models may prove more durable than many insights into patterns of stock market prices – due to a sort of self-confirming aspect.

On Self-Fulfilling Prophecy

In their excellent “Forecasting Stock Returns” in the Handbook of Economic Forecasting, David Rapach and Guofu Zhou write,

While stock return forecasting is fascinating, it can also be frustrating. Stock returns inherently contain a sizable unpredictable component, so that the best forecasting models can explain only a relatively small part of stock returns. Furthermore, competition among traders implies that once successful forecasting models are discovered, they will be readily adopted by others; the widespread adoption of successful forecasting models can then cause stock prices to move in a manner that eliminates the models’ forecasting ability…

Almost an article of faith currently, this perspective seems to rule out other reactions to forecasts which have been important in economic affairs, namely the self-fulfilling prophecy.

Now, as “self-fulfilling prophecy” entered the lexicon, it meant a prediction which originally was in error, but which became true because people believed it and acted upon it.

Bank runs are the classic example.

The late Robert K. Merton wrote of the Last National Bank in his classic Social Theory and Social Structure, but there is no need for recourse to apocryphal history. Gary Richardson of the Federal Reserve Bank of Richmond has a nice writeup – Banking Panics of 1930 and 1931.

..Caldwell was a rapidly expanding conglomerate and the largest financial holding company in the South. It provided its clients with an array of services – banking, brokerage, insurance – through an expanding chain controlled by its parent corporation headquartered in Nashville, Tennessee. The parent got into trouble when its leaders invested too heavily in securities markets and lost substantial sums when stock prices declined. In order to cover their own losses, the leaders drained cash from the corporations that they controlled.

On November 7, one of Caldwell’s principal subsidiaries, the Bank of Tennessee (Nashville) closed its doors. On November 12 and 17, Caldwell affiliates in Knoxville, Tennessee, and Louisville, Kentucky, also failed. The failures of these institutions triggered a correspondent cascade that forced scores of commercial banks to suspend operations. In communities where these banks closed, depositors panicked and withdrew funds en masse from other banks. Panic spread from town to town. Within a few weeks, hundreds of banks suspended operations. About one-third of these organizations reopened within a few months, but the majority were liquidated (Richardson 2007).

Of course, most of us know but choose to forget these examples, for a variety of reasons – the creation of the Federal Deposit Insurance Corporation has removed most of the threat, that was a long time ago, and so forth.

So it was with interest that I discovered a recent paper by researchers at Cal Tech and UCLA’s Anderson School of Management – The Self Fulfilling Prophecy of Popular Asset Pricing Models. The authors explore the impact of delegating investment decisions to investment professionals who, by all evidence, apply discounted cash flow models that are disconnected from investors’ individual utility functions.

Despite its elegance, the consumption-based model has one glaring deficiency.

The standard model and its more conventional variants have failed miserably at explaining the cross-section of returns; even tortured versions of the standard model have struggled to match data.

The authors then propose a Gedanken experiment in which discounted cash flow models are used by the professional money managers to whom individuals delegate their investments.

The upshot –

Our thought experiment has an intriguing and heretofore unappreciated implication— there is a feedback relation between asset pricing models and the cross-section of expected returns. Our analysis implies that the cross-section of expected returns is not only described by theories of asset pricing, it is also determined by them.

I think Cornell and Hsu are on to something here.

More specifically, I have been trying to understand how to model a trading situation in which predictions of stock high and low prices in a period are self-confirming or self-fulfilling.

Suppose my prediction is that the daily high of Dazzle will be above yesterday’s daily high, if the opening price is above yesterday’s opening price. Then, if this persuades you to buy shares of Dazzle, it would seem that you contribute to the tendency for the stock price to increase. Furthermore, I don’t tell you exactly when the daily high will be reached, so I sort of put you in play. The onus is on you to make the right moves. The forecast does not come under suspicion.

As something of a data scientist, I think I can report that models of stock market trading at the level of agents participating in the market are not a major preoccupation of market analysts or theorists. The starting point seems to be Walras, and the problem is how to specify the price adjustment mechanism, since the tatonnement is obviously unrealistic.

That then brings us probably to experimental economics, which shares a lot of turf with what is called behavioral economics.

The other possibility is simply to observe stock market prices and show that, quite generally, this type of rule must be at play and, because it is not inherently given to be true, it furthermore must be creating the conditions of its own success, to an extent.

High Frequency Trading and the Efficient Market Hypothesis

Working on a white paper about my recent findings, I stumbled on more confirmation of the decoupling of predictability and profitability in the market – the culprit being high frequency trading (HFT).

It makes a good story.

So I was looking for high quality stock data when I came across the CalTech Quantitative Finance Group market data guide. They tout QuantQuote, which does look attractive, and which was cited as the data source for – How And Why Kraft Surged 29% In 19 Seconds – on Seeking Alpha.

In early October 2012 (10/3/2012), shares of Kraft Foods Group, Inc. surged to a high of $58.54 after opening at $45.36 – all in just 19.93 seconds. The Seeking Alpha post notes special circumstances, such as the spinoff of Kraft Foods Group, Inc. (KRFT) from Mondelez International, Inc., and the addition of KRFT to the S&P 500. Funds and ETF’s tracking the S&P 500 then needed to hold KRFT, boosting prospects for KRFT’s price.

For 17 seconds and 229 milliseconds after opening October 3, 2012, the following situation, shown in the QuantQuote table, unfolded.

[Figure: QuantQuote tick table for KRFT, first seconds of trading on October 3, 2012]

Times are given in milliseconds past midnight with the open at 34200000.

There is lots of information in this table – KRFT was not shortable (see the X in the Short Status column), and some trades were executed for dark pools of money, signified by the D in the Exch column.

In any case, things spin out of control a few milliseconds later, in ways and for reasons illustrated with further QuantQuote screen shots.

The moral –

So how do traders compete in a marketplace full of computers? The answer, ironically enough, is to not compete. Unless you are prepared to pay for a low latency feed and write software to react to market movements on the millisecond timescale, you simply will not win. As aptly shown by the QuantQuote tick data…, the required reaction time is on the order of 10 milliseconds. You could be the fastest human trader in the world chasing that spike, but 100% of the time, the computer will beat you to it.

CNN’s Watch high-speed trading in action is a good companion piece to the Seeking Alpha post.

HFT has grown by leaps and bounds, but estimates of its extent vary – partly because NASDAQ provides the only datasets to academic researchers that directly classify HFT activity in U.S. equities. Even these do not provide complete coverage, excluding firms that also act as brokers for customers.

Still, the Securities and Exchange Commission (SEC) 2014 Literature Review cites research showing that HFT accounted for about 70 percent of NASDAQ trades by dollar volume.

And associated with HFT are shorter holding times for stocks, now reputed to be as low as 22 seconds, although Barry Ritholtz contests this sort of estimate.

Felix Salmon provides a list of the “evils” of HFT, suggesting a small transactions tax might mitigate many of these.

But my basic point is that the efficient market hypothesis (EMH) has been warped by technology.

I am leaning to the view that the stock market is predictable in broad outline.

But this predictability does not guarantee profitability. It really depends on how you handle entering the market to take or close out a position.

As Michael Lewis shows in Flash Boys, HFT can trump traders’ ability to make a profit.

Stock Market Predictability

The research findings in recent posts here suggest that, in broad outline, the stock market is predictable.

This is one of the most intensively researched areas of financial econometrics.

There certainly is no shortage of studies claiming to forecast stock prices. See for example, Atsalakis, G., and K. Valavanis. “Surveying stock market forecasting techniques – Part I: Conventional methods.” Journal of Computational Optimization in Economics and Finance 2.1 (2010): 45-92.

But the field is dominated by decades-long controversy over the efficient market hypothesis (EMH).

I’ve been reading Lim and Brooks’ outstanding survey article – The Evolution of Stock Market Efficiency Over Time: A Survey of the Empirical Literature.

They highlight two types of studies focusing on the validity of a weak form of the EMH which asserts that security prices fully reflect all information contained in the past price history of the market…

The first strand of studies, which is the focus of our survey, tests the predictability of security returns on the basis of past price changes. More specifically, previous studies in this sub-category employ a wide array of statistical tests to detect different types of deviations from a random walk in financial time series, such as linear serial correlations, unit root, low-dimensional chaos, nonlinear serial dependence and long memory. The second group of studies examines the profitability of trading strategies based on past returns, such as technical trading rules (see the survey paper by Park and Irwin, 2007), momentum and contrarian strategies (see references cited in Chou et al., 2007).

Another line, related to this second branch of research, tests return predictability using other variables such as the dividend–price ratio, earnings–price ratio, book-to-market ratio and various measures of the interest rates.

Lim and Brooks note the tests for the semi-strong-form and strong-form EMH are renamed as event studies and tests for private information, respectively.

So, bottom line – maybe your forecasting model predicts stock prices or rates of return over certain periods, but the real issue is whether it makes money. As Granger wrote much earlier, mere forecastability is not enough.

I certainly respect this criterion, and recognize it is challenging. It may be possible to trade on the models of high and low stock prices over periods such as I have been discussing, but I can also show you situations in which the irreducibly stochastic elements in the predictions can lead to losses. And avoiding these losses puts you into the field of higher frequency trading, where “all bets are off,” since there is so much that is not known about how that really works, particularly for individual investors.

My primary purpose in pursuing these types of models, however, was originally not so much trading (although that is seductive), but exploring new ways of forecasting turning points in economic time series. Confronted with the dismal record of macroeconomic forecasters, for example, one can see that predicting turning points is a truly fundamental problem. And this is true, I hardly need to add, for practical business forecasts. Your sales may do well – and exponential smoothing models may suffice – until the next phase of the business cycle, and so forth.

So I am amazed by the robustness of the turning point predictions from the longer (30 trading days, 40 days, etc.) groupings.

I have just never myself developed – or probably even seen – an example of predicting turning points as clearly as the one I presented in the previous post relating to the Hong Kong Hang Seng Index.

[Chart: HSItp – turning point predictions for the Hang Seng Index high]

A Simple Example of Stock Market Predictability

Again, without claims as to whether it will help you make money, I want to close this post today with comments about another area of stock price predictability – perhaps even simpler and more basic than relationships regarding the high and low stock price over various periods.

This is an exercise you can try for yourself in a few minutes, and which leads to remarkable predictive relationships which I do not find easy to identify or track in the existing literature regarding stock market predictability.

First, download the Yahoo Finance historical data for SPY, the ETF mirroring the S&P 500. This gives you a spreadsheet with approximately 5530 trading day values for the open, high, low, close, volume, and adjusted close. Sort from oldest to most recent. Then calculate trading-day over trading-day growth rates, for the opening prices and then the closing prices. Then, set up a data structure associating the opening price growth for day t with the closing price growth for day t-1. In other words, lag the growth in the closing prices.

Then, calculate the OLS regression of the growth in opening prices on the lagged growth in closing prices.
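For readers who want to see the mechanics, here is a minimal sketch of this exercise in Python with NumPy – using synthetic prices rather than an actual SPY download, so the numbers are purely illustrative:

```python
import numpy as np

# Synthetic stand-ins for the SPY open and close columns (illustrative only)
rng = np.random.default_rng(0)
n = 500
close = 100 * np.cumprod(1 + 0.01 * rng.standard_normal(n))
open_ = np.empty(n)
open_[0] = 100.0
# let each open partially carry over the prior close, as real opens do
open_[1:] = close[:-1] * (1 + 0.002 * rng.standard_normal(n - 1))

# trading-day over trading-day growth rates
g_open = open_[1:] / open_[:-1] - 1.0
g_close = close[1:] / close[:-1] - 1.0

# associate the opening price growth for day t with the closing price
# growth for day t-1 (i.e., lag the growth in the closing prices)
y = g_open[1:]       # growth in opening prices, day t
x = g_close[:-1]     # lagged growth in closing prices, day t-1

# OLS regression y = a + b*x
b, a = np.polyfit(x, y, 1)
r2 = np.corrcoef(x, y)[0, 1] ** 2
# proportion of days on which the fitted equation gets the sign right
hit_rate = np.mean(np.sign(a + b * x) == np.sign(y))
```

With the actual Yahoo Finance data in place of the synthetic series, the slope, R-squared and sign hit rate computed this way correspond to the quantities discussed in the regression output and comments that follow.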

You should get something like,

[Figure: openoverlcose – Excel regression output for growth in opening prices on lagged growth in closing prices]

This is, of course, an Excel package regression output. It indicates that X Variable 1, which is the lagged growth in the closing prices, is highly significant as an explanatory variable, although the intercept or constant is not.

This equation explains about 21 percent of the variation in the growth data for the opening prices.

It also successfully predicts the direction of change of the opening price about 65 percent of the time, or considerably better than chance.

Not only that, but the two- and three-period growth in the closing prices are successful predictors of the two- and three-period growth in the opening prices.

And it probably is possible to improve the predictive performance of these equations by autocorrelation adjustments.

Comments

Why present the above example? Well, because I want to establish credibility on the point that there are clearly predictable aspects of stock prices, and ones you perhaps have not heard of heretofore.

The finance literature on stock market prediction and the properties of stock market returns, not to mention volatility, is among the most beautiful and complex technical literature I know of.

But, still, I think new and important relationships can be discovered.

Whether this leads to profit-making is another question. And really, the standards have risen significantly in recent times, with program and high frequency trading possibly snatching profit opportunities from traders at the last microsecond.

So I think the more important point, from a policy standpoint if nothing else, may be whether it is possible to predict turning points – to predict broader movements of stock prices within which high frequency trading may be pushing the boundary.

Analysis of Highs and Lows of the Hong Kong Hang Seng Index, 1987 to the Present

I have discovered a fundamental feature of stock market prices, relating to prediction of the highs and lows in daily, weekly, monthly, and to other more arbitrary groupings of trading days in consecutive blocks.

What I have found is a degree of predictability previously unimagined with respect to forecasts of the high and low for a range of trading periods, extending from daily to 60 days so far.

Currently, I am writing up this research for journal submission, but I am documenting essential features of my findings on this blog.

A few days ago, I posted about the predictability of daily highs and lows for the SPY exchange traded fund. Subsequent posts highlight the generality of the result for the SPY, and more recently, for stocks such as common stock of the Ford Motor Company.

These posts present various graphs illustrating how well the prediction models for the high and low in periods capture the direction of change of the actual highs and lows. Generally, the models are right about 70 to 80 percent of the time, which is incredible.

Furthermore, since one of my long-standing concerns has been to get better forward perspective on turning points, I am particularly interested in the evidence that these models also do fairly well at predicting turning points.

Finally, it is easy to show that these predictive models for the highs and lows of stocks and stock indices over various periods are not simply creations of modern program trading. The same regularities can be identified in earlier periods before easy access to computational power – in the 1980’s and early 1990’s, for example.

Hong Kong’s Hang Seng Index

Today, I want to reach out to international data and present findings for Hong Kong’s Hang Seng Index. I suspect Chinese investors will be interested in these results. Perhaps releasing this information to such an active community of traders will test my hypothesis that these are, to a degree, self-fulfilling predictions – that knowledge of their existence intensifies their predictive power.

A few facts about the Hang Seng Index – The Hang Seng Index (HSI) is a free-float adjusted, capitalization-weighted index of approximately 40 of the larger companies on the Hong Kong exchange. First published in 1969, the HSI, according to Investopedia, covers approximately 65% of the total market capitalization of the Hong Kong Stock Exchange. It is currently maintained by HSI Services Limited, a wholly owned subsidiary of Hang Seng Bank – the largest bank registered and listed in Hong Kong in terms of market capitalization.

For data, I download daily open, high, low, close and other metrics from Yahoo Finance. This data begins with the last day in 1986, continuing to the present.

The Hang Seng is a volatile index, as the following chart illustrates.

[Chart: HSI – Hang Seng Index daily price history]

Now there are peculiarities in the data on the HSI from Yahoo. Trading volumes are zero until 2001, for example, after which large positive values appear in the volume column. I assume the HSI was initially a pure index and only later came to be actually traded in some fashion.

Nevertheless, the same type of predictive models can be developed for the Hang Seng Index, as can be estimated for the SPY and the US stocks.

Again, the key variables in these predictive relationships are the proximity of the period opening price to the previous period high and the previous period low. I estimate regressions with variables constructed from these explanatory variables, mapping them onto the growth in period-by-period highs with ordinary least squares (OLS). I find similar relationships for the Hang Seng in, say, a 30 day periodization as I estimate for the SPY ETF. At the same time, there are differences, one of the most notable being the significantly lower first order autocorrelation in the Hang Seng regression.

Essentially, higher growth rates for the period-over-previous-period high are predicted whenever the opening price of the current period is greater than the high of the previous period. There are other cases, however, and ultimately the rule is quantitative, taking into account the size of the growth rates for the high as well as these inequality relationships.
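A hedged sketch of how such a regression might be set up follows – the variable construction here is my illustrative reading of the quantitative rule just described, run on synthetic period data rather than actual Hang Seng prices:

```python
import numpy as np

# Synthetic period-level open/high/low series standing in for 30 day
# groupings of Hang Seng trading data (illustrative only)
rng = np.random.default_rng(1)
n_periods = 200
base = 100 * np.cumprod(1 + 0.02 * rng.standard_normal(n_periods))
high = base * (1 + np.abs(0.01 * rng.standard_normal(n_periods)))
low = base * (1 - np.abs(0.01 * rng.standard_normal(n_periods)))
open_ = low + rng.random(n_periods) * (high - low)

# explanatory variables: proximity of this period's open to the
# previous period's high and low
x_high = open_[1:] / high[:-1] - 1.0
x_low = open_[1:] / low[:-1] - 1.0
# target: period-over-period growth in the high
y = high[1:] / high[:-1] - 1.0

# OLS with an intercept, estimated by least squares
X = np.column_stack([np.ones_like(x_high), x_high, x_low])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = X @ coef
# proportion of periods in which the fitted model gets the sign right
hit_rate = np.mean(np.sign(pred) == np.sign(y))
```

The actual rule is quantitative in the sense that the fitted coefficients, not just the sign of the open-versus-previous-high comparison, drive the prediction.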

Findings

Here is another one of those charts showing the “hit rate” for predictions of the direction of change of period-by-period growth rates for the high. In this case, the chart refers to daily trading data. The chart graphs 30 day moving averages of the proportion of the time the predictive model forecasts the correct sign of the change, or growth, in the target or dependent variable – the growth rate of daily highs (for consecutive trading days). Note that in recent years the “hit rate” of the predictive model approaches 90 percent, and these are all out-of-sample predictions.

[Chart: HSIproportions – 30 day moving average of the proportion of correct direction-of-change predictions]

The relationship for the Hang Seng Index, thus, is powerful. Similarly impressive relationships can be derived to predict the daily lows and their direction of change.
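The moving-average hit rate graphed in charts like the one above can be computed along these lines – with simulated hit/miss outcomes standing in for the model’s actual forecast record:

```python
import numpy as np

# 1 where the model predicted the correct sign of the daily growth in the
# high, 0 otherwise; simulated here rather than taken from an actual model
rng = np.random.default_rng(3)
hits = (rng.random(300) < 0.8).astype(float)

# 30 day moving average of the hit proportion, via a uniform filter
window = 30
ma = np.convolve(hits, np.ones(window) / window, mode="valid")
```

Each point of `ma` is then the fraction of correct sign forecasts over the preceding 30 trading days, which is what the charts plot.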

But the result I really like with this data is developed by grouping the daily trading data into 30 day intervals.

[Chart: HSItp – 30 day model predictions versus actual Hang Seng highs]

If you do this, you develop a tool which apparently is quite capable of predicting turning points in the Hang Seng.

Thus, between April 2005 and August 2012, a 30-day predictive model captures many of the key features of inflection and turning in the Hang Seng High for comparable periods.

Note that the predictive model makes these forecasts of the high for a period out-of-sample. All the relationships are estimated over historical data which do not include the high (or low) being predicted for the coming 30 day period. Only the opening price for the Hang Seng for that period is necessary.

Concluding Thoughts

I do not present the regression results here, but am pleased to share further information with readers who respond in the Comments section of this blog (title “Request for High/Low Model Information”) or who send requests to the following mail address: Clive Jones, PO Box 1009, Boulder, CO 80306 USA.

Top image from Ancient Chinese Fashion

Further Research into Predicting Daily and Other Period High and Low Stock Prices

The Internet is an amazing scientific tool. Communication of results is much faster, although, of course, the Web potentially carries dreck and misinformation. At the same time, pressures within the academy and Big Science seem to translate into a shocking amount of bogus research being touted. So maybe this free-for-all on the Web is where it’s at, if you are trying to get up to speed on new findings.

So this post today seeks to nail down some further and key points about predicting the high and low of stocks over various periods – conventionally, daily, weekly, and monthly periods, but also, as I have discovered, highs and lows over consecutive blocks of trading days ranging from 1 to 60 days, and probably more.

My recent posts focus on the SPY exchange traded fund, which tracks the S&P 500.

Yesterday, I formulated my general findings as follows:

For every period from daily periods to 60 day periods I have investigated, the high and low prices are “relatively” predictable and the direction of change from period to period is predictable, in backcasting analysis, about 70-80 percent of the time, on average.

In this post, let me show you the same basic relationship for a common stock – Ford Motor stock (F). I also consider data from the 1970’s, as well as recent data, to underline that modern program or high-speed computer-based algorithms have nothing to do with the underlying pattern.

I also show that the predictive model for the high in a period successfully captures turning points in the stock price in the 1970’s and more recently for 2008-2009.

Approach

Yahoo Finance, my free source of daily trading data, has history for Ford Motor stock dating back to June 1, 1972, charted as follows.

[Chart: Ford – Ford Motor stock daily price history since June 1972]

Now, the predictive models for the daily high and low stock price are formulated, as before, keying off the opening price in each trading day. One of the key relationships is the proximity of the daily opening price to the previous period high. The other key relationship is the proximity of the daily opening price to the previous period low. Ordinary least squares (OLS) regression models can be developed which do a good job of predicting the direction of change of the daily high and low, based on knowledge of the opening price for the day.

Predicting the Direction of Change of the High

As before, these models make correct predictions regarding the directions of change of the high and low about 70 percent of the time.

Here are 30 period moving averages for the 1970’s, showing the proportions of time the predictive model for the daily high is right about the direction of change.

[Chart: MAFord – 30 period moving average of correct direction-of-change predictions, 1970’s]

So the underlying relationship definitely holds in an age when computer modeling of trading was in its infancy.

Here is a similar chart for the first decade of this century.

[Chart: MAFordrecent – 30 period moving average of correct direction-of-change predictions, first decade of this century]

So whether we are considering the 1970’s or the last ten years, these predictive models do well in forecasting the direction of change of the high in daily (and it turns out other) periods.

Predicting Turning Points

We can make the same type of comparison – between the 1970’s and more recent years – for the capability of the predictive models to forecast turning points in the stock high (or low).

To do this usually requires aggregating the stock data. In the charts below, I aggregate to 7 trading day periods – not quite the same as weekly periods, since weekly segmentation can be short a day and so forth.

So the high which the predictive model focuses on is the high for the coming seven trading days, given the current day opening price.
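The aggregation into consecutive 7 trading day blocks can be sketched as follows – again with synthetic prices standing in for the actual Ford data:

```python
import numpy as np

# Synthetic daily bars standing in for the Ford Motor download (illustrative)
rng = np.random.default_rng(2)
n_days = 700
close = 100 * np.cumprod(1 + 0.01 * rng.standard_normal(n_days))
open_ = close * (1 + 0.001 * rng.standard_normal(n_days))
high = np.maximum(open_, close) * 1.005
low = np.minimum(open_, close) * 0.995

# group into consecutive blocks of 7 trading days (not calendar weeks)
block = 7
n_blocks = n_days // block
shape = (n_blocks, block)
blk_open = open_[: n_blocks * block].reshape(shape)[:, 0]   # first day's open
blk_high = high[: n_blocks * block].reshape(shape).max(axis=1)
blk_low = low[: n_blocks * block].reshape(shape).min(axis=1)

# forecast target: block-over-block growth in the high; blk_open is the
# model's key input, being known at the start of each coming block
g_high = blk_high[1:] / blk_high[:-1] - 1.0
```

The block opening price is available at the start of each 7 day span, which is why the predictive setup needs only it, plus the previous block’s high and low, to forecast the coming block’s high.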

Here are two charts, one for dates in the 1970’s and the other for a period in the recession of 2008-2009. For each chart I estimate OLS regressions with data predating each forecast of the high, based on blocks of 7 trading days.

[Chart: 70’sTP – predicted versus actual 7 trading day highs for Ford, 1970’s]

These predictions of the high crisply capture most of the important turning and inflection point features.

[Chart: recentTP – predicted versus actual 7 trading day highs for Ford, 2008-2009]

The application of similar predictive models for the 2008-2009 period is a little choppier, but does nail many of the important swings in the direction of change of the high of Ford Motor stock.

Concluding Thoughts

Well, this relationship between the opening prices and previous period highs and lows is highly predictive of the direction of change of the highs and lows in the current period – which can be a span of time from a day to 60 days in my findings.

These predictive models work for the S&P 500 and for individual stocks, like Ford Motor (and I might add Exxon and Microsoft).

They work in recent time periods and way back in the 1970’s.

And there’s more – for example, one could argue these patterns in the high and low prices are fractal, in the sense they represent “self similarity” at all (really many or a range of) time scales.

This is literally a new and fundamental regularity in stock prices.

Why does this work?

Well, the predictive models are closely related to very simple momentum trading strategies. But I think there is a lot of research to be done here. If you want further detail on any of this, please put your request in the Comments with the heading “Request for High/Low Model Information.”

Top picture from Strategic Monk.