Category Archives: stock trading algorithms

Back to the Drawing Board

Well, not exactly, since I never left it.

But the US and other markets opened higher today, after round-the-clock negotiations on the Greek debt.

I notice that Jeff Miller of Dash of Insight frequently writes stuff like, “We would all like to know the direction of the market in advance. Good luck with that! Second best is planning what to look for and how to react.”

Running the EVPA with this morning’s pop up in the opening price of the SPY, I get a predicted high for the day of 210.17 and a predicted low of 207.5. The predicted low for the day will be spot-on, if the current actual low for the trading range holds.

I can think of any number of arguments to the effect that the stock market is basically not predictable, because unanticipated events constantly make an impact on prices. I think it would even be possible to invoke Goedel’s Theorem – you know, the one that uses meta-mathematics to show that every consistent axiomatic system rich enough to express basic arithmetic is essentially incomplete. There are always new truths.

On the other hand, backtesting the EVPA – extreme value prediction algorithm – is opening up new vistas. I’m appreciative of helpful comments from, and discussions with, professionals in the finance and stock market investing field.

I strain every resource to develop backtests which are out-of-sample (OOS), and recently have found a way to predict closing prices with resources from the EVPA.

[Chart: monthly ROI for the SPY, actual (gold) versus OOS EVPA prediction (blue)]

Great chart. The wider gold lines are the actual monthly ROI for the SPY, based on monthly closing prices. The blue line shows the OOS prediction of these closing prices, based on EVPA metrics. As you can see, the blue line predictions flat out miss or under-predict some developments in the closing prices. At the same time, in other cases, the EVPA predictions show uncanny accuracy, particularly in some of the big dips down.

Recognize this is something new. Rather than, say, predicting developments likely over a range of trading days, such as the high and low of a month, the chart above shows predictions for stock prices at specific times: at the closing bell of the market on the last trading day of each month.

I calculate the OOS R² at 0.63 for the above series, which I understand is better than can be achieved with an autoregressive model for the closing prices and associated ROIs.
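As a rough sketch of how an out-of-sample R² of this kind might be computed: one common definition (an assumption here, since the post does not say which definition was used) compares the squared forecast errors with those of a benchmark that always predicts the mean of the actual series. The function name and the toy numbers are purely illustrative and are not EVPA output.

```python
def oos_r2(actual, predicted):
    """Out-of-sample R^2: 1 - SSE(model) / SSE(mean benchmark)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("series must be non-empty and the same length")
    mean_actual = sum(actual) / len(actual)
    sse_model = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sse_bench = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - sse_model / sse_bench


# Toy monthly ROI figures, made up for illustration only.
actual = [0.02, -0.03, 0.01, 0.04, -0.01]
predicted = [0.015, -0.02, 0.0, 0.03, -0.005]
print(round(oos_r2(actual, predicted), 3))
```

A value of 1 means perfect forecasts, 0 means no better than the mean benchmark, and negative values mean worse than the benchmark.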

I’ve also developed spreadsheets showing profits, after broker fees and short term capital gains taxes, from trading based on forecasts of the EVPA.

But, in addition to guidance for my personal trading, I’m interested in following out the implications of how much the historic prices predict about the prices to come.

How Did This Week’s Forecasts of QQQ, SPY, GE, and MSFT High Prices Do?

The following Table provides an update for this week’s forecasts of weekly highs for the securities currently being followed – QQQ, SPY, GE, and MSFT. Price forecasts and actual numbers are in US dollars.

[Table: forecast and actual weekly highs for QQQ, SPY, GE, and MSFT]

This batch of forecasts performed extremely well in terms of absolute size of forecast errors. In addition, it beat a “no change” forecast in three out of four predictions (the exception being SPY) and correctly called the change in direction of the high for QQQ.

It would be nice to be able to forecast the high prices for five-day-forward periods with the accuracy seen in the Microsoft (MSFT) forecast.

As all you market mavens know, US stock markets experienced a lot of declines in prices this week, so the highs for the week occurred Monday.

I’ve had several questions about the future direction of the market. Are declines going to be in the picture for the coming week, and even longer, for example?

I’ve been studying the capabilities of these algorithms to predict turning points in indexes and prices of individual securities. The answer is going to be probabilistic, and so is complicated. Sometimes the algorithm seems to provide pretty unambiguous signals as to turning points. In other instances, the tea leaves are harder to read, but, arguably, a signal does exist for most major turning points with the indexes I have focused on – SPY, QQQ, and the S&P 500.

So, the next question is – has the market hit a high for a week or a few weeks, or even perhaps a major turnaround?

Deploying these algorithms, coded in Visual Basic and C#, to attack this question is a little like moving a siege engine to the castle wall. A major undertaking.

I want to get there, but don’t want to be a “Chicken Little” saying “the sky is falling,” “the sky is falling.”

Stock Market Predictability

This little Monday morning exercise, which will be continued for the next several weeks, is providing evidence for the predictability of aspects of stock prices on a short term basis.

Once the basic facts are out there for everyone to see, a lot of questions arise. So what about new information? Surely yesterday’s open, high, low, and closing prices, along with similar information for previous days, do not encode an event like 9/11, or the revelation of massive accounting fraud with a stock issuing concern.

But apart from such surprises, I’m leaning to the notion that a lot more information about the general economy, company prospects and performance, and so forth is subtly embedded in the flow of price data.

I talked recently with an analyst who is applying methods from Kelly and Pruitt’s Market Expectations in the Cross Section of Present Values for wealth management clients. I hope to soon provide an “in-depth” on this type of applied stock market forecasting model, which focuses, incidentally, on stock market returns and dividends.

There is also some compelling research on the performance of momentum trading strategies which seems to indicate a higher level of predictability in stock prices than is commonly thought to exist.

Incidentally, in posting this slightly before the bell today, Friday, I am engaging in intra-day forecasting – betting that prices for these securities will stay below their earlier highs.

Portfolio Analysis

Greetings again. Took a deep dive into portfolio analysis for a colleague.

Portfolio analysis, of course, has been deeply influenced by Modern Portfolio Theory (MPT) and the work of Harry Markowitz and Robert Merton, to name a couple of the giants in this field.

Conventionally, investment risk is associated with the standard deviation of returns. So one might visualize the dispersion of actual returns for investments around expected returns, as in the following chart.

[Chart: return distributions for two investments with the same expected return but different standard deviations]

Here, two investments have the same expected rate of return, but different standard deviations. Viewed in isolation, the green curve indicates the safer investment.

More directly relevant for portfolios are curves depicting the distribution of typical returns for stocks and bonds, which can be portrayed as follows.

[Chart: distributions of typical returns for stocks and bonds]

Now the classic portfolio comprises 60 percent stocks and 40 percent bonds.

Where would its expected return be? Well, the expected value of a sum of random variables is the sum of their expected values. There is an algebra of expectations to express this around the operator E(.). So we have E(.6S+.4B)=.6E(S)+.4E(B), since a constant multiplied into a random variable just scales the expectation by that factor. Here, of course, S stands for “stocks” and B “bonds.”

Thus, as long as stocks have the higher expected return, the expected return for the classic 60/40 portfolio is less than the returns that could be expected from stocks alone.

But the benefit here is that the risks have been reduced, too.

Thus, the variance of the 60/40 portfolio usually is less than the variance of a portfolio composed strictly of stocks.

This holds, in particular, when the correlation or covariation of stocks and bonds is negative, as it has been in many periods over the last century. For example, high interest rates mean slow to negative economic growth, but can be associated with high returns on bonds.

Analytically, this is because the variance of the sum of two random variables is the sum of their variances, plus their covariance multiplied by a factor of 2.

Thus, algebra and probability facts underpin arguments for investment diversification. Pick investments which are not perfectly correlated in their reaction to events, and your chances of avoiding poor returns and disastrous losses can be improved.
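To make the algebra concrete, here is a minimal sketch of the 60/40 calculation. With weights in the picture, the variance rule reads Var(.6S+.4B) = .36Var(S) + .16Var(B) + 2(.6)(.4)Cov(S,B). The return, volatility, and correlation figures below are hypothetical inputs chosen only for illustration, not estimates from any data.

```python
import math


def portfolio_stats(w_s, w_b, mu_s, mu_b, sd_s, sd_b, corr):
    """Expected return and standard deviation of a two-asset portfolio."""
    cov = corr * sd_s * sd_b                      # covariance from correlation
    expected = w_s * mu_s + w_b * mu_b            # E(wS + vB) = wE(S) + vE(B)
    variance = (w_s ** 2 * sd_s ** 2
                + w_b ** 2 * sd_b ** 2
                + 2 * w_s * w_b * cov)            # weighted variance rule
    return expected, math.sqrt(variance)


# Hypothetical inputs: stocks 8% expected return with 18% std dev,
# bonds 3% with 6% std dev, and a correlation of -0.2.
exp_ret, sd = portfolio_stats(0.6, 0.4, 0.08, 0.03, 0.18, 0.06, -0.2)
print(f"60/40 expected return {exp_ret:.4f}, std dev {sd:.4f}")
```

With these made-up inputs the portfolio gives up some expected return relative to stocks alone, but its standard deviation falls well below the 18 percent assumed for stocks.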

Implementing MPT

When there are more than two assets, you need computational help to implement MPT portfolio allocations.

For a general discussion of developing optimal portfolios and the efficient frontier see http://faculty.washington.edu/ezivot/econ424/portfoliotheorymatrixslides.pdf

There are associated R programs and a guide to using Excel’s Solver with this University of Washington course.

Also see Package ‘Portfolio’.

These programs help you identify the minimum variance portfolio, based on a group of assets and histories of their returns. Also, it is possible to find the minimum variance combination from a designated group of assets which meets a target rate of return, if, in fact, that is feasible with the assets in question. You also can trace out the efficient frontier – combinations of assets mapped in a space of returns and variances. These combinations in each case have expected returns on the curve and are minimum variance compared with all other combinations that generate that rate of return (from your designated group of assets).

One of the governing ideas is that this efficient frontier is something an individual investor might travel along over time – going from higher risk portfolios when they are younger to more secure, lower risk portfolios as they age.
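For a flavor of what such programs compute, here is a minimal sketch of the global minimum variance portfolio using the textbook closed form, in which the weights are proportional to the inverse covariance matrix times a vector of ones. This is not the code from the University of Washington course or the R package; it assumes short sales are allowed, and the covariance matrix is hypothetical.

```python
import numpy as np


def min_variance_weights(cov):
    """Weights of the global minimum variance portfolio (short sales allowed)."""
    cov = np.asarray(cov, dtype=float)
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)   # proportional to inverse(cov) @ ones
    return raw / raw.sum()             # normalize so the weights sum to one


# Hypothetical covariance matrix of returns for three assets.
cov = np.array([[0.040, 0.006, 0.002],
                [0.006, 0.010, 0.001],
                [0.002, 0.001, 0.003]])
w = min_variance_weights(cov)
print("weights:", w)
print("portfolio variance:", w @ cov @ w)
```

Adding a target-return constraint or a no-short-sales restriction turns this into a quadratic programming problem, which is where tools like Excel’s Solver or an R optimizer come in.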

Issues

As someone who believes you don’t really know something until you can compute it, it interests me that there are computational issues with implementing MPT.

I find, for example, that the allocations are quite sensitive to small changes in expected returns, variances, and the underlying covariances.

One of the more intelligent, recent discussions with suggested “fixes” can be found in An Improved Estimation to Make Markowitz’s Portfolio Optimization Theory Users Friendly and Estimation Accurate with Application on the US Stock Market Investment.

The more fundamental issue, however, is that MPT appears to assume that stock returns are normally distributed, when everyone after Mandelbrot should know this is hardly the case.

Again, there is a vast literature, but a useful approach seems to be outlined in Modelling in the spirit of Markowitz portfolio theory in a non-Gaussian world. These authors use MPT algorithms as the start of a search for portfolios which minimize value-at-risk, instead of variances.

Finally, if you want to cool off and still stay on point, check out the 2014 Annual Report of Berkshire Hathaway, and, especially, the Chairman’s Letter. That’s Warren Buffett who has truly mastered an old American form which I believe used to be called “cracker barrel philosophy.” Good stuff.

The King Has No Clothes or Why There Is High Frequency Trading (HFT)

I often present at confabs where there are engineers with management or executive portfolios. You start the slides, but, beforehand, prepare for the tough question. Make sure the numbers in the tables add up and that round-off errors or simple typos do not creep in to mess things up.

To carry this on a bit, I recall a Hewlett Packard VP whose preoccupation during meetings was to fiddle with their calculator – which dates the story a little. In any case, the only thing that really interested them was to point out mistakes in the arithmetic. The idea is apparently that if you cannot do addition, why should anyone believe your more complex claims?

I’m bending this around to the theory of efficient markets and rational expectations, by the way.

And I’m playing the role of the engineer.

Rational Expectations

The theory of rational expectations dates at least to the work of Muth in the 1960’s, and is coupled with “efficient markets.”

Lim and Brooks explain market efficiency in – The Evolution of Stock Market Efficiency Over Time: A Survey of the Empirical Literature

“The term ‘market efficiency’, formalized in the seminal review of Fama (1970), is generally referred to as the informational efficiency of financial markets which emphasizes the role of information in setting prices… More specifically, the efficient markets hypothesis (EMH) defines an efficient market as one in which new information is quickly and correctly reflected in its current security price… the weak-form version… asserts that security prices fully reflect all information contained in the past price history of the market.”

Lim and Brooks focus, among other things, on statistical tests for random walks in financial time series, noting this type of research is giving way to approaches highlighting adaptive expectations.

Proof US Stock Markets Are Not Efficient (or Maybe That HFT Saves the Concept)

I like to read mathematically grounded research, so I have looked at a lot of the papers purporting to show that the hypothesis that stock prices are random walks cannot be rejected statistically.

But really there is a simple constructive proof that this literature is almost certainly wrong.

STEP 1: Grab the data. Download daily adjusted closing prices for the S&P 500 from some free site (e.g., Yahoo Finance). I did this again recently, collecting data back to 1990. Adjusted closing prices, of course, are based on closing prices for the trading day, adjusted for dividends and stock splits. Oh yeah, you may have to re-sort the data from oldest to newest, since a lot of sites present the newest data on top, originally.

Here’s a graph of the data, which should be very familiar by now.

[Chart: daily adjusted closing prices for the S&P 500 since 1990]

STEP 2: Create the relevant data structure. In the same spreadsheet, compute the trading-day-over-trading-day growth in the adjusted closing price (ACP). Then, side-by-side with this growth rate of the ACP, create another series which, except for the first value, maps the growth in ACP for the previous trading day onto the growth of the ACP for any particular day. That gives you two columns of new data.

STEP 3: Run adaptive regressions. Most spreadsheet programs include an ordinary least squares (OLS) regression routine. Certainly, Excel does. In any case, you want to set up a regression to predict the growth in the ACP, based on a one-trading-day lag in the growth of the ACP.

I did this, initially, to predict the growth in ACP for January 3, 2000, based on data extending back to January 3, 1990 – a total of 2528 trading days. Then, I estimated regressions going down for later dates with the same size time window of 2528 trading days.

The resulting “predictions” for the growth in ACP are out-of-sample, in the sense that each prediction stands outside the sample of historic data used to develop the regression parameters used to forecast it.
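Steps 1 through 3 were done in a spreadsheet, but the same rolling procedure can be sketched in a few lines of code. The version below is an illustration under stated assumptions, not the original workbook: the function names are made up, the prices are toy numbers, and the window is tiny here, where the post uses 2528 trading days.

```python
def growth_rates(prices):
    """Day-over-day growth of a price series sorted oldest first."""
    return [prices[i] / prices[i - 1] - 1.0 for i in range(1, len(prices))]


def ols_fit(x, y):
    """Intercept and slope of an OLS regression with one regressor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def rolling_ar1_forecasts(growth, window):
    """One-step-ahead forecasts of growth[t], each fitted only on the
    `window` (lagged growth, growth) pairs that precede day t."""
    forecasts = {}
    for t in range(window + 1, len(growth)):
        y = growth[t - window:t]              # growth on days t-window .. t-1
        x = growth[t - window - 1:t - 1]      # same series lagged one day
        intercept, slope = ols_fit(x, y)
        forecasts[t] = intercept + slope * growth[t - 1]
    return forecasts


# Toy prices for illustration only.
prices = [100, 101, 100.5, 102, 101, 103, 104, 103.5, 105]
g = growth_rates(prices)
for t, f in rolling_ar1_forecasts(g, window=3).items():
    print(t, round(f, 5), round(g[t], 5))
```

Because the regression for day t never sees growth[t], each forecast is out-of-sample in the sense described above.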

It needs to be said that these predictions for the growth of the adjusted closing price (ACP) are marginal, correctly predicting the sign of the growth in the ACP only about 53 percent of the time.

An interesting question, though, is whether these just barely predictive forecasts can be deployed in a successful trading model. Would a trading algorithm based on this autoregressive relationship beat the proverbial “buy-and-hold?”

So, for example, suppose we imagine that we can trade at closing each trading day, close enough to the actual closing prices.

Then, you get something like this, if you invest $100,000 at the beginning of 2000, and trade through last week. If the predicted growth in the ACP is positive, you buy at the previous day’s close. If not, you sell at the previous day’s close. For the Buy-and-Hold portfolio, you just invest the $100,000 January 3, 2000, and travel to Tahiti for 15 years or so.

[Chart: Buy-and-Hold versus the AR trading strategy, starting from $100,000 at the beginning of 2000]

So, as should be no surprise, the Buy-and-Hold strategy results in replicating the S&P 500 Index on a $100,000 base.

The trading strategy based on the simple first order autoregressive model, on the other hand, achieves more than twice these cumulative earnings.
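The comparison can be sketched as a small backtest. This is a simplified illustration, not the spreadsheet behind the chart: it assumes the trader is either fully invested or fully in cash, that cash earns nothing, that trades execute exactly at the prior close, and that there are no fees or taxes. The function name and inputs are hypothetical.

```python
def backtest(growth, forecasts, start_value=100_000.0):
    """Compare buy-and-hold with a long-or-cash rule driven by forecasts.

    growth[t] is the realized growth on day t; forecasts[t] is the growth
    predicted for day t using information available through day t-1.
    Returns (buy_and_hold_value, strategy_value, number_of_trades).
    """
    hold = strategy = start_value
    trades, invested = 0, False
    for t in sorted(forecasts):
        want_in = forecasts[t] > 0
        if want_in != invested:        # buy or sell at the previous close
            trades += 1
            invested = want_in
        hold *= 1.0 + growth[t]
        if invested:
            strategy *= 1.0 + growth[t]
    return hold, strategy, trades


# Toy example: three days of realized growth and the forecasts for them.
growth = [0.0, 0.1, -0.1, 0.1]
forecasts = {1: 0.01, 2: -0.01, 3: 0.02}
print(backtest(growth, forecasts))
```

In the toy example the rule sidesteps the one down day, which is the whole source of its edge; with real data the signal is right only a little more than half the time.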

Now I suppose you could say that all this was an accident, or that it was purely a matter of chance, distributed over more than 3,810 trading days. But I doubt it. After all, this trading interval 2000-2015 includes the worst economic crisis since before World War II.

Or you might claim that the profits from the simple AR trading strategy would be eaten up by transactions fees and taxes. On this point, there were 1,774 trades, for an average of $163 per trade. So, worst case, if trading costs $10 a transaction, and there is a tax rate of 40 percent, that leaves $156K over these 14-15 years in terms of take-away profit, or about $10,000 a year.

Where This May Go Wrong

This does sound like a paean to stock market investing – even “day-trading.”

What could go wrong?

Well, I assume here, of course, that exchange traded funds (ETF’s) tracking the S&P 500 can be bought and sold with the same tactics, as outlined here.

Beyond that, I don’t have access to the data currently (although I will soon), but I suspect high frequency trading (HFT) may stand in the way of realizing this marvelous investing strategy.

So remember you have to trade some small instant before market closing to implement this trading strategy. But that means you get into the turf of the high frequency traders. And, as previous posts here observe, all kinds of unusual things can happen in a blink of an eye, faster than any human response time.

So – a conjecture. I think that the choicest situations from the standpoint of this more or less macro interday perspective, may be precisely the places where you see huge spikes in the volume of HFT. This is a proposition that can be tested.

I also think something like this has to be appealed to in order to save the efficient markets hypothesis, or rational expectations. But in this case, it is not the rational expectations of human subjects, but the presumed rationality of algorithms and robots, as it were, which may be driving the market, when push comes to shove.

Top picture from CommSmart Global.

High Frequency Trading – 2

High Frequency Trading (HFT) occurs faster than human response times – often quoted as 750 milliseconds. It is machine or algorithmic trading, as Sean Gourley’s “High Frequency Trading and the New Algorithmic Ecosystem” highlights.

This is a useful introductory video.

It mentions Fixnetix’s field programmable gate array (FPGA) chip and new undersea cables designed to shave milliseconds off trading speeds from Europe to the US and elsewhere.

Also, Gourley refers to dark pool pinging, which tries to determine the state of large institutional orders by “sniffing them out” and using this knowledge to make (almost) risk-free arbitrage by trading on different exchanges in milliseconds or faster. Institutional investors using slower and not-so-smart algorithms lose.

Other HFT tactics include “quote stuffing”, “smoking”, and “spoofing.” Of these, stuffing may be the most damaging. It limits access of slower traders by submitting large numbers of orders and then canceling them very quickly. This leads to order congestion, which may create technical trouble and lagging quotes.

Smoking and spoofing strategies, on the other hand, try to manipulate other traders to participate in trading at unfavorable moments, such as just before the arrival of relevant news.

Here are some more useful links on this important development and the technological arms race that has unfolded around it.

Financial black swans driven by ultrafast machine ecology Key research on ultrafast black swan events

Nanosecond Trading Could Make Markets Go Haywire Excellent Wired article

High-Frequency Trading and Price Discovery

A defense of HFT on the basis that HFTs trade (buy or sell) in the direction of permanent price changes and against transitory pricing errors, creating benefits which outweigh the adverse selection of HFT liquidity-supplying (non-marketable) limit orders.

The Good, the Bad, and the Ugly of Automated High-Frequency Trading tries to strike a balance, but tilts toward a critique

Has HFT seen its heyday? I read at one and the same time that HFT profits per trade are dropping, that some High Frequency Trading companies report lower profits or are shutting their doors, but that 70 percent of the trades on the New York Stock Exchange are the result of high frequency trading.

My guess is that HFT is a force to be dealt with, and if financial regulators are put under restraint by the new US Congress, we may see exotic new forms flourishing in this area. 

High Frequency Trading and the Efficient Market Hypothesis

Working on a white paper about my recent findings, I stumbled on more confirmation of the decoupling of predictability and profitability in the market – the culprit being high frequency trading (HFT).

It makes a good story.

So I am looking for high quality stock data and came across the CalTech Quantitative Finance Group market data guide. They tout QuantQuote, which does look attractive, and was cited as the data source for – How And Why Kraft Surged 29% In 19 Seconds – on Seeking Alpha.

In early October 2012 (10/3/2012), shares of Kraft Foods Group, Inc. surged to a high of $58.54 after opening at $45.36, and all in just 19.93 seconds. The Seeking Alpha post notes special circumstances, such as the spinoff of Kraft Foods Group, Inc. (KRFT) from Mondelez International, Inc., and the addition of KRFT to the S&P 500. Funds and ETF’s tracking the S&P 500 then needed to hold KRFT, boosting prospects for KRFT’s price.

For 17 seconds and 229 milliseconds after opening October 3, 2012, the following situation, shown in the QuantQuote table, unfolded.

[Table: QuantQuote tick data for KRFT after the open on October 3, 2012]

Times are given in milliseconds past midnight with the open at 34200000.
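For anyone reading timestamps in this format, a small helper (hypothetical, not part of any QuantQuote tooling) converts milliseconds past midnight to clock time. It confirms that 34200000 corresponds to the 9:30 open.

```python
def ms_to_clock(ms_past_midnight):
    """Format milliseconds past midnight as HH:MM:SS.mmm."""
    seconds, ms = divmod(ms_past_midnight, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"


print(ms_to_clock(34200000))           # 09:30:00.000, the open
print(ms_to_clock(34200000 + 17229))   # 09:30:17.229, 17.229 seconds later
```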

There is lots of information in this table – KRFT was not shortable (see the X in the Short Status column), and some trades were executed for dark pools of money, signified by the D in the Exch column.

In any case, things spin out of control a few milliseconds later, in ways and for reasons illustrated with further QuantQuote screen shots.

The moral –

So how do traders compete in a marketplace full of computers? The answer, ironically enough, is to not compete. Unless you are prepared to pay for a low latency feed and write software to react to market movements on the millisecond timescale, you simply will not win. As aptly shown by the QuantQuote tick data…, the required reaction time is on the order of 10 milliseconds. You could be the fastest human trader in the world chasing that spike, but 100% of the time, the computer will beat you to it.

CNN’s Watch high-speed trading in action is a good companion piece to the Seeking Alpha post.

HFT has grown by leaps and bounds, but estimates vary – partly because NASDAQ provides the only datasets to academic researchers that directly classify HFT activity in U.S. equities. Even these do not provide complete coverage, excluding firms that also act as brokers for customers.

Still, the Securities and Exchange Commission (SEC) 2014 Literature Review cites research showing that HFT accounted for about 70 percent of NASDAQ trades by dollar volume.

And associated with HFT are shorter holding times for stocks, now reputed to be as low as 22 seconds, although Barry Ritholtz contests this sort of estimate.

Felix Salmon provides a list of the “evils” of HFT, suggesting a small transactions tax might mitigate many of these.

But my basic point is that the efficient market hypothesis (EMH) has been warped by technology.

I am leaning to the view that the stock market is predictable in broad outline.

But this predictability does not guarantee profitability. It really depends on how you handle entering the market to take or close out a position.

As Michael Lewis shows in Flash Boys, HFT can trump traders’ ability to make a profit.

Further Research into Predicting Daily and Other Period High and Low Stock Prices

The Internet is an amazing scientific tool. Communication of results is much faster, although, of course, with, potentially, dreck and misinformation. At the same time, pressures within the academy and Big Science seem to translate into a shocking amount of bogus research being touted. So maybe this free-for-all on the Web is where it’s at, if you are trying to get up to speed on new findings.

So this post today seeks to nail down some further and key points about predicting the high and low of stocks over various periods – conventionally, daily, weekly, and monthly periods, but also, as I have discovered, highs and lows over consecutive blocks of trading days ranging from 1 to 60 days, and probably more.

My recent posts focus on the SPY exchange traded fund, which tracks the S&P 500.

Yesterday, I formulated my general findings as follows:

For every period from daily periods to 60 day periods I have investigated, the high and low prices are “relatively” predictable and the direction of change from period to period is predictable, in backcasting analysis, about 70-80 percent of the time, on average.

In this post, let me show you the same basic relationship for a common stock – Ford Motor stock (F). I also consider data from the 1970’s, as well as recent data, to underline that modern program or high-speed computer-based algorithms have nothing to do with the underlying pattern.

I also show that the predictive model for the high in a period successfully captures turning points in the stock price in the 1970’s and more recently for 2008-2009.

Approach

Yahoo Finance, my free source of daily trading data, has history for Ford Motor stock dating back to June 1, 1972, charted as follows.

[Chart: Ford Motor (F) stock price history from June 1, 1972]

Now, the predictive models for the daily high and low stock price are formulated, as before, keying off the opening price in each trading day. One of the key relationships is the proximity of the daily opening price to the previous period high. The other key relationship is the proximity of the daily opening price to the previous period low. Ordinary least squares (OLS) regression models can be developed which do a good job of predicting the direction of change of the daily high and low, based on knowledge of the opening price for the day.
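Here is one way such a regression for the daily high could be set up. The exact specification is not spelled out in this post, so the variable definitions below are assumptions: the dependent variable is the growth in the daily high, and the two regressors measure how far today’s open sits from the previous day’s high and low. The function names and the toy prices are illustrative only.

```python
import numpy as np


def fit_high_model(open_, high, low):
    """OLS of growth in the daily high on the open's distance from the
    previous day's high and low. Returns [intercept, b_high, b_low]."""
    open_, high, low = (np.asarray(a, dtype=float) for a in (open_, high, low))
    y = high[1:] / high[:-1] - 1.0            # growth in the high
    x_high = open_[1:] / high[:-1] - 1.0      # open relative to previous high
    x_low = open_[1:] / low[:-1] - 1.0        # open relative to previous low
    X = np.column_stack([np.ones_like(y), x_high, x_low])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def predict_high(coef, today_open, prev_high, prev_low):
    """Predicted high for today, given today's open and yesterday's range."""
    growth = (coef[0]
              + coef[1] * (today_open / prev_high - 1.0)
              + coef[2] * (today_open / prev_low - 1.0))
    return prev_high * (1.0 + growth)


# Toy daily data, oldest first (made up for illustration).
open_ = [10.0, 10.2, 10.1, 10.4, 10.3, 10.6]
high = [10.3, 10.4, 10.5, 10.6, 10.7, 10.9]
low = [9.9, 10.0, 10.0, 10.2, 10.2, 10.4]
coef = fit_high_model(open_, high, low)
print(predict_high(coef, today_open=10.7, prev_high=10.9, prev_low=10.4))
```

A model for the daily low would follow the same pattern with the growth in the low as the dependent variable. For an out-of-sample test, the fit would use only days before the one being predicted.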

Predicting the Direction of Change of the High

As before, these models make correct predictions regarding the directions of change of the high and low about 70 percent of the time.

Here are 30 period moving averages for the 1970’s, showing the proportions of time the predictive model for the daily high is right about the direction of change.

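[Chart: 30-period moving average of the proportion of correct direction-of-change predictions for the daily high of Ford, 1970’s]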

So the underlying relationship definitely holds in this age in which computer modeling of trading was in its infancy.

Here is a similar chart for the first decade of this century.

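[Chart: 30-period moving average of the proportion of correct direction-of-change predictions for the daily high of Ford, first decade of this century]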

So whether we are considering the 1970’s or the last ten years, these predictive models do well in forecasting the direction of change of the high in daily (and it turns out other) periods.
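The moving averages in these charts can be sketched as follows: score each day 1 if the predicted and actual changes in the high have the same sign and 0 otherwise, then average the scores over a trailing 30-period window. The function names are illustrative, and treating a zero change as “not up” is an assumption of this sketch.

```python
def sign_hits(actual_growth, predicted_growth):
    """1 where predicted and actual changes agree in direction, else 0."""
    return [1 if (a > 0) == (p > 0) else 0
            for a, p in zip(actual_growth, predicted_growth)]


def moving_average(values, window=30):
    """Trailing moving average; empty if there are fewer than `window` points."""
    if len(values) < window:
        return []
    running = sum(values[:window])
    out = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]   # slide the window by one
        out.append(running / window)
    return out


# Toy example with a short window so the output is easy to follow.
hits = sign_hits([0.01, -0.02, 0.03, 0.01], [0.02, 0.01, 0.01, 0.005])
print(hits, moving_average(hits, window=2))
```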

Predicting Turning Points

We can make the same type of comparison – between the 1970’s and more recent years – for the capability of the predictive models to forecast turning points in the stock high (or low).

To do this usually requires aggregating the stock data. In the charts below, I aggregate to 7 trading day periods – not quite the same as weekly periods, since weekly segmentation can be short a day and so forth.

So the high which the predictive model focuses on is the high for the coming seven trading days, given the current day opening price.
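Aggregating daily data into consecutive blocks of trading days is simple to sketch. The helper below is hypothetical: it takes daily highs and lows (oldest first), returns the high and low of each non-overlapping block, and drops any incomplete block at the end.

```python
def block_highs_lows(highs, lows, block=7):
    """Collapse daily highs and lows into consecutive non-overlapping blocks.

    Returns a list of (block_high, block_low) tuples; a trailing partial
    block is ignored.
    """
    out = []
    for start in range(0, len(highs) - block + 1, block):
        out.append((max(highs[start:start + block]),
                    min(lows[start:start + block])))
    return out


# Toy data: 15 trading days, so two full blocks of 7 and one leftover day.
highs = [float(d) for d in range(1, 16)]
lows = [h - 0.5 for h in highs]
print(block_highs_lows(highs, lows))
```

The opening price for each block would be the open on its first trading day, which is what the predictive model keys off.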

Here are two charts, one for dates in the 1970’s and the other for a period in the recession of 2008-2009. For each chart I estimate OLS regressions with data predating each forecast of the high, based on blocks of 7 trading days.

[Chart: predictions of the 7-trading-day high for Ford, 1970’s]

These predictions of the high crisply capture most of the important turning and inflection point features.

[Chart: predictions of the 7-trading-day high for Ford, 2008-2009]

The application of similar predictive models for the 2008-2009 period is a little choppier, but does nail many of the important swings in the direction of change of the high of Ford Motor stock.

Concluding Thoughts

Well, this relationship between the opening prices and previous period highs and lows is highly predictive of the direction of change of the highs and lows in the current period – which can be a span of time from a day to 60 days in my findings.

These predictive models work for the S&P 500 and for individual stocks, like Ford Motor (and I might add Exxon and Microsoft).

They work in recent time periods and way back in the 1970’s.

And there’s more – for example, one could argue these patterns in the high and low prices are fractal, in the sense they represent “self similarity” at all (really many or a range of) time scales.

This is literally a new and fundamental regularity in stock prices.

Why does this work?

Well, the predictive models are closely related to very simple momentum trading strategies. But I think there is a lot of research to be done here. If you want further detail on any of this, please put your request in the Comments with the heading “Request for High/Low Model Information.”

Top picture from Strategic Monk.

Predicting the High of SPY Over Daily, Weekly, and Monthly Forecast Horizons

Here are some remarkable findings relating to predicting the high and low prices of the SPDR S&P 500 ETF (SPY) in daily, weekly, and monthly periods.

Basically, the high and low prices for SPY can be forecast with some accuracy – especially with regard to the sign of the percent change from the high or low of the previous period.

The simplicity of the predictive relationships is remarkable; they key off the ratio of the previous period high or low to the opening price for the new period under consideration. There is precedent in the work of George and Hwang, for example, who show that picking portfolios of stocks whose prices are near their 52-week highs can generate superior returns (validated in 2010 for international portfolios). But my analysis concerns a specific exchange traded fund (ETF) which, of course, mirrors the S&P 500 Index.

Evidence

For data, I utilize daily, weekly, and monthly open, close, high, low, and volume data on the SPDR S&P 500 ETF SPY from Yahoo Finance from January 1993 to the present.

I compute ordinary least squares (OLS) regression estimates on a rolling or adaptive basis.

So, for example, I begin weekly estimates to predict the high for a forecast horizon of one week on the period February 1, 1993 to December 12, 1994. The dependent variable is the growth in the highs from week to week – 97 observations on weekly data to begin with.

The initial regression has a coefficient of determination of 0.405 and indicates high statistical significance for the regression coefficients – although the underlying stochastic elements here are probably profoundly non-normal.

I use a similar setup to predict the weekly low of SPY, substituting the “growth” of the preceding low (in the previous week) to the current opening price in the set of explanatory variables. I continue using the lagged logarithm of the trading volume.

This chart shows the proportion of correct signs predicted by weekly models for the growth or percentage changes in the high and low prices in terms of 30 week moving averages.

[Chart: 30-week moving averages of the proportion of correct signs predicted for the weekly SPY high and low]

There is a lot to think about in this chart, clearly.

The basic truth, however, is that the predictive models, which are simple OLS regressions with two explanatory variables, predict the correct sign of the weekly percentage changes in the high and low SPY prices about 75 percent of the time.

Similar analysis of monthly data also leads to predictive models for the monthly highs and lows. The predictive models for the high and low prices in monthly forecast horizons correctly predict more than 70 percent of the directions of change in these respective growth rates, with the model for the lows being more powerful statistically.

The actual forecasts of the growth in the monthly highs and lows may be helpful in discerning turning points in the SPY and, thus, the S&P 500, as the following chart suggests.

Bounded

Here I apply the predicted high and low growth rates week-by-week to the previous week values for the high and low and also chart the SPY closing prices for the week (in bold red).
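Converting predicted growth rates into price levels is a one-line calculation. A sketch in Python follows; the function name and the sample numbers are made up for illustration.

```python
def implied_levels(prev_values, predicted_growth):
    """Apply each predicted growth rate to the previous period's value."""
    return [p * (1.0 + g) for p, g in zip(prev_values, predicted_growth)]

# Hypothetical numbers: last week's high and low, and predicted growth of each
pred_high = implied_levels([210.0], [0.01])[0]    # about 212.1
pred_low = implied_levels([205.0], [-0.02])[0]    # about 200.9
```

The predicted high and low computed this way form the band within which the closing prices are charted.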

For discussion of the models for the daily highs and lows, see my previous blog posts here and here.

I might add that these findings relating to predictability of the high and low of SPY on a daily, weekly, and monthly basis are among the strongest and simplest statistical relationships I have had the fortune to encounter.

Academic researchers are free to use and build on these results, but I would appreciate being credited with the underlying insight or at least as a source.

Discussion – Pathways of Predictability

Since this is not a refereed publication, I take the liberty of offering some conjectures on why this predictability exists.

My basic idea is that there are positive feedback loops for investing, based on fairly simple predictive models for the high of SPY that will be reached over a day, a week, or a month. So this would mean investors are aware of this relationship, and act upon it in real time. Their actions, furthermore, reinforce the strength of the relationship, creating pathways of predictability into the future in otherwise highly volatile, noisy data. Discovery of such pathways serves to reinforce their existence.

If this is true, it is real news and something relatively novel in economic forecasting.

And there is a second conjecture. I suspect that these “pathways of predictability” in the high and probably the low of SPY give us a window into turning points in the underlying stock index, the S&P 500. It should be possible to array daily, weekly, and monthly forecasts of the highs and lows for SPY and get some indication of a change in the direction of movement of the series.

These are big claims, and eventually may become shaded in colors of lighter and darker grey. However, I believe they work well as research hypotheses.

Revisiting the Predictability of the S&P 500

Almost exactly a year ago, I posted on an algorithm and associated trading model for the S&P 500, the stock index which supports the SPY exchange traded fund.

I wrote up an autoregressive (AR) model, using daily returns for the S&P 500 from 1993 to early 2008. This AR model outperforms a buy-and-hold strategy for the period 2008-2013, as the following chart shows.

SPYTradingProgramcompBH

The trading algorithm involves “buying the S&P 500” when the closing price indicates a positive return for the following trading day. Then, I “close out the investment” the next trading day at that day’s closing price. Otherwise, I stay in cash.
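In code, this rule amounts to a short loop. Here is a minimal Python sketch that ignores trading costs; the function name, the starting value, and the toy prices are assumptions for illustration, not the series behind the chart.

```python
def backtest_close_to_close(closes, predicted_returns, initial=1000.0):
    """Hold the index from close t to close t+1 only when the forecast made
    at close t is positive; otherwise stay in cash (earning nothing)."""
    value = initial
    path = [value]
    for t in range(len(closes) - 1):
        if predicted_returns[t] > 0:
            value *= closes[t + 1] / closes[t]
        path.append(value)
    return path

# Made-up prices and forecasts
closes = [100.0, 110.0, 99.0, 108.9]
forecasts = [0.01, -0.01, 0.02]   # forecast at close t for the move to close t+1
path = backtest_close_to_close(closes, forecasts)
buy_and_hold = 1000.0 * closes[-1] / closes[0]
```

In this toy case the rule sits out the one down day, so it finishes ahead of buy-and-hold. Real forecasts are of course wrong a good part of the time.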

It’s important to be your own worst critic, and, along those lines, I’ve had the following thoughts.

First, the above graph disregards trading costs. Your broker would have to be pretty forgiving to execute 2000-3000 trades for less than the $500 you make over the buy-and-hold strategy. So, I should deduct something for the trades in calculating the cumulative value.

The other criticism concerns high frequency trading. The daily returns are calculated against closing values, but, of course, to use this trading system you have to trade prior to closing. However, even a few seconds, or still smaller intervals, can make a crucial difference in the price of the S&P 500 or SPY.

An Up-Dated AR Model

Taking some of these criticisms into account, I re-estimate an autoregressive model on more recent data – again calculating returns against closing prices on successive trading days.

This time I start with an initial investment of $100,000, and deduct $5 per trade off the totals as they cumulate.

I also utilize only seven (7) lags for the daily returns. This compares with the 30 lag model from the post a year ago, and I estimate the current model with OLS, rather than maximum likelihood.

The model is

$$R_t = 0.0007 - 0.0651 R_{t-1} + 0.0486 R_{t-2} - 0.0999 R_{t-3} - 0.0128 R_{t-4} - 0.1256 R_{t-5} + 0.0063 R_{t-6} - 0.0140 R_{t-7}$$

where $R_t$ is the daily return for trading day t. The model is estimated on data beginning June 11, 2011. The coefficients of the equation result from bagging OLS regressions – developing coefficient estimates for 100,000 similar size samples drawn with replacement from this dataset of 809 observations. These 100,000 coefficient estimates are averaged to arrive at the numbers shown above.
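The bagging step can be sketched in a few lines of Python. This is an illustration under my own assumptions, not the actual estimation code: I assume the resampling draws whole rows (a return together with its seven lags), the function names are hypothetical, the returns are synthetic noise, and the number of bootstrap replicates is cut far below 100,000 to keep it quick.

```python
import numpy as np

def lag_matrix(returns, n_lags=7):
    """Return (X, y) where row i of X is [1, r[t-1], ..., r[t-n_lags]] for y[i] = r[t]."""
    r = np.asarray(returns, dtype=float)
    y = r[n_lags:]
    lags = [r[n_lags - k: len(r) - k] for k in range(1, n_lags + 1)]
    X = np.column_stack([np.ones(len(y))] + lags)
    return X, y

def bagged_ols(X, y, n_boot=1000, seed=0):
    """Average OLS coefficients over bootstrap samples of the rows of (X, y)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    total = np.zeros(X.shape[1])
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)   # draw n rows with replacement
        beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        total += beta
    return total / n_boot

# Synthetic daily returns standing in for the real data (assumption):
# 816 returns leave 809 usable rows once seven lags are formed.
rng = np.random.default_rng(42)
returns = rng.normal(0.0005, 0.01, 816)
X, y = lag_matrix(returns)
coefs = bagged_ols(X, y, n_boot=200)   # intercept followed by lags 1..7
```

The first element of `coefs` plays the role of the constant term in the equation, and the remaining seven line up with the lag coefficients.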

Here is the result of applying my revised model to recent stock market activity. The results are out-of-sample. In other words, I use the predictive equation which is calculated over data prior to the start of the investment comparison. I also filter the positive predictions for the next day closing price, only acting when they are a certain size or larger.
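To make the mechanics concrete, here is a hedged Python sketch of a filtered trading rule with a per-trade charge. The threshold value is hypothetical, since I have not stated the cutoff I actually use. Charging $5 on both the buy and the sell is also an assumption, as are the function name and the toy prices.

```python
def backtest_with_costs(closes, predicted_returns, threshold=0.001,
                        initial=100_000.0, cost_per_trade=5.0):
    """Trade close-to-close only when the forecast clears the threshold,
    paying a flat fee on the way in and again on the way out."""
    value = initial
    n_trades = 0
    for t in range(len(closes) - 1):
        if predicted_returns[t] >= threshold:
            value -= cost_per_trade              # buy at close t
            value *= closes[t + 1] / closes[t]
            value -= cost_per_trade              # sell at close t+1
            n_trades += 2
    return value, n_trades

# Made-up example: only the first forecast is large enough to act on
final_value, n_trades = backtest_with_costs([100.0, 101.0, 100.0],
                                            [0.002, 0.0005])
```

Raising the threshold means fewer trades and fewer fees, at the price of sitting out some days that would have been profitable.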

NewARmodel

There is a 2-3 percent return on a hundred thousand dollar investment in one month, and a projected annual return on the order of 20-30 percent.

The current model also correctly predicts the sign of the daily return 58 percent of the time, compared with a much lower figure for the model from a year ago.

This looks like the best thing since sliced bread.

But wait – what about high frequency trading?

I’m exploring the implementation of this model – and maybe should never make it public.

But let me clue you in on what I suspect, and some evidence I have.

So, first, it is interesting that the gains from trading on closing prices more than evaporate by the opening of the New York Stock Exchange, following the generation of a “buy” signal according to this algorithm.

In other words, if you adjust the trading model to buy at the open of the following trading day, when the closing price indicates a positive return for the following day – you do not beat a buy-and-hold strategy. Something happens between the closing and the opening of the NYSE market for the SPY.
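One way to see how this can happen is to split the close-to-close return into an overnight piece and an intraday piece. The Python sketch below uses made-up prices purely to show the arithmetic; it is not evidence about SPY, and the function name is hypothetical.

```python
def split_return(prev_close, open_, close):
    """Decompose a close-to-close return into overnight and intraday parts."""
    overnight = open_ / prev_close - 1.0    # previous close to today's open
    intraday = close / open_ - 1.0          # today's open to today's close
    total = close / prev_close - 1.0        # what the close-to-close rule earns
    return overnight, intraday, total

# Hypothetical day: most of the move happens before the opening bell
overnight, intraday, total = split_return(100.0, 100.8, 101.0)
```

Here a buyer at the previous close earns the full 1 percent, while a buyer at the open gets only the intraday piece, roughly 0.2 percent. If most of a predicted gain tends to show up overnight, buying at the open misses it.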

Someone else knows about this model?

I’m exploring the “final second” volatility of the market, focusing on trading days when the closing prices look like they might come in to indicate a positive return the following day. This is complicated, and it puts me into issues of predictability in high frequency data.

I also am looking at the SPY numbers specifically to bring this discussion closer to trading reality.

Bottom line – It’s hard to make money in the market on trading algorithms if you are a day-trader – although probably easier with a super-computer at your command and when you sit within microseconds of executing an order on the NY Stock Exchange.

But these researches serve to indicate one thing fairly clearly. And that is that there definitely are aspects of stock prices which are predictable. Acting on the predictions is the hard part.

Postscript: Readers may have noticed a lesser frequency of posting on Business Forecast blog in the past week or so. I am spending time running estimations and refreshing and extending my understanding of some newer techniques. Keep checking in – there is rapid development in “real world forecasting” – exciting and whiz bang stuff. I need to actually compute the algorithms to gain a good understanding – and that is proving time-consuming. There is cool stuff in the data warehouse though.