What’s Going On?

While teaching economics during the Vietnam War and, later, at the onset of the Reagan years, I developed a sort of sideline patter about current events. Later, I realized this bore a resemblance to a kind of global system dynamics.

Then, my consulting made these considerations more relevant – to the point that, in recent years, I have been drawing correlations between what you might call a global regional analysis and sales prospects, as well as corporate strategy.

How do you go about developing this perspective? The question is especially relevant for me now, since I am emerging from a deep dive into hands-on statistical modeling.

Well, one way to visualize this is as a series of threads through time. Each of these threads is strung with events that can turn out one way or another. There are main threads, the ones that “serious people” believe in: the conventional view of things, if you will. There are also many outliers, story lines which incorporate unusual, perhaps foreboding developments. I guess you could think of these threads as scenarios, too: a whole bunch of movie scripts about how the future is going to unfold.

Now before getting into specifics, let me make what might be considered an obscure remark, but one relevant to forecasting. What you want to do is disentangle and identify as many of these threads as you have the energy to consider, and then watch for convergences. If, in other words, there are several ways for an event to come about, that event becomes more likely.
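One rough way to put numbers on this convergence idea is sketched below. It assumes the threads are statistically independent, which real scenario threads usually are not, so the result is an illustration rather than a forecasting rule. The function name and the 20% probabilities are hypothetical.

```python
from math import prod


def prob_any_path(path_probs):
    """Probability that an event occurs via at least one path.

    Assumes the paths are independent of each other (a strong
    simplification), so P(any) = 1 - product of (1 - p_i).
    """
    if any(not 0.0 <= p <= 1.0 for p in path_probs):
        raise ValueError("probabilities must lie in [0, 1]")
    return 1.0 - prod(1.0 - p for p in path_probs)


# Hypothetical numbers: each thread alone gives the event a 20% chance.
single = prob_any_path([0.2])
converging = prob_any_path([0.2, 0.2, 0.2])
print(f"one thread: {single:.3f}, three converging threads: {converging:.3f}")
```

If the threads are positively correlated, as they often are, the gain from convergence is smaller than this calculation suggests.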

One thing this methodology accommodates is a fact that, it seems to me, many people overlook or downplay: there can be really fundamental differences in how different groups of people, perhaps with different interests or with different things to gain or lose, look at things.

One of the clearest examples, perceptually, is the arrow illusion.


So this is one reason why I try to glean perspectives from all over – including heterodox and contrarian views.

No one at this point can convince me this is not a good practice, even though it may make those who busy themselves with thought control (“reality construction”) uncomfortable.

For example, many years ago, I was sitting at my father’s breakfast nook glancing at some books he had recently bought, and I found Andrei Amalrik’s Will the Soviet Union Survive Until 1984? What a preposterous idea, it seemed to me. Collapse of the Soviet Union.

It pays to look at heterodox views, even if only a few of these will have any relevance to the future.

Some Specifics

Well, today we have the internet – a font of views of all types.

In thinking about developing this and its successors on the same or similar topics this morning, I first turned to Zero Hedge. From Wikipedia,

Zero Hedge is a financial blog that aggregates news and presents editorial opinions from original and outside sources. It has been described as offering a “deeply conspiratorial, anti-establishment and pessimistic view of the world”… It reports on economics, Wall Street, and the financial sector and is credited with bringing the controversial practice of flash trading to public attention in 2009 via a series of posts alleging that Goldman Sachs’ access to flash order information allowed it to gain unfair profits. The news portion of the site is written by a group of editors who collectively write under the pseudonym “Tyler Durden”, a character from the novel and film Fight Club.

Since I have been out of the loop for a while, the litany of shocking or bad news on this site does not bother me yet.

Some of the headings include:

Iran Forces Seize US Cargo Ship With 34 People On Board, Al Arabiya Reports

West Baltimore In Ashes: A Night Of Violence And Looting In Photos

Stocks Soar On Non-War, Bad-News-Is-Good-News V-Shaped Recovery

Well, I’m not sure what to make of all that. Conflict is increasing. War and riot memes.

Another site I frequently turn to, quite frankly, is Naked Capitalism, and, in particular, Links assembled by “Yves Smith” and others. Today, these range over topics like the Greek-European Union negotiations and the threat of an exit of Greece from the Eurozone, the TPP (Trans-Pacific Partnership secret trade bill), Yemen and Syria, and a reference to a new and important report from MIT about the decline in US science spending – The Future Postponed.

I also consult what I would call “libertarian” financial blogs such as Mish Shedlock’s Global Economic Trend Analysis.

Then, I guess, after surveying these “oppositional views,” I turn to official forecasts and publications of US and European banks and financial institutions, as well as central banks.

I’ve given play to JP Morgan forecasters here, as well as Bloomberg’s list of leading macroeconomic forecasters. It is always good to try to keep tabs on the latest sayings of these celebrity forecasters.

The Bank of England Financial Stability Report, most recently issued December 2014, is a relevant publication.

I also tend to look at, but basically discount, sources such as the Survey of Professional Forecasters, assembled by the Philadelphia Federal Reserve Bank. The record of macroeconomic forecasting is truly abysmal. But, apart from turning points, there may be value in tracking the projected movement of indicators and their trends.

The Central Issue

I have not mentioned slowing of the Chinese economy in the above discussion or several other megatrends, but let me move on to a key pivot for the next few years.

Business expansions never last forever. The current expansion, perhaps because it began so slowly, has already lasted a relatively long time.

Another key point is that many central banks have pushed interest rates to near the zero bound, and they remain historically very low.

Frankly, it challenges my capabilities to imagine a future in which interest rates sort of disappear as key economic factors – although this may be a thread we need to consider. The attack on cash and movement to purely electronic money could be part of this, with negative interest rates entering the picture in a real way.

But assuming that does not happen, central banks will have to encourage higher interest rates, and that will have wide-ranging effects on business, it seems certain. There are many tangible forecasting problems associated with this prospective development.

I have to believe this is the central issue at present. How can the US Federal Reserve, for example, move off the zero bound for the federal funds rate, when the US economic recovery should, according to historical patterns, be moving toward its final months or years?

There are other tough issues – in the Middle East, the Ukraine, climate change, and so forth – but, as an economic or business forecaster, I have to believe this tension between normal banking practice and the business cycle is fundamental.

In any case, I want to return to putting up business forecasts, including longer term scenarios, in addition to carrying forth with my stock market forecasting experiment.

Weekly BusinessForecastBlog Stock Price Forecasts – QQQ, SPY, GE

Here are forecasts of the weekly high price for three securities. These include intensely traded exchange traded funds (ETF’s) and a blue chip stock – QQQ, SPY, and GE.


The table also shows the track record so far.

All the numbers not explicitly indicated as percents are in US dollars.

These forecasts come with disclaimers. They are presented purely for scientific and informational purposes. This blog takes no responsibility for any investment gains or losses that might be linked with these forecasts. Invest at your own risk.

So having said that, some implications and background information.

First of all, it looks like it’s off to the races for the market as a whole this week, although possibly not for GE. The highs for the ETF’s all show solid gains.

Note, too, that these are forecasts of the high price which will be reached over the next five trading days, Monday through Friday of this week.

Key features of the method are now available in a white paper published under the auspices of the University of Munich – Predictability of the daily high and low of the S&P 500 index. This research shows that the so-called proximity variables achieve higher accuracies in predicting the daily high and low prices for the S&P 500 than do benchmark approaches, such as the no-change forecast and forecasts from an autoregressive model.

Again, caution is advised in making direct application of the methods in the white paper to the current problem – forecasting the high for a five day trading period. There have been many modifications.

That’s, of course, one reason for the public announcements of forecasts from the NPV (new proximity variable) model.

Go real-time, I’ve been advised. It makes the best case, or at least exposes the results to the light of day.

Based on backtesting, I expect forecasts for GE to be less accurate than those for QQQ and SPY. In terms of mean absolute percent error (MAPE), we are talking around 1% for QQQ and SPY and, maybe, 1.7% for GE.

The most reliable element of these forecasts is the indicated direction of change from the previous period’s high.
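The two yardsticks mentioned here, MAPE and the direction of change relative to the previous period's high, can be computed along the following lines. This is a minimal sketch: the function names are hypothetical, and the weekly numbers are made up for illustration, not taken from the backtests.

```python
def mape(forecasts, actuals):
    """Mean absolute percent error, expressed in percent."""
    if len(forecasts) != len(actuals) or not actuals:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(f - a) / abs(a) for f, a in zip(forecasts, actuals))
    return 100.0 * total / len(actuals)


def direction_hit_rate(forecasts, actuals, previous_highs):
    """Share of periods in which the forecast high and the actual high
    moved the same way relative to the previous period's high."""
    hits = 0
    for f, a, prev in zip(forecasts, actuals, previous_highs):
        if (f - prev) * (a - prev) > 0:
            hits += 1
    return hits / len(actuals)


# Made-up weekly highs, for illustration only.
previous_highs = [100.0, 101.0, 102.5, 101.8]
forecasts = [101.2, 102.0, 102.0, 103.0]
actuals = [101.0, 102.5, 101.8, 101.5]

print(f"MAPE: {mape(forecasts, actuals):.2f}%")
print(f"direction hit rate: {direction_hit_rate(forecasts, actuals, previous_highs):.2f}")
```

A period in which either the forecast or the actual high exactly equals the previous high counts as a miss in this sketch; that is a design choice, not something the posts specify.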

Features and Implications

There are several other features which are reliably predicted by the NPV models. For example, forecasts for the low price or even closing prices on Friday can be added – although closing prices are less reliable. Obviously, too, volatility metrics are implied by predictions of the high and low prices.
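As one illustration of how a volatility metric falls out of high and low prices, here is a sketch of the Parkinson (1980) range-based estimator. The posts do not say which volatility metric is intended, so this is a stand-in from the standard literature, and the high/low figures are hypothetical.

```python
import math


def parkinson_volatility(highs, lows):
    """Per-period volatility from high/low pairs, using the Parkinson
    range estimator: sigma^2 = sum(ln(H/L)^2) / (4 * n * ln 2)."""
    if len(highs) != len(lows) or not highs:
        raise ValueError("need two non-empty sequences of equal length")
    sum_sq = sum(math.log(h / l) ** 2 for h, l in zip(highs, lows))
    return math.sqrt(sum_sq / (4.0 * len(highs) * math.log(2.0)))


# Hypothetical predicted high and low for a single period.
vol = parkinson_volatility([110.0], [100.0])
print(f"implied per-period volatility: {vol:.4f}")
```

Feeding predicted rather than realized highs and lows into the estimator gives an implied volatility for the coming period, subject to the errors in both predictions.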

These five-trading day forecasts parallel the results for daily periods documented in the above-cited white paper. That is, the NPV forecast accuracy for these securities in each case beats “no-change” and autoregressive model forecasts.

Focusing on stock market forecasts has “kept me out of trouble” recently. I’m focused on quantitative modeling, and am not paying a lot of attention to global developments – such as the ever-impending Greek default or, possibly, exit from the euro. Other juicy topics include signs of slowing in the global economy, and the impact of armed conflict on the Arabian Peninsula on the global price of oil. These are great topics, but beyond hearsay or personal critique, it is hard to pin things down just now.

So, indeed, I may miss some huge external event which tips this frothy stock market into reverse – but, at the same time, I assure you, once a turning point from some external disaster takes place, the NPV models should do a good job of predicting the extent and duration of such a decline.

On a more optimistic note, my research shows the horizons for which the NPV approach applies and does a better job than the benchmark models. I have, for example, produced backtests for quarterly SPY data, demonstrating continuing superiority of the NPV method.

My guess – and I would be interested in validating this – is that the NPV approach connects with dominant trader practice. Maybe stock market prices are, in some sense, a random walk. But the reactions of traders to daily price movements create short term order out of randomness. And this order can emerge and persist for relatively long periods. And, not only that, but the NPV approach is linked with self-reinforcing tendencies, so that awareness may just make predicted effects more pronounced. That is, if I tell you the high price of a security is going up over the coming period, your natural reaction is to buy in – thus reinforcing the prediction. And the prediction is not just a public relations stunt or fluff. The first prediction is algorithmic, rather than wishful and manipulative. Thus, the direction of change is more predictable than the precise extent of price change.

In any case, we will see over coming weeks how well these models do.

Some Comments on Forecasting High and Low Stock Prices

I want to pay homage to Paul Erdős, the eccentric Hungarian-British-American-Israeli mathematician, whom I saw lecture a few years before his death. Erdős kept producing work in mathematics into his 70’s and 80’s – showing this is quite possible. Of course, he took amphetamines and slept on people’s couches while he was doing this work in combinatorics, number theory, and probability.


In any case, having invoked Erdős, let me offer comments on forecasting high and low stock prices – a topic which seems to be terra incognita, for the most part, to financial research.

First, let’s take a quick look at a chart showing the maximum prices reached by the exchange traded fund QQQ over a critical period during the last major financial crisis in 2008-2009.


The graph charts five series representing QQQ high prices over periods extending from 1 day to 40 days.

The first thing to notice is that the variability of these time series decreases as the period for the high increases.

This suggests that forecasting the 40 day high could be easier than forecasting the high price for, say, tomorrow.

While this may be true in some sense, I want to point out that my research is really concerned with a slightly different problem.

The problem is to forecast ahead by the full interval over which the maximum price is taken. So, rather than a one-day-ahead forecast of the 40 day high price (which would include 39 known possible high prices), I forecast the high price which will be reached over the next 40 days.

This problem is better represented by the following chart.


This chart shows the high prices for QQQ over periods ranging from 1 to 40 days, sampled at what you might call “40 day frequencies.”
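The difference between the two kinds of series can be sketched in a few lines. The function names and the daily highs below are hypothetical. The first function gives the trailing high updated every day, where consecutive values share most of their inputs; the second gives the high of each non-overlapping block, which corresponds to the forecasting target described above.

```python
def trailing_high(daily_highs, window):
    """High over the most recent `window` days, updated daily.
    Consecutive values share window - 1 observations."""
    return [
        max(daily_highs[i - window + 1 : i + 1])
        for i in range(window - 1, len(daily_highs))
    ]


def block_highs(daily_highs, window):
    """High of each non-overlapping block of `window` days.
    A leftover partial block at the end is dropped."""
    n_blocks = len(daily_highs) // window
    return [
        max(daily_highs[b * window : (b + 1) * window])
        for b in range(n_blocks)
    ]


# Made-up daily highs, for illustration only.
daily = [10, 12, 11, 13, 9, 8, 14, 10, 7]
print(trailing_high(daily, 3))
print(block_highs(daily, 3))
```

With a window of 40, forecasting the next value of the first series means one new day against 39 known ones, while forecasting the next value of the second means 40 entirely unseen days.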

Now I am not quite going to 40 trading day ahead forecasts yet, but here are results for backtests of the algorithm which produces 20-trading-day-ahead predictions of the high for QQQ.


The blue line shows the predictions for the QQQ high, and the orange line indicates the actual QQQ highs for these (non-overlapping) 20 trading day intervals. As you can see, the absolute percent errors – the grey bars – are almost all less than 1 percent.

Random Walk

Now, these results are pretty good, and the question arises – what about the random walk hypothesis for stock prices?

Recall that a simple random walk can be expressed by the equation $x_t = x_{t-1} + \varepsilon_t$, where $\varepsilon_t$ is conventionally assumed to be distributed according to $N(0, \sigma^2)$ or, in other words, as a normal distribution with zero mean and constant variance $\sigma^2$.

An interesting question is whether the maximum prices for a stock whose prices follow a random walk also can be described, mathematically, as a random walk.

This is elementary, when we consider that any two observations in a random walk time series can be connected together as $x_{t+k} = x_t + \omega$, where $\omega$ is distributed according to a Gaussian distribution but does not necessarily have a constant variance for different values of the spacing parameter $k$.
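For a simple Gaussian random walk, the k-step increment is a sum of k independent shocks, so its variance is k times the one-step variance; that is why the variance changes with the spacing. The simulation below is a small sketch of that fact. The function names, seed, and sample size are arbitrary choices, and the simulated series says nothing about QQQ itself.

```python
import random
import statistics


def simulate_random_walk(n_steps, sigma=1.0, start=0.0, seed=0):
    """Simulate x_t = x_{t-1} + eps_t with eps_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, sigma))
    return path


def increment_variance(path, k):
    """Sample variance of non-overlapping k-step increments x_{t+k} - x_t."""
    increments = [path[i + k] - path[i] for i in range(0, len(path) - k, k)]
    return statistics.variance(increments)


path = simulate_random_walk(20000)
var_1 = increment_variance(path, 1)
var_5 = increment_variance(path, 5)
print(f"variance ratio (5-step / 1-step): {var_5 / var_1:.2f}")  # theory: 5
```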

From this it follows that the methods producing these predictions or forecasts of the high of QQQ over periods of several trading days also are strong evidence against the underlying QQQ series being a random walk, even one with heteroskedastic errors.

That is, I believe the predictability demonstrated for these series is more than cointegration relationships.

Where This is Going

While demonstrating the above point could really rock the foundations of finance theory, I’m more interested, for the moment, in exploring the extent of what you can do with these methods.

Very soon I’m going to post on how these methods may provide signals as to turning points in stock market prices.

Stay tuned, and thanks for your comments and questions.

Erdős picture from Encyclopaedia Britannica

Update and Extension – Weekly Forecasts of QQQ and Other ETF’s

Well, the first official forecast rolled out for QQQ last week.

It did relatively well. Applying methods I have been developing for the past several months, I predicted the weekly high for QQQ last week at 108.98.

In fact, the high price for QQQ for the week was 108.38, reached Monday, April 13.

This means the forecast error in percent terms was 0.55%.
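Spelled out, taking the percent error relative to the actual high (an assumption about the convention, though it reproduces the figure quoted):

```latex
\[
\text{percent error}
  = 100 \times \frac{\lvert 108.98 - 108.38 \rvert}{108.38}
  = 100 \times \frac{0.60}{108.38}
  \approx 0.55\%
\]
```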

Backtesting makes it possible to look more comprehensively at the likely forecast errors of my approach.

Here is a chart showing backtests for the “proximity variable method” for the QQQ high price for five day trading periods since the beginning of 2015.


The red bars are errors, and, from their axis on the right, you can see most of these are below 0.5%.

This is encouraging, and I want to explore several adjustments which may improve forecasting performance beyond this level of accuracy.

So here is the forecast of the high prices that will be reached by QQQ and SPY for the week of April 20-24.


As you can see, I’ve added SPY, an ETF tracking the S&P500.

I put this up on Businessforecastblog because I seek to make a point – namely, that I believe methods I have developed can produce much more accurate forecasts of stock prices.

It’s often easier and more compelling to apply forecasting methods and show results, than it is to prove theoretically or otherwise argue that a forecasting method is worth its salt.

Disclaimer – These forecasts are for informational purposes only. If you make investments based on these numbers, it is strictly your responsibility. Businessforecastblog is not responsible or liable for any potential losses investors may experience in their use of any forecasts presented in this blog.

Well, I am working on several stock forecasts to add to projections for these ETF’s – so will expand this feature in forthcoming Mondays.

Predicting the High Reached by the SPY ETF 30 Days in Advance – Some Results

Here are some backtests of my new stock market forecasting procedures.

Here, for example, is a chart showing the performance of what I call the “proximity variable approach” in predicting the high price of the exchange traded fund SPY over 30 day forward periods.


So let’s be clear what the chart shows.

The proximity variable approach – which so far I have been abbreviating as “PVar” – is able to identify the high prices reached by the SPY in the coming 30 trading days with forecast errors mostly under 5 percent. In fact, the MAPE for this approximately ten year period is 3 percent. The percent errors, of course, are charted in red with their metric on the axis to the right.

The blue line traces out the predictions, and the grey line shows the actual highs by 30 trading day period.

These results far surpass what can be produced by benchmark models, such as the workhorse No Change model, or autoregressive models.
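For readers unfamiliar with these benchmarks, here is a minimal sketch of how a no-change forecast and a first-order autoregressive forecast might be scored against each other. It does not reproduce the proximity variable method, which these posts do not spell out. The series is synthetic, and the function names and the split point are assumptions made for illustration.

```python
def fit_ar1(series):
    """Ordinary least squares fit of x_t = c + phi * x_{t-1}."""
    x_prev, x_next = series[:-1], series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((p - mean_prev) * (q - mean_next) for p, q in zip(x_prev, x_next))
    var = sum((p - mean_prev) ** 2 for p in x_prev)
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi


def mape(forecasts, actuals):
    """Mean absolute percent error, expressed in percent."""
    total = sum(abs(f - a) / abs(a) for f, a in zip(forecasts, actuals))
    return 100.0 * total / len(actuals)


# Synthetic period highs, for illustration only.
highs = [100.0, 101.5, 101.0, 102.2, 103.0, 102.4,
         103.8, 104.1, 103.5, 105.0, 105.6, 104.9]

split = 8                        # fit on the first 8 periods, test on the rest
c, phi = fit_ar1(highs[:split])
actuals = highs[split:]
lagged = highs[split - 1 : -1]   # previous-period high for each test period

no_change = lagged                         # forecast = last observed high
ar1 = [c + phi * x for x in lagged]        # one-step-ahead AR(1) forecast

print(f"no-change MAPE: {mape(no_change, actuals):.2f}%")
print(f"AR(1) MAPE:     {mape(ar1, actuals):.2f}%")
```

Any candidate method would be run over the same test periods and its MAPE compared with these two numbers.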

Why not just do this month-by-month?

Well, months have varying numbers of trading days, and I have found I can boost accuracy by stabilizing the number of trading days considered in the algorithm.


Realize, of course, that a prediction of the high price that a stock or ETF will reach in a coming period does not tell you when the high will be reached – so it does not immediately translate to trading profits. The high in question could come with the opening price of the period, for example, leaving you out of the money, if you hear there is this big positive prediction of growth and then jump in the market.

However, I do think that market participants react to anticipated increases or decreases in the high or low of a security.

You might explain these results as follows. Traders react to fairly simple metrics predicting the high price which will be reached in the next period – and let this concept be extensible from a day to a month in this discussion. In so reacting, these traders tend to make such predictive models self-fulfilling.

Therefore, daily prices – the opening, the high, the low, and the closing prices – encode a lot more information about trader responses than is commonly recognized in the literature on stock market forecasting.

Of course, increasingly, scholars and experts are chipping away at the “efficient market hypothesis” and showing various ways in which stock market prices are predictable, or embody an element of predictability.

However, combing Google Scholar and other sources, it seems almost no one has taken the path to modeling stock market prices I am developing here. The focus in the literature is on closing prices and daily returns, for example, rather than high and low prices.

I can envision a whole research program organized around this proximity variable approach, and am drawn to taking this on, reporting various results on this blog.

If any readers would like to join with me in this endeavor, or if you know of resources which would be available to support such a project – feel free to contact me via the Comments and indicate, if you wish, whether you want your communication to be private.

Let’s Get Real Here – QQQ Stock Price Forecast for Week of April 13-17

The thing I like about forecasting is that it is operational, rather than merely theoretical. Of course, you are always wrong, but the issue is “how wrong?” How close do the forecasts come to the actuals?

I have been toiling away developing methods to forecast stock market prices. Through an accident of fortune, I have come on an approach which predicts stock prices more accurately than thought possible.

After spending hundreds of hours over several months, I am ready to move beyond “backtesting” to provide forward-looking forecasts of key stocks, stock indexes, and exchange traded funds.

For starters, I’ve been looking at QQQ, the PowerShares QQQ Trust, Series 1.

Invesco describes this exchange traded fund (ETF) as follows:

PowerShares QQQ™, formerly known as “QQQ” or the “NASDAQ-100 Index Tracking Stock®”, is an exchange-traded fund based on the Nasdaq-100 Index®. The Fund will, under most circumstances, consist of all of stocks in the Index. The Index includes 100 of the largest domestic and international nonfinancial companies listed on the Nasdaq Stock Market based on market capitalization. The Fund and the Index are rebalanced quarterly and reconstituted annually.

This means, of course, that QQQ has been tracking some of the most dynamic elements of the US economy, since its inception in 1999.

In any case, here is my forecast, along with tracking information on the performance of my model since late January of this year.


The time of this blog post is the morning of April 13, 2015.

My algorithms indicate that the high for QQQ this week will be around $109 or, more precisely, $108.99.

So this is, in essence, a five day forecast, since this high price can occur in any of the trading days of this week.

The chart above shows backtests for the algorithm for ten weeks. The forecast errors are all less than 0.65% over this history with a mean absolute percent error (MAPE) of 0.34%.

So that’s what I have today, and count on succeeding installments looking back and forward at the beginning of the next several weeks (Monday), insofar as my travel schedule allows this.

Also, my initial comments on this post appear to offer a dig against theory, but that would be unfair, really, since “theory” – at least the theory of new forecasting techniques and procedures – has been very important in my developing these algorithms. I have looked at residuals more or less as a gold miner examines the chat in his pan. I have considered issues related to the underlying distribution of stock prices and stock returns – NOTE TO THE UNINITIATED – STOCK PRICES ARE NOT NORMALLY DISTRIBUTED. There is indeed almost nothing about stocks or stock returns which is related to the normal probability distribution, and I think this has been a huge failing of conventional finance, the Black-Scholes model, and the like.

So theory is important. But you can’t stop there.

This should be interesting. Stay tuned. I will add other securities in coming weeks, and provide updates of QQQ forecasts.

Readers interested in the underlying methods can track back on previous blog posts (for example, Pvar Models for Forecasting Stock Prices or Time-Varying Coefficients and the Risk Environment for Investing).


Blogging gets to be enjoyable, although demanding. It’s a great way to stay in touch, and probably heightens personal mental awareness, if you do it enough.

The “Business Forecasting” focus allows for great breadth, but may come with political constraints.

On this latter point, I assume people have to make a living. Populations cannot just spend all their time in mass rallies, and in political protests – although that really becomes dominant at certain crisis points. We have not reached one of those for a long time in the US, although there have been mobilizations throughout the Mid-East and North Africa recently.

Nate Silver brought forth the “hedgehog and fox” parable in his best seller – The Signal and the Noise. “The fox knows many things, but the hedgehog knows one big thing.”

My view is that business and other forecasting endeavors should be “fox-like” – drawing on many sources, including, but not limited to quantitative modeling.

What I Think Is Happening – Big Picture

Global dynamics often are directly related to business performance, particularly for multinationals.

And global dynamics usually are discussed by regions – Europe, North America, Asia-Pacific, South Asia, the Middle East, South America, Africa.

The big story since around 2000 has been the emergence of the People’s Republic of China as a global player. You really can’t project the global economy without a fairly detailed understanding of what’s going on in China, the home of around 1.5 billion persons (not the official number).

Without delving much into detail, I think it is clear that a multi-centric world is emerging. Growth rates of China and India far surpass those of the United States and certainly of Europe – where many countries, especially those along the southern or outer rim – are mired in high unemployment, deflation, and negative growth since just after the financial crisis of 2008-2009.

The “old core” countries of Western Europe, the United States, Canada, and, really now, Japan are moving into a “post-industrial” world, as manufacturing jobs are outsourced to lower wage areas.

Layered on top of and providing support for out-sourcing, not only of manufacturing but also skilled professional tasks like computer programming, is an increasingly top-heavy edifice of finance.

Clearly, “the West” could not continue its pre-World War II monopoly of science and technology (Japan being in the pack here somewhere). Knowledge had to diffuse globally.

With the GATT (General Agreement on Tariffs and Trade) and the creation of the World Trade Organization (WTO), the volume of trade expanded as tariffs and other barriers were reduced (1980’s, 1990’s, early 2000’s).

In the United States the urban landscape became littered with “Big Box stores” offering shelves full of clothing, electronics, and other stuff delivered to the US in the large shipping containers you see stacked hundreds of feet high at major ports, like San Francisco or Los Angeles.

There is, indeed, a kind of “hollowing out” of the American industrial machine.

Possibly it’s only the US effort to maintain a defense establishment second-to-none and of an order of magnitude larger than anyone else’s that sustains certain industrial activities onshore. And even that is problematical, since the chain of contracting out can be complex and difficult and costly to follow, if you are a US regulator.

I’m a big fan of post-War Japan, in the sense that I strongly endorse the kinds of evaluations and decisions made by the Japanese Ministry of International Trade and Industry (MITI) in the decades following World War II. Of course, a nation whose industries and even standing structures lay in ruins has an opportunity to rebuild from the ground up.

In any case, sticking to a current focus, I see opportunities in the US, if the political will could be found. I refer here to the opportunity for infrastructure investment to replace aging bridges, schools, seaport and airport facilities.

In case you had not noticed, interest rates are almost zero. Issuing bonds to finance infrastructure could not face more favorable terms.

Another option, in my mind – and a hat-tip to the fearsome Walt Rostow for this kind of thinking – is for the US to concentrate its resources into medicine and medical care. Already, about one quarter of all spending in the US goes to health care and related activities. There are leading pharma and biotech companies, and still a highly developed system of biomedical research facilities affiliated with universities and medical schools – although the various “austerities” of recent years are taking their toll.

So, instead of pouring money down a rathole of chasing errant misfits in the deserts of the Middle East, why not redirect resources to amplify the medical industry in the US? Hospitals, after all, draw employees from all socioeconomic groups and all ethnicities. The US and other national populations are aging, and will want and need additional medical care. If the world could turn to the US for leading edge medical treatment, that in itself could be a kind of foreign policy, for those interested in maintaining US international dominance.

Tangential Forces

While writing in this vein, I might as well offer my underlying theory of social and economic change. It is that major change occurs primarily through the impact of tangential forces, things not fully seen or anticipated. Perhaps the only certainty about the future is that there will be surprises.

Quite a few others subscribe to this theory, and the cottage industry in alarming predictions of improbable events – meteor strikes, flipping of the earth’s axis, pandemics – is proof of this.

Really, it is quite amazing how the billions on this planet manage to muddle through.

But I am thinking here of climate change as a tangential force.

And it is also a huge challenge.

But it is a remarkably subtle thing, notwithstanding the on-the-ground reality of droughts, hurricanes, tornados, floods, and so forth.

And it is something smack in the sweet spot of forecasting.

There is no discussion of suitable responses to climate change without reference to forecasts of global temperature and impacts, say, of significant increases in sea level.

But these things take place over many years and, then, boom, a whole change of regime may be triggered – as ice core and other evidence suggests.

Flexibility, Redundancy, Avoidance of Over-Specialization

My brother (by a marriage) is a priest, formerly a tax lawyer. We have begun a dialogue recently where we are looking for some basis for a new politics and new outlook that, really, would take the increasing fragility of some of our complex and highly specialized systems into account – creating some backup systems, places, refuges, if you will.

I think there is a general principle that we need to empower people to be able to help themselves – and I am not talking about eliminating the social safety net. The ruling groups in the United States, powerful interests, and politicians would be well advised to consider how we can create spaces for people “to do their thing.” We need to preserve certain types of environments and opportunities, and have a politics that speaks to this, as well as to how efficiency is going to be maximized by scrapping local control and letting global business from wherever come in and have its way – no interference allowed.

The reason Reid and I think of this as a search for a new politics is that, you know, the counterpoint is that all these impediments to getting the best profits possible just result in lower production levels, meaning then that you have not really done good by trying to preserve land uses or local agriculture, or locally produced manufactures.

I got it from a good source in Beijing some years ago that the Chinese Communist Party believes that full-out growth of production, despite the intense pollution, should be followed for a time, before dealing with that problem directly. If anyone has any doubts about the rationality of limiting profits (as conventionally defined), I suggest they spend some time in China during an intense bout of urban pollution somewhere.

Maybe there are abstract, theoretical tools which could be developed to support a new politics. Why not, for example, quantify value experienced by populations in a more comprehensive way? Why not link achievement of higher value differently measured with direct payments, somehow? I mean the whole system of money is largely an artifact of cyberspace anyway.

Anyway – takeaway thought, create spaces for people to do their thing. Pretty profound 21st Century political concept.

Coming attractions here – more on predicting the stock market (a new approach), summaries of outlooks for the year by major sources (banks, government agencies, leading economists), megatrends, forecasting controversies.

Top picture from FIREBELLY marketing

Links – Data Science

I’ve always thought the idea of “data science” was pretty exciting. But what is it, how should organizations proceed when they want to hire “data scientists,” and what’s the potential here?

Clearly, data science is intimately associated with Big Data. Modern semiconductor and computer technology make possible rich harvests of “bits” and “bytes,” stored in vast server farms. Almost every personal interaction can be monitored, recorded, and stored for some possibly fiendish future use, along with what you might call “demographics.” Who are you? Where do you live? Who are your neighbors and friends? Where do you work? How much money do you make? What are your interests, and what websites do you browse? And so forth.

As Edward Snowden and others point out, there is a dark side. It’s possible, for example, all phone conversations are captured as data flows and stored somewhere in Utah for future analysis by intrepid…yes, that’s right…data scientists.

In any case, the opportunities for using all this data to influence buying decisions, to decide how to proceed in business, to develop systems that “nudge” people to do the right thing (stop smoking, lose weight), and, as I have recently discovered, to do good are vast and growing. And I have not even mentioned the exploding genetics data from DNA arrays and its mobilization to, for example, target cancer treatment.

The growing body of methods and procedures to make sense of this extensive and disparate data is properly called “data science.” It’s the blind man and the elephant problem. You have thousands or millions of rows of cases, perhaps with thousands or even millions of columns representing measurable variables. How do you organize a search to find key patterns which are going to tell your sponsors how to do what they do better?

Hiring a Data Scientist

Companies wanting to “get ahead of the curve” are hiring data scientists – in positions ranging from the illustrious and mysterious Chief Data Scientist to operators in what are now almost data sweatshops.

But how do you hire a data scientist if universities are not granting that degree yet, and may even be short on courses in “data science”?

I found a terrific article – How to Consistently Hire Remarkable Data Scientists.

It cites Drew Conway’s data science Venn Diagram suggesting where data science falls in these intersecting areas of knowledge and expertise.


This article, which I first found in a snappy new compilation, Data Elixir, also highlights methods used by Alan Turing to recruit talent at Bletchley.

In the movie The Imitation Game, Alan Turing’s management skills nearly derail the British counter-intelligence effort to crack the German Enigma encryption machine. By the time he realized he needed help, he’d already alienated the team at Bletchley Park. However, in a moment of brilliance characteristic of the famed computer scientist, Turing developed a radically different way to recruit new team members.

To build out his team, Turing begins his search for new talent by publishing a crossword puzzle in The London Daily Telegraph inviting anyone who could complete the puzzle in less than 12 minutes to apply for a mystery position. Successful candidates were assembled in a room and given a timed test that challenged their mathematical and problem solving skills in a controlled environment. At the end of this test, Turing made offers to two out of around 30 candidates who performed best.

In any case, the recommendation is a six-step process to replace the traditional job interview –


Doing Good With Data Science

Drew Conway, the author of the Venn Diagram shown above, is associated with a new kind of data company called DataKind.

Here’s an entertaining video of Conway, an excellent presenter, discussing Big Data as a movement and as something which can be used for social good.

For additional detail see http://venturebeat.com/2014/08/21/datakinds-benevolent-data-science-projects-arrive-in-5-more-cities/