Category Archives: crowdsourcing analytics

Links – early July 2014

While I dig deeper on the current business outlook and one or two other issues, here are some links for this pre-Fourth of July week.

Predictive Analytics

A bunch of papers about the wisdom of smaller, smarter crowds. I think the most interesting of these (which I can readily access) is Identifying Expertise to Extract the Wisdom of Crowds, which develops a way to improve the group response by eliminating poorly performing individuals from the crowd.

Application of Predictive Analytics in Customer Relationship Management: A Literature Review and Classification From the Proceedings of the Southern Association for Information Systems Conference, Macon, GA, USA, March 21st–22nd, 2014. Some minor problems with the English in the article, but a solid contribution.

US and Global Economy

Nouriel Roubini: There’s ‘schizophrenia’ between what stock and bond markets tell you Stocks tell you one thing, but bond yields suggest another. Currently, Roubini is guardedly optimistic – Eurozone breakup risks are receding, US fiscal policy is in better order, and Japan’s aggressively expansionist fiscal policy keeps deflation at bay. On the other hand, there’s the chance of a hard landing in China, trouble in emerging markets, geopolitical risks (Ukraine), and growing nationalist tendencies in Asia (India). Great list, and worth following the links.

The four stages of Chinese growth Michael Pettis was ahead of the game on debt and China in recent years and is now calling for reduction in Chinese growth to around 3-4 percent annually.

Because of rapidly approaching debt constraints China cannot continue what I characterize as the set of “investment overshooting” economic policies for much longer (my instinct suggests perhaps three or four years at most). Under these policies, any growth above some level – and I would argue that GDP growth of anything above 3-4% implies almost automatically that “investment overshooting” policies are still driving growth, at least to some extent – requires an unsustainable increase in debt. Of course the longer this kind of growth continues, the greater the risk that China reaches debt capacity constraints, in which case the country faces a chaotic economic adjustment.

Politics

Is This the Worst Congress Ever? Barry Ritholtz decries the failure of Congress to lower interest rates on student loans, observing –

As of July 1, interest on new student loans rises to 4.66 percent from 3.86 percent last year, with future rates potentially increasing even more. This comes as interest rates on mortgages and other consumer credit hovered near record lows. For a comparison, the rate on the 10-year Treasury is 2.6 percent. Congress could have imposed lower limits on student-loan rates, but chose not to.

This is but one example out of thousands of an inability to perform the basic duties, which includes helping to educate the next generation of leaders and productive citizens. It goes far beyond partisanship; it is a matter of lack of will, intelligence and ability.

Hear, hear.

Climate Change

Climate news: Arctic seafloor methane release is double previous estimates, and why that matters This is a ticking time bomb. Article has a great graphic (shown below) which contrasts the projections of loss of Arctic sea ice with what actually is happening – underlining that the facts on the ground are outrunning the computer models. Methane has more than an order of magnitude more global warming impact than carbon dioxide, per equivalent mass.

ArcticSeaIce

Dahr Jamail | Former NASA Chief Scientist: “We’re Effectively Taking a Sledgehammer to the Climate System”

I think the sea level rise is the most concerning. Not because it’s the biggest threat, although it is an enormous threat, but because it is the most irrefutable outcome of the ice loss. We can debate about what the loss of sea ice would mean for ocean circulation. We can debate what a warming Arctic means for global and regional climate. But there’s no question what an added meter or two of sea level rise coming from the Greenland ice sheet would mean for coastal regions. It’s very straightforward.

Machine Learning


Computer simulating 13-year-old boy becomes first to pass Turing test A milestone – “Eugene Goostman” fooled more than a third of the Royal Society testers into thinking they were texting with a human being, during a series of five-minute keyboard conversations.

The Milky Way Project: Leveraging Citizen Science and Machine Learning to Detect Interstellar Bubbles Combines Big Data and crowdsourcing.

Measuring the Intelligence of Crowds

Researchers at Microsoft Research in the UK and Cambridge University report some fascinating and potentially useful results on crowdsourcing, based on a study of aggregating questions from a standard IQ test on Amazon’s Mechanical Turk (AMT).

The AMT site provides a place where workers can find problems that requesters have set up for crowdsourcing.

The introductory page to the site looks like this.

AMT

So here’s an interesting way for people to make some money working from home, at their own hours, and yet stay busy. I’d like to look more deeply into this in a future post, but what these Crowd IQ researchers did is divvy up the questions from a widely utilized IQ test on the AMT site. They studied the effects of changing several parameters on their measures of Crowd IQ, but basically found that, with five or more reputable workers in a group, the Crowd IQ was usually higher than that of the individual workers in the group.

The Abstract for their 2012 study Crowd IQ: Measuring the Intelligence of Crowdsourcing Platforms describes the research and findings succinctly:

We measure crowdsourcing performance based on a standard IQ questionnaire, and examine Amazon’s Mechanical Turk (AMT) performance under different conditions. These include variations of the payment amount offered, the way incorrect responses affect workers’ reputations, threshold reputation scores of participating AMT workers, and the number of workers per task. We show that crowds composed of workers of high reputation achieve higher performance than low reputation crowds, and the effect of the amount of payment is non-monotone—both paying too much and too little affects performance. Furthermore, higher performance is achieved when the task is designed such that incorrect responses can decrease workers’ reputation scores. Using majority vote to aggregate multiple responses to the same task can significantly improve performance, which can be further boosted by dynamically allocating workers to tasks in order to break ties.
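The aggregation step the abstract mentions, majority voting with extra workers brought in to break ties, can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the function names, the `request_extra_answer` callback (a stand-in for posting the task to one more worker), and the `max_extra` cap are all assumptions made for the example.

```python
from collections import Counter
from typing import Callable, Hashable, Iterable, List, Optional


def majority_vote(answers: Iterable[Hashable]) -> Optional[Hashable]:
    """Return the most common answer, or None when the top spot is tied."""
    counts = Counter(answers).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # the two leading answers have the same number of votes
    return counts[0][0]


def adaptive_vote(
    initial_answers: List[Hashable],
    request_extra_answer: Callable[[], Hashable],
    max_extra: int = 3,
) -> Optional[Hashable]:
    """Majority vote that asks one more worker at a time while the vote is tied.

    request_extra_answer is a hypothetical callback; in practice it would
    post the same task to one additional worker and return that answer.
    """
    answers = list(initial_answers)
    winner = majority_vote(answers)
    extra_used = 0
    while winner is None and extra_used < max_extra:
        answers.append(request_extra_answer())
        extra_used += 1
        winner = majority_vote(answers)
    return winner


# Four workers split 2-2 between options 5 and 2; two more are consulted.
extra_workers = iter([3, 2])
clear_result = majority_vote([5, 5, 2, 7])
tie_broken = adaptive_vote([5, 2, 5, 2], lambda: next(extra_workers))
print(clear_result, tie_broken)
```

In the tied case the first extra answer (3) does not settle anything, so a second worker is asked, which is the point of allocating workers dynamically rather than fixing the group size up front.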

The IQ test is Raven’s Standard Progressive Matrices (SPM). If you want to take the test, look here.

SPM is a nonverbal, multiple-choice intelligence test based on the theory of general ability. The general setup is as in the following example.

Ravenex

Free riders are an interesting problem in a site like the Mechanical Turk. So, if people get paid by the number of correct answers, some simply select responses at random to maximize the speed at which they can put up answers. Because of this, AMT has a reputation mechanism indicating the expected quality of work of a worker, based on his or her past performance.

This research has real-world implications. For example, increasing the payment for tasks too much actually diminishes the quality of the answers, for a variety of reasons the authors consider.

The “workers” in this AMT-based study did not consult with each other about the answers, but were grouped into teams somehow by the researchers.

Here is a chart showing the increase in crowd IQ with the number of people in the group.

MSFTcurve

Here a HIT refers to a Human Intelligence Task.
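One way to get a feel for why a group can outscore its members is the classic majority-vote calculation below. It is a simplification and not the model used in the study: it assumes every worker is independently right with the same probability `p` (an illustrative parameter), and it treats each answer as simply right or wrong, whereas the test items are multiple choice.

```python
from math import comb


def majority_correct_probability(p: float, n: int) -> float:
    """Probability that more than half of n independent workers are right.

    Assumes each worker is right with the same probability p and that
    n is odd, so a strict majority always exists.
    """
    if n % 2 == 0:
        raise ValueError("use an odd group size so there are no ties")
    need = n // 2 + 1  # smallest number of correct answers that forms a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))


# Illustrative individual accuracy of 70 percent, for a few group sizes.
for n in (1, 3, 5, 7, 9):
    print(n, round(majority_correct_probability(0.7, n), 3))
```

Under these assumptions the group's accuracy climbs with group size whenever individual accuracy is above one half, and falls when it is below, which is one reason screening out random clickers matters.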

Recommendations

First, experiment and monitor the performance. Our results suggest that relatively small changes to the parameters of the task may result in great changes in crowd performance. Changing parameters of the task (e.g. reward, time limits, reputation range) and observing changes in performance may allow you to greatly increase performance. Second, make sure to threaten workers’ reputation by emphasizing that their solutions will be monitored and wrong responses rejected. Obviously, in a real-world setting it may be hard to detect free-riders without using a “gold-set” of test questions to which the requester already knows the correct response. However, designing and communicating HIT rejection conditions can discourage free riding or make it risky and more difficult. For instance, in the case of translation tasks requesters should determine what is not acceptable (e.g. using Google Translate) and may suggest that the response quality would be monitored and solutions of low quality would be rejected. Third, do not over-pay. Although the reward structure obviously depends on the task at hand and the expected amount of effort required to solve it, our results suggest that pricing affects not only the ability to source enough workers to perform the task but also the quality of the obtained results. Higher rewards are likely to encourage a free-riding behavior and may affect the cognitive abilities of workers by increasing psychological pressure. Thus, for long term projects or tasks that are run repeatedly in a production environment, we believe it is worthwhile to experiment with the reward scheme in order to find an optimum reward level. Fourth, aggregate multiple solutions to each HIT, preferably using an adaptive sourcing scheme. Even the simplest aggregation method – majority voting – has a potential to greatly improve the quality of the solution. In the context of more complicated tasks, e.g. translations, requesters may consider a two-stage design in which they first request several solutions, and then use another batch of workers to vote for the best one. Additionally, requesters may consider inspecting the responses provided by individuals that often disagree with the crowd – they might be coveted geniuses or free-riders deserving rejection.
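Two of these recommendations, scoring workers against a “gold-set” and inspecting workers who often disagree with the crowd, lend themselves to a quick sketch. Everything here is hypothetical: the data layout (`responses[worker][task] = answer`), the worker and task names, and the function names are invented for illustration and have nothing to do with the Mechanical Turk API.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable

# Hypothetical layout: responses[worker_id][task_id] = answer from that worker.
Responses = Dict[str, Dict[str, Hashable]]


def gold_accuracy(responses: Responses, gold: Dict[str, Hashable]) -> Dict[str, float]:
    """Share of gold-set tasks each worker answered correctly."""
    scores = {}
    for worker, answers in responses.items():
        graded = [task for task in answers if task in gold]
        if graded:  # workers who saw no gold task get no score
            right = sum(answers[task] == gold[task] for task in graded)
            scores[worker] = right / len(graded)
    return scores


def crowd_answers(responses: Responses) -> Dict[str, Hashable]:
    """Most common answer per task across all workers (first seen wins ties)."""
    votes = defaultdict(Counter)
    for answers in responses.values():
        for task, answer in answers.items():
            votes[task][answer] += 1
    return {task: counter.most_common(1)[0][0] for task, counter in votes.items()}


def disagreement_rate(responses: Responses) -> Dict[str, float]:
    """Fraction of each worker's answers that differ from the crowd's answer."""
    consensus = crowd_answers(responses)
    return {
        worker: sum(answers[t] != consensus[t] for t in answers) / len(answers)
        for worker, answers in responses.items()
        if answers
    }


# Made-up example: w3 gives the same answer to everything.
responses = {
    "w1": {"q1": "A", "q2": "C", "q3": "B"},
    "w2": {"q1": "A", "q2": "C", "q3": "B"},
    "w3": {"q1": "D", "q2": "D", "q3": "D"},
}
gold = {"q1": "A"}  # the requester already knows the answer to q1

print(gold_accuracy(responses, gold))
print(disagreement_rate(responses))
```

A high disagreement rate only marks a worker for inspection; as the authors note, such a worker could be a free rider or could be the one person getting the hard questions right.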

Interesting stuff, and makes you want to try crowdsourcing.

The Evolution of Kaggle

Kaggle is evolving in industry-specific directions, although it still hosts general data and predictive analytics contests.

“We liked to say ‘It’s all about the data,’ but the reality is that you have to understand enough about the domain in order to make a business,” said Anthony Goldbloom, Kaggle’s founder and chief executive. “What a pharmaceutical company thinks a prediction about a chemical’s toxicity is worth is very different from what Clorox thinks shelf space is worth. There is a lot to learn in each area.”

Oil and gas, which for Kaggle means mostly fracking wells in the United States, have well-defined data sets and a clear need to find working wells. While the data used in traditional oil drilling is understood, fracking is a somewhat different process. Variables like how long deep rocks have been cooked in the earth may matter. So does which teams are working the fields, meaning early-stage proprietary knowledge is also in play. That makes it a good field to go into and standardize.

(as reported in http://bits.blogs.nytimes.com/2014/01/01/big-data-shrinks-to-grow/?_r=0)

This December 2013 change of direction pushed out Jeremy Howard, Kaggle’s former Chief Data Scientist, who now says he is,

focusing on building new kinds of software that could better learn about the data it was crunching and offer its human owners insights on any subject.

“A lone wolf data scientist can still apply his knowledge to any industry,” he said. “I’m spending time in areas where I have no industrial knowledge and finding things. I’m going to have to build a company, but first I have to spend time as a lone wolf.”

A year or so ago, the company evolved into a service-provider with the objective of linking companies, top competitors and analytical talent, and the more than 100,000 data scientists who compete on its platform.

So Kaggle now features CUSTOMER SOLUTIONS ahead of COMPETITIONS at the head of its homepage, saying We’re the global leader in solving business challenges through predictive analytics. The homepage also features logos from Facebook, GE, MasterCard, and NASA, as well as a link Compete as a data scientist for fortune, fame and fun ».

But a look at the competitions currently underway highlights the fact that just a few now pay a prize.

Kaggleactivecomps

Presumably, companies looking for answers are now steered into the Kaggle network. The Kaggle Team numbers six analysts with experience in several industries, and the Kaggle Community includes scores of data and predictive analytics whizzes, many “with multiple Kaggle wins.”

Here is a selection of Kaggle Solutions.

KaggleSolutions

This video gives you a good idea of the current focus of the company.

This is a big development in a way, and supports those who point to the need for industry-specific knowledge and experience to do a good job of data analytics.