5th Nov 2011. Kaggle, a platform for predictive data modeling competitions, has raised $11 million in Series A financing led by Index Ventures and Khosla Ventures. SV Angel; Yuri Milner’s Start Fund; Stanford Management Company, which manages and invests Stanford University’s endowment and other financial assets; PayPal founder Max Levchin; Google Chief Economist Hal Varian; and Applied Semantics co-founder and Factual Chief Executive Officer Gil Elbaz also participated in the round.
Founded in Melbourne, Australia, Kaggle recently moved to San Francisco and is currently in a phase of rapid expansion following its fundraising.
The start-up was founded by Anthony Goldbloom, 28, who previously performed macroeconomic modelling for the Reserve Bank of Australia, the Treasury and the ANZ Bank, and is chaired by the former head of the Government 2.0 Taskforce, Dr Nicholas Gruen.
Postscript: in December 2011 Anthony Goldbloom was named one of Forbes’ 30 Under 30 top tech entrepreneurs.
Kaggle says the new funding will be used for hiring (the company currently has just one developer) and for sales and marketing efforts.
Neil Rimer, partner at Index Ventures, will join Kaggle’s board of directors, and PayPal founder Max Levchin has been named chairman of the company.
Kaggle’s platform for predictive modeling competitions helps companies, governments, and researchers identify solutions to some of the world’s hardest data problems by posting them as competitions to a community of more than 17,000 PhD-level data scientists located around the world. It hosts competitions on behalf of companies or organisations which provide a limited set of data and a desired objective, whether this is predicting a future result or improving the method for results that have already been generated. A prize is awarded to the individual or team that most accurately predicts a result.
Past competitions have included developing models to predict how the HIV sequence evolves, forecasting global tourism numbers and competing against analysts from major financial institutions to predict the results of the 2010 soccer World Cup. It is currently running a competition to improve the world ranking system for chess players.
Here’s how it works. Companies and organizations can post large data sets to the platform and ask scientists to solve a problem or answer a question using the data. The thousands of data scientists who participate in Kaggle competitions then develop algorithms to solve these large-scale problems and submit iterations of their algorithms throughout each competition.
Kaggle maintains a real-time leaderboard of each competition’s standings, so competitors are motivated to exceed the current benchmark until the competition closes. Once a competition ends, the sponsoring organization has a solution, and the field’s top entrants take home the competition prize. Thus far, data scientists from all over the world have submitted nearly 47,000 entries to various Kaggle competitions.
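To make the submit-and-rank loop concrete, here is a minimal sketch in Python of how a leaderboard of this kind could work. It is an illustration only, not Kaggle’s actual implementation: the `Leaderboard` class, the team names, the sample numbers, and the choice of root-mean-square error as the scoring metric are all assumptions made for the example.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(actual):
        raise ValueError("prediction and answer lengths differ")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


class Leaderboard:
    """Hypothetical leaderboard: keeps each team's best (lowest) error."""

    def __init__(self, hidden_answers):
        # Answers held back by the sponsor; competitors never see them.
        self.hidden_answers = hidden_answers
        self.best = {}  # team name -> lowest error so far

    def submit(self, team, predictions):
        """Score one submission and update the team's best if it improved."""
        score = rmse(predictions, self.hidden_answers)
        if team not in self.best or score < self.best[team]:
            self.best[team] = score
        return score

    def standings(self):
        """Teams ordered from lowest error (first place) to highest."""
        return sorted(self.best.items(), key=lambda item: item[1])


# Made-up data: two teams, with team_a improving on its second attempt.
board = Leaderboard(hidden_answers=[3.0, 5.0, 2.0])
board.submit("team_a", [2.0, 5.0, 2.0])
board.submit("team_b", [3.0, 5.0, 4.0])
board.submit("team_a", [3.0, 5.0, 2.5])

for rank, (team, score) in enumerate(board.standings(), start=1):
    print(rank, team, round(score, 4))
```

Because only each team’s best score is kept, repeated submissions can move a team up the standings but never down, which is one simple way to reward iteration over the course of a competition.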
The data-prediction competition model was successfully used by United States company Netflix when it offered $1 million to the individual or team that could improve its movie recommendation service. Kaggle’s prize pool is modest by comparison, with hundreds of dollars and a variety of other prizes being handed out for five competitions.
The Kaggle community of data scientists comprises thousands of PhDs from quantitative fields such as computer science, statistics, econometrics, maths and physics. They come from over 100 countries and 200 universities. In addition to the prize money and data, they use Kaggle to meet, network and collaborate with experts from related fields. As Kaggle founder Anthony Goldbloom tells me, “we’re making big data science into a sport.”
Kaggle says the results have led to new data discoveries and breakthroughs across many industries. For example, a competition for NASA, the Royal Astronomical Society, and the European Space Agency identified new ways to map dark matter in the universe, while another competition helped better determine the likelihood that the health of an HIV patient would improve or deteriorate.
Another example comes from insurance company Allstate, which ran a Claim Prediction Challenge to determine which motor vehicles, among a subset of its customers, were more likely to be involved in an accident. Allstate provided two years of data on the cars insured by the company for scientists to analyze.
Kaggle is currently hosting the $3 million Heritage Health Prize, the largest medical prize ever, designed to help reduce billions of dollars in unnecessary hospitalizations.
“Kaggle is working on one of the most exciting opportunities in big data analytics that I’ve seen in the last twenty years,” said Vinod Khosla, founder and partner, Khosla Ventures. “Kaggle’s platform has the potential to change the way we tackle data analysis problems.”