Netflix: Open Collaboration is Recommended
Judging by the leaderboard, the Netflix grand prize is in sight for a select few researchers, including Pragmatic Theory. The target RMSE is 0.8563, and the best entry at the time of writing is 0.8582.
If you are at all interested in machine learning, AI or operations research, you will have heard about the Netflix competition that has been running for almost 3 years. If not, be advised that the online movie-rental company Netflix has been running an open competition with 1 million USD up for grabs for anyone who can invent a collaborative filtering algorithm for movie ratings that beats their in-house algorithm (Cinematch) by 10% by 2011.
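To make the 10% target concrete, here is a minimal Python sketch of how RMSE (root mean squared error) and the relative improvement over a baseline can be computed. The target and best-entry figures come from this post. The Cinematch baseline of 0.9514 is not stated here and is an assumption taken from the published leaderboard, and the function names are purely illustrative.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def improvement_pct(baseline_rmse, entry_rmse):
    """Percentage by which an entry's RMSE improves on the baseline."""
    return 100.0 * (baseline_rmse - entry_rmse) / baseline_rmse


# Assumption: Cinematch baseline RMSE (not given in this post).
CINEMATCH_RMSE = 0.9514
TARGET_RMSE = 0.8563  # grand prize target, from the post
BEST_RMSE = 0.8582    # best entry at the time of writing, from the post

print(round(improvement_pct(CINEMATCH_RMSE, TARGET_RMSE), 2))
print(round(improvement_pct(CINEMATCH_RMSE, BEST_RMSE), 2))
```

Under that baseline assumption, the gap between the best entry and the target is only a fraction of a percentage point, which is why the prize looks so close.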
Because it's in their best interest to assist researchers, Netflix has released a training data set of 100 million ratings stripped of all PII. When the competition started back in 2006, they actually had 103 million ratings, so the 3 million they didn't include in the training set are what is being used to evaluate any submitted recommendation systems. In the world of machine learning the golden rule is: the more data you have, the better!
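The idea of holding back part of the data for evaluation can be sketched in a few lines of Python. This is only an illustration on toy data: the random split, the global-mean baseline and all the names below are assumptions, and the post does not say how Netflix actually chose which ratings to withhold.

```python
import math
import random


def split_holdout(ratings, holdout_fraction, seed=0):
    """Shuffle (user, movie, rating) tuples and split into train and held-out parts."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def global_mean_rmse(train, holdout):
    """Score a trivial baseline: predict the training mean for every held-out rating."""
    mean = sum(r for _, _, r in train) / len(train)
    return math.sqrt(sum((mean - r) ** 2 for _, _, r in holdout) / len(holdout))


# Toy data standing in for the real ratings: (user_id, movie_id, stars from 1 to 5).
rng = random.Random(42)
toy = [(u, m, rng.randint(1, 5)) for u in range(100) for m in range(10)]

# Hold back roughly 3 in every 103 ratings, mirroring the proportions in the post.
train, holdout = split_holdout(toy, holdout_fraction=3 / 103)
print(len(train), len(holdout))
print(global_mean_rmse(train, holdout))
```

Because the held-out ratings are never seen during training, the score on them is a fair estimate of how well a system predicts ratings it has not encountered.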
Netflix will clearly benefit from such an improvement when one is found (which looks to be real soon!), since it directly translates to increased movie rental revenue and lower subscription cancellation rates. Since they charge a fixed monthly subscription fee, they know that users who don't rent enough movies per period will realise that they are not getting value for money and will cancel their subscription. Hence the goal for Netflix is to figure out what the customer likes and have a ready supply of recommendations so the user never runs out of movies they want to see. However, we need to be cognisant of the fact that in the online world, where inventory holding costs are negligible thanks to digital storage, movie retailers like Netflix can carry a significantly greater inventory of movies than traditional bricks-and-mortar rental outlets. Thus the poor user is faced with the paradox of choice, making good recommendations even more important to the business model.
It's been an incredibly shrewd move by Netflix, as it is a cost-effective way for them to harness the resources of the (interested) research community with a fixed budget, a fixed timeframe and a fixed target. So in effect all 3 pillars of the infamous project triangle are fixed! That's a project manager's dream. If they tried to do the R&D in-house, they'd be unlikely to attract a team of individuals that can outperform the "open community", and in all likelihood it would take them longer and cost them more than 1 million USD to advance the science to the level they want. In essence, they are employing the wisdom of crowds to good effect.
What's interesting is the type of folks who have done well in the competition, and the degree of collaboration between participants. As expected, there is a decent smattering of professional research labs and academic mathematics departments near the top of the leaderboard, but there are also lone researchers and participants who aren't recognized experts in the field. One classic example is Gavin Potter, who, going under the guise of "Just a guy in a garage", got massive exposure from an article in Wired magazine for applying more non-mathematical notions in his approach, which did fairly well for a while. And even with 1 million dollars up for grabs, many of the teams entered openly share their approaches and experiences with others. If ever you wanted an example of how collaboration and the "wisdom of crowds" can advance our knowledge then, other than Wikipedia, this is it. You have to think that, for the same reasons, open source software must eventually overtake proprietary software systems, if it hasn't already, provided enough people contribute.
21 Jun 2009 Damien Wintour







