As well as being a comms system and a directory rolled into one, social media has had another role: low-cost search and selection of relevant content. However, as we and others noticed when comparing Last.fm with Pandora, a social network system's ceiling, the point at which it could improve no further, was lower than that of the algorithmic approach. Well, that was 2006 and that was audio; this is 2008, and the next thing that will need to go through the selection mill is video.
I've always been fascinated with video selection, as it is a far more complex task than text and audio selection, but due to the sheer size of the video industry, solving it is a huge prize. In 2005, the state of the art was a 5-movie trick, but the game was upped rapidly. And thus I've been reading (with fascination) this article in the NYT about the efforts Netflix has been making, initially to find selections with its own algorithms, and later by trying to crowdsource the approach. There is a $1 million prize for building a system that is a 10% better predictor of the movies you will like than the Netflix in-house system, Cinematch.
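To make "10% better" concrete: as far as I know, the prize is scored on root-mean-square error (RMSE) between predicted and actual star ratings, so the target is an RMSE 10% lower than Cinematch's. Here is a minimal sketch of that arithmetic. The function names and all the ratings are my own made-up illustrations, not Netflix data or code.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length lists of ratings."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def improvement_pct(baseline_rmse, new_rmse):
    """Percentage by which new_rmse undercuts baseline_rmse."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse

# Hypothetical toy data: true star ratings and two sets of predictions.
actual = [4, 2, 5, 1]
baseline = [3, 3, 4, 2]            # stand-in for the in-house system
candidate = [3.5, 2.5, 4.5, 1.5]   # stand-in for a challenger

baseline_rmse = rmse(baseline, actual)
candidate_rmse = rmse(candidate, actual)
gain = improvement_pct(baseline_rmse, candidate_rmse)
print(f"baseline {baseline_rmse:.2f}, candidate {candidate_rmse:.2f}, gain {gain:.1f}%")
```

On real data the gains are of course nothing like this large; the toy numbers are only chosen so the sums are easy to follow.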
The state of the art right now is based around singular value decomposition approaches:
Singular value decomposition works by uncovering “factors” that Netflix customers like or don’t like. Say, for example, that “Sleepless in Seattle” has been rated by 200,000 Netflix users. In one sense, this is just a huge list of numbers — user No. 452 gave it two stars; No. 985 gave it five stars; and so on. But you could also think of those ratings as individual reactions to various aspects of the movie. “Sleepless in Seattle” is a “chick flick,” a comedy, a star vehicle for Tom Hanks; each customer is reacting to how much — or how little — he or she likes “chick flicks,” comedies and Tom Hanks. Singular value decomposition takes the mass of Netflix data — 17,770 movies, ratings by 480,189 users — and automatically sorts the films. The programmers do not actively tell the computer what to look for; they just run the algorithm until it groups together movies that share qualities with predictive value.
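To get a feel for what that means, here is a minimal sketch of the idea using NumPy on a tiny, fully filled-in ratings table. Everything in it is my assumption for illustration: the ratings are invented, the helper name low_rank_approx is hypothetical, and the real Netflix data is mostly missing entries, so the contest teams cannot simply run a plain SVD like this.

```python
import numpy as np

# Made-up ratings: rows are users, columns are movies, values are 1-5 stars.
# Movies 0 and 1 appeal to one kind of viewer, movies 2 and 3 to another.
ratings = np.array([
    [5, 4, 1, 1],
    [4, 5, 2, 1],
    [1, 1, 5, 4],
    [2, 1, 4, 5],
    [5, 5, 1, 2],
], dtype=float)

def low_rank_approx(matrix, k):
    """Approximate matrix using only its k strongest SVD factors.

    Returns the approximation and the k movie-factor rows.
    """
    mean = matrix.mean()
    # Factor the mean-centred ratings into users x factors x movies.
    U, s, Vt = np.linalg.svd(matrix - mean, full_matrices=False)
    approx = mean + (U[:, :k] * s[:k]) @ Vt[:k, :]
    return approx, Vt[:k, :]

approx, movie_factors = low_rank_approx(ratings, k=1)

# Nobody told the algorithm which movies are alike, yet the single factor
# gives movies 0 and 1 one sign and movies 2 and 3 the opposite sign.
print(np.round(movie_factors, 2))
print(np.round(approx, 1))
```

With just one factor the reconstruction already lands close to the original stars, which is the "groups together movies that share qualities" effect in miniature.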
As you would expect, the hard thing to do is to resolve extreme outliers:
There’s some X factor in human judgment that the current bunch of algorithms isn’t capturing when it comes to movies like “Napoleon Dynamite.” [A movie that people either love or hate, and unpredictably so] And the problem looms large. Bertoni is currently at 8.8 percent; he says that a small group of mainly independent movies represents more than half of the remaining errors in the way of winning the prize. Most teams suspect that continuing to tweak existing algorithms won’t be enough to get to 10 percent. They need another breakthrough — some way to digitally replicate the love/hate dynamic that governs hard-to-pigeonhole indie films.
This is the first time I'd heard of this (and of the $1m prize). Now, we know of some really powerful problem-solving algorithms that do work in uncertain spaces like these in other industries, so it may be interesting to give them a go. Let's see...