Real Time Search is an interesting topic - for most consumer uses it's not really that useful, until something important happens that leaves the search indexers (like Google) totally out of kilter - like, say, the election of a new President.
Quick Lecture Time - most major search engines today use index-based search, a two-stage process. Spiders crawl across the web, collecting web pages and changes thereto, and creating massive indexes which Google etc. store. When you type a search, it's the index that is searched. If a tree falls in your world but the spiders don't see it, as far as Google is concerned it hasn't happened. Real Time Search, on the other hand, goes and looks at the original sources every time a search is initiated.
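To make the difference concrete, here is a toy sketch in Python. Everything in it is made up for illustration: `live_web` is a hypothetical stand-in for the web, and the function names are mine, not anything a real search engine exposes. It only shows why an index built before an event cannot see that event, while a search of the original sources can.

```python
from collections import defaultdict

# Hypothetical stand-in for the live web: URL -> current page text.
live_web = {
    "example.org/a": "election results expected tonight",
    "example.org/b": "recipe for apple pie",
}

def crawl(web):
    """Stage 1: spider a snapshot of the web into an inverted index (word -> URLs)."""
    index = defaultdict(set)
    for url, text in web.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def indexed_search(index, word):
    """Stage 2: answer from the stored index only, never from the live pages."""
    return sorted(index.get(word.lower(), set()))

def real_time_search(web, word):
    """Real time search: look at every original source at query time."""
    word = word.lower()
    return sorted(url for url, text in web.items() if word in text.lower().split())

index = crawl(live_web)

# The tree falls after the spiders have been round.
live_web["example.org/c"] = "new president elected"

stale = indexed_search(index, "president")    # the index has not seen the new page
fresh = real_time_search(live_web, "president")  # the live scan has
```

Note that `real_time_search` touches every source on every query, which is exactly the cost problem with doing this across the whole web.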
The problem with Real Time Search is that searching the whole world wide web in real time is not feasible - the task is too huge. You would have a real time wait of several hours (at least), and it ain't cheap to do it this way. So you have to start narrowing the searched fields, i.e. introduce context into what you are looking for, so the system can knock out unwanted stuff before it even goes near it.
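A rough sketch of that narrowing, again in Python and again with invented data: the source names, the `tags` field and the `fetch` helper are all assumptions for the example. The point is simply that sources outside the chosen context are discarded before anything is fetched.

```python
# Hypothetical sources, each tagged with the contexts it covers.
sources = {
    "politics-blog.example": {"tags": {"politics", "elections"},
                              "text": "new president elected"},
    "cooking-blog.example": {"tags": {"food"},
                             "text": "president of the pie club"},
    "news-wire.example": {"tags": {"politics", "business"},
                          "text": "markets react to president"},
}

fetched = []  # records which sources were actually visited

def fetch(name):
    """Stand-in for the expensive step: going out to the original source."""
    fetched.append(name)
    return sources[name]["text"]

def contextual_search(word, context):
    """Knock out sources outside the context first, then search only the rest."""
    candidates = [name for name, src in sources.items() if context in src["tags"]]
    return sorted(name for name in candidates if word in fetch(name).split())

hits = contextual_search("president", "politics")
```

Here the cooking blog mentions a president too, but it is never visited because it falls outside the "politics" context.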
But even this winnowing can still make the task too daunting, so one also has to look at near real time approaches - one of the most powerful of which is the Publish / Subscribe model (Pub/Sub) - where the search target continually publishes its updates, to which you subscribe (think Twitter). To extend the reach, the publishing entity can be a focussed aggregation / spidering system that is cycling round and updating itself in "good enough" near real time.
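For the curious, a bare-bones Pub/Sub sketch in Python follows. The `Broker` class and its methods are hypothetical, not the API of PubSub or any other service. It assumes subscriptions are single keywords and matches them against the words of each published update.

```python
from collections import defaultdict

class Broker:
    """Toy publish/subscribe hub: subscribers register a keyword and a callback."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # keyword -> list of callbacks

    def subscribe(self, keyword, callback):
        self._subscribers[keyword.lower()].append(callback)

    def publish(self, update):
        """Push an update to every subscriber whose keyword appears in it.

        Returns the number of deliveries made.
        """
        words = set(update.lower().split())
        delivered = 0
        for keyword in words & set(self._subscribers):
            for callback in self._subscribers[keyword]:
                callback(update)
                delivered += 1
        return delivered

inbox = []
broker = Broker()
broker.subscribe("nintendo", inbox.append)

broker.publish("Nintendo announces a new console")  # matches the subscription
broker.publish("Apple pie recipe")                  # matches nothing
```

The subscriber never polls anything. The update arrives when it is published, which is what makes the approach "good enough" near real time.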
Which brings us to the resurrection of PubSub, a real time search play that closed its doors in 2007. As RWW notes:
You can either give it a number of keywords to search for, or subscribe to a topic. The choice of topics is extremely broad (think 'Nintendo,' 'Micronesia,' and 'Hepatitis'). Every time a blog posts an article that either includes one of your keywords or fits into one of the categories that you subscribe to, PubSub will update your search results in real-time.
Because PubSub is part of the Ping-o-Matic blog update system, these updates often happen only seconds after a story went live.
(Ping-o-Matic being an aggregation service of the kind mentioned in my microlecture above)
As RWW notes, the curious thing is that Google has not yet entered this real time game, something that has always surprised us:
Clearly, this is the kind of service that Google, thanks to owning the ubiquitous FeedBurner service, could easily have created itself, but as we pointed out last week, Google seems to have missed the boat on the real-time web.
A few years ago, most of the early-2000s Real Time engines closed down, but the broadband mediaweb is starting to see their re-emergence, methinks.
(Disclosure - at Broadsight we build Real Time search technology for context specific client applications, such as this one)