The public will get its first chance Monday to test a search engine from start-up Powerset that eschews conventional keyword technology and instead is designed to understand the meaning of Web pages.
Powerset's search engine thus holds the promise of fundamentally changing what people expect from search by offering, in theory, a smarter, more efficient experience.
However, Powerset's beta version, while delivering impressive results, has a limited scope and index, leaving unanswered questions about its ability to work its magic at the massive scale of Google's keyword-based search engine.
"We're changing the way information is searched by doing a much deeper analysis of the pages we index," said Scott Prevost, Powerset's product director.
Keyword engines treat pages as word bags, indexing their content without grasping its meaning, he said. Meanwhile, Powerset's engine, applying technology developed in-house as well as licensed from Xerox's PARC subsidiary, creates a semantic representation by parsing each sentence and extracting its meaning. "Meaning is what we index," he said.
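To make the "word bag" idea concrete, here is a minimal sketch of a keyword index. It illustrates the general technique only and is not Powerset's or any vendor's actual code; the function names `tokenize`, `build_index` and `search` and the two sample pages are assumptions made for this example. Because the index records only which words appear on which page, it cannot tell two sentences with opposite meanings apart.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of page ids that contain it (word order is discarded)."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in tokenize(text):
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the ids of pages that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample pages: same words, opposite meanings.
pages = {
    "a": "The dog bit the man.",
    "b": "The man bit the dog.",
}
index = build_index(pages)
print(sorted(search(index, "man bit dog")))  # prints ['a', 'b']
```

Both pages match the query equally well, since who bit whom is lost once the sentences are reduced to bags of words. An engine that parses each sentence, as Prevost describes, would in principle be able to distinguish them.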
In an October interview with IDG News Service, Marissa Mayer, Google's vice president of Search Products & User Experience, acknowledged that the company's search engine should -- and will -- overcome its keyword dependence in time.
"People should be able to ask questions and we should understand their meaning, or they should be able to talk about things at a conceptual level. We see a lot of concept-based questions -- not about what words will appear on the page but more like 'what is this about?' A lot of people will turn to things like the semantic Web as a possible answer to that," she said.
But she added that Google's search engine acts smart thanks to the humongous amount of data it crunches. "With a lot of data, you ultimately see things that seem intelligent even though they're done through brute force," she said. As an example, she cited the query "GM," which the engine interprets as "General Motors," whereas for the query "GM foods" it delivers results for "genetically modified foods." "Because we're processing so much data, we have a lot of context around things like acronyms. Suddenly, the search engine seems smart, like it achieved that semantic understanding, but it hasn't really," she said.
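The kind of brute-force behavior Mayer describes can be illustrated with a toy sketch based on simple co-occurrence counts. This is not Google's method, which the article does not detail; the `OBSERVATIONS` data, the function names and the voting rule are all invented here for illustration. The idea is only that counting which expansion of "GM" tends to appear alongside which words can make a program look as if it understands context.

```python
from collections import Counter, defaultdict

# Hypothetical observations: an expansion of "GM" and the words seen near it.
OBSERVATIONS = [
    ("General Motors", ["cars", "detroit", "trucks"]),
    ("General Motors", ["stock", "cars"]),
    ("General Motors", ["dealers", "trucks"]),
    ("genetically modified", ["foods", "crops"]),
    ("genetically modified", ["foods", "labeling"]),
]


def build_counts(observations):
    """Count how often each expansion occurs overall and next to each context word."""
    totals = Counter()
    by_context = defaultdict(Counter)
    for expansion, context_words in observations:
        totals[expansion] += 1
        for word in context_words:
            by_context[word][expansion] += 1
    return totals, by_context


def expand(context_words, totals, by_context):
    """Pick the expansion most often seen with the given context words.

    With no usable context, fall back to the most frequent expansion overall.
    """
    votes = Counter()
    for word in context_words:
        votes.update(by_context.get(word, Counter()))
    if votes:
        return votes.most_common(1)[0][0]
    return totals.most_common(1)[0][0]


totals, by_context = build_counts(OBSERVATIONS)
print(expand([], totals, by_context))         # prints General Motors
print(expand(["foods"], totals, by_context))  # prints genetically modified
```

Nothing in the sketch models meaning. The apparent intelligence comes entirely from counting, which is the distinction Mayer draws between statistical context and semantic understanding.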
For now, Powerset's index is very limited, consisting only of millions of pages from Wikipedia and Metaweb Technologies' Freebase, a Web-based structured database of information. However, Prevost vows that the index will begin growing within a month of the launch and will eventually rival those of Google, Yahoo and others in size. "Our technology fully scales," he said.
Still, Powerset's search engine is impressive in action and shows promise. Instead of returning the proverbial 10 blue links, Powerset can do more, such as assembling a collection of facts related to the query and summarizing the information it finds. It can also provide direct answers to factual questions.