A startup called Powerset gained a slew of headlines last week when it launched a beta version of its search engine, which, like other offerings, employs natural language processing, allowing users to search sets of information in the form of questions.
But the future of search, particularly within enterprises, will go well beyond processing queries or parsing content. Future search systems will get to know the user - and communities of users - as well as they know the content they crawl, analyse and index, observers say.
"Relevance is in the eye of the beholder - what's relevant for me may not be relevant for you. Consequently, what's needed is a profile of the user (interests, vocabulary, previous searches, job title, etc) and a profile of the content (author, subject, date, who's read it, etc). Great search matches the two up," said Guy Creese, an analyst with Burton Group.
"To do that, these profiles need to be equally sophisticated. Enterprise search vendors for a long time have spent a lot of effort on profiling content, but not profiling users. This will change over time, as systems such as Amazon.com make it clear that knowing a lot about the user makes it easier to find and suggest relevant content."
For example, Creese said, if a user was a network engineer and entered "ATM" as a query, a smart search system could rank results for "asynchronous transfer mode" more highly than "automated teller machine."
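The matching Creese describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any vendor's product works: the `UserProfile` and `Document` classes, the `personalised_score` function and the `boost` weighting are all hypothetical names invented for this example, and the keyword scores are made-up inputs.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class UserProfile:
    # Hypothetical user profile: job title plus a set of interest tags.
    job_title: str
    interests: set[str] = field(default_factory=set)


@dataclass
class Document:
    # Hypothetical content profile: subject tags plus a base score
    # that stands in for ordinary keyword relevance.
    title: str
    subjects: set[str]
    keyword_score: float


def personalised_score(user: UserProfile, doc: Document, boost: float = 0.5) -> float:
    """Raise the keyword score for each subject tag the user shares with the document."""
    overlap = len(user.interests & doc.subjects)
    return doc.keyword_score * (1 + boost * overlap)


def rank(user: UserProfile, docs: list[Document]) -> list[Document]:
    """Return documents ordered from highest to lowest personalised score."""
    return sorted(docs, key=lambda d: personalised_score(user, d), reverse=True)


# Both documents match the query "ATM" equally well on keywords alone.
results = [
    Document("Automated teller machine", {"banking", "finance"}, 1.0),
    Document("Asynchronous transfer mode", {"networking", "telecoms"}, 1.0),
]
engineer = UserProfile("network engineer", {"networking", "telecoms"})

ranked = rank(engineer, results)
print([d.title for d in ranked])
```

For the network engineer, the shared subject tags lift "Asynchronous transfer mode" above "Automated teller machine", even though both score the same on keywords. A user with no recorded interests would simply see the keyword order unchanged, which is why the quality of the user profile matters as much as that of the content profile.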
While many companies have a role to play and products that work, Google is the company to watch in the long term if you want to know where enterprise search is headed, according to analyst Stephen Arnold.
"When you hear the big companies saying, we are doing an enterprise solution and Google isn't a problem, you have to ask yourself, are these guys connected to reality?" he said during a recent speech at the Infonortics Search Engine Meeting in Boston. "Buying into the Wall Street crowd's [contention] that this is an advertising company is crazy."
In the meantime, the search market has fragmented into a few distinct size classes, analysts say: major vendors such as IBM, Oracle and, with its recent acquisition of FAST Search & Transfer, Microsoft; larger independents such as Autonomy; and smaller, specialised vendors.
Arnold recently wrote a nearly 300-page study for Gilbane Group, "Beyond Search," that takes a deep dive into the facets of the enterprise search market. In terms of size, search-focused companies fall into only a handful of categories, but they vary widely in their technological focus. These are among the sub-segments Arnold identified:
* Database-centric systems, such as Teratext and Intelligenx. "Because of this, these systems are adept at handling data management, content repurposing, and generating reports from the content that reside in the system's database," he wrote.
* Companies involved in "deep analysis" of content, which include Attensity and Siderean Software. "The use of multiple processes in iterative cascades point to the direction search and content processing is moving. Simple key word indexing is a Model-T Ford to these vendors' finely tuned machines."
* "Tools" companies like SchemaLogic sell software that helps customers organise and prepare their content to be searched, according to Arnold. "Most licensees of search systems don't know what they don't know," he wrote. "Once you have some experience with behind-the-firewall search, you have a better understanding of the importance of controlling and managing metadata."
There are also "building block," "linguistic processing" and "pattern analysis" vendors, Arnold wrote.
Though a plethora of companies are vying for market share, there may be plenty to go around. Analyst firm Gartner recently predicted search technology will locate and analyse more than 90 per cent of the data in more than half of the Global 2000 by the end of 2012.
Some observers point to Microsoft's purchase of FAST as an indication that the market has reached a sort of tipping point.
Microsoft's plans for FAST are still in their early stages. Initially, its SharePoint collaboration platform will serve as "a centre of gravity," said Jared Spataro, a company spokesman.
He indicated that Microsoft, which has tried but so far failed to buy Yahoo, partly to boost its hand in web search, will embed enterprise search throughout its products: "Search will be everywhere in the future. Every application interface."
"If I were to point to any one thing, it's that search is still a new and emerging market," Spataro said. "The real opportunity for us is that there's more green-field than anything."
It is an apt observation in light of the reality within enterprises today. Companies that agreed to speak about their implementations revealed that while the basic work of indexing content and providing internal users with search results is well under way, it could be years before they tap the capabilities described by Creese and others.
The Honeywell transportation systems division was an early adopter of the Google Search Appliance, which replaced a limited, older search tool, said Jerry Ibrahim, director of IT for emerging technologies and innovation.
The company was drawn to the Google offering because it is appliance-based and installation was "a breeze." It used homegrown tools to integrate with various data sources and applications and now is experimenting with the Google OneBox API (application programming interface) for making those connections, he said.
To explain the company's goals for enterprise search, Ibrahim gave the example of asking a newly hired engineer a specific question about one of the company's products. "You ask someone who's been in Honeywell for 10 years, they'll know. The guy who's been there one month won't know. And he'll spend a week trying to figure it out."
But if Ibrahim were to ask the same worker a general question, such as how many hummingbirds there are in the world, the new hire would likely go to Google and find the answer within minutes.
"That's our journey, to get to that point for our own internal stuff," he said.
Looking forward, the company is thinking about ways to pull in user information and improve results, he said: "We want to start collecting these statistics and putting some more advanced thought and logic and biasing into [the system]."
Edens & Avant, which owns and develops shopping centres on the East Coast, may be a little further along the visionary path set out by observers like Creese.
The company uses the Oracle Secure Enterprise Search product, said Dale Johnston, vice president and chief information officer. The search technology works in concert with the company's portal, which it has "personified," Johnston said.
"We developed this concept that the corporate intranet was actually a co-worker, your best friend at work. The person, if you had a question, you'd look to them to get the answer," he said.
The portal also includes a social-networking component, with employees able to maintain profiles, he said: "We're hoping that we can prioritise the search results based on what people are working on."
However, adoption of the social-networking component has been "very poor," limiting the value of the data available, he said.
"People will use it when they find that it's helping benefit them," he predicted. The company plans to set up automatic triggers that will remind users to update their profiles, according to Johnston.
Yet, even as its search strategy grows in sophistication, the company remains engaged in basic IT trench warfare.
It has roughly 32 data sources and has finished processing seven of them for search, Johnston said. The project started in March 2007, and he expects it will take 36 months to complete all of them.