'Google-mania' ignites search technology

There's no shortage of vendors looking to ease enterprise information retrieval.

The fervor around Google's summertime IPO has turned search technology into a hot commodity. Vendors with tools for searching corporate intranets and Web sites are trying to capitalize on the moment. Microsoft, too, has added to the search mania with its rumblings about forthcoming technology for searching desktop resources.

If there's one thing all the parties can agree on, it's the need to make it easier for users to find stuff.

Half the time, information retrieval technology fails to find what users are looking for, according to Delphi Group. Among 300 respondents to a recent Delphi survey, 60% said it's easier to find work-related information today than it was two years ago. But 68% said it's still a difficult and time-consuming task. On average, respondents spend between two and four hours each day using computers to search for work-related information.

Adding to the complexity is the state of corporate data. Until recently, structured databases controlled much of the information flow, says John Rueter, vice president of marketing at search technology vendor Fast Search & Transfer (FAST). But today databases account for only 20% of the information in an organization, he says. The other 80% is unstructured - including text documents, HTML pages, e-mail and instant messages.

Today's crop of enterprise search vendors aims to address all these disparate sources. Some of the myriad players specializing in search and retrieval include Autonomy, Convera, Endeca, FAST, iPhrase, Mercado and Verity. Database vendors such as IBM and Oracle, along with business application vendors such as SAP, also offer search products.

All this attention signals a second coming of sorts for search technology. There was lots of noise when Excite, Infoseek and Lycos first started peddling tools for searching and navigating Web content. But this time the attention is on organizing and retrieving information scattered among enterprise network sources.

Enterprise search technology has matured in the past few years, says Carl Frappaolo, executive vice president at Delphi Group. Some products incorporate artificial intelligence and natural language processing, so users can express queries in everyday language, for example. Others can automatically generate taxonomies to organize content.

The goal is efficiency. "Whether I'm a manager or a customer or a worker bee, I can easily and definitively determine if there's anything known by this company that's relevant to what I need right now. That's what search technology is really all about - speed to awareness," Frappaolo says.

But experts agree that getting there is no easy feat. To start, companies need to figure out what they need.

It seems obvious, but companies need to clearly define the content users will be going after and determine how much assistance from search engines is required, Frappaolo says. Finding data is not too difficult if users have a good idea of what they're looking for - such as querying a structured data source to identify all clients who spent more than $50,000, or an unstructured source to find all the e-mails containing the word "corruption," he says. Things get sticky when queries involve a range of values, spelling approximations and synonyms - and cross multiple repositories and data types.
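
To make that distinction concrete, here is a minimal Python sketch of the easy cases next to one of the sticky ones. The client records, e-mail snippets and synonym list are invented for illustration, and the standard library's difflib stands in for spelling approximation; none of this reflects how Delphi Group or any vendor named here implements search.

```python
import difflib
import re

# Hypothetical sample data, invented for this example.
CLIENTS = [
    {"name": "Acme", "spend": 72000},
    {"name": "Globex", "spend": 18500},
    {"name": "Initech", "spend": 50000},
]
EMAILS = [
    "Audit found no evidence of corruption in the bid.",
    "Lunch menu for Friday attached.",
    "Concerns raised about bribery and coruption overseas.",
]
# Assumed, hand-built synonym list; real products would use far larger ones.
SYNONYMS = {"corruption": {"corruption", "bribery", "graft"}}


def clients_over(threshold):
    """Easy case 1: a structured query with an exact numeric condition."""
    return [c["name"] for c in CLIENTS if c["spend"] > threshold]


def tokens(text):
    """Lowercase a text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def exact_keyword(word):
    """Easy case 2: e-mails that contain the word exactly as typed."""
    return [i for i, text in enumerate(EMAILS) if word in tokens(text)]


def loose_keyword(word, cutoff=0.85):
    """Sticky case: also accept synonyms and near-miss spellings."""
    targets = SYNONYMS.get(word, {word})
    hits = []
    for i, text in enumerate(EMAILS):
        words = tokens(text)
        if any(difflib.get_close_matches(t, words, n=1, cutoff=cutoff)
               for t in targets):
            hits.append(i)
    return hits


print(clients_over(50000))
print(exact_keyword("corruption"))
print(loose_keyword("corruption"))
```

The exact keyword search misses the third e-mail, which misspells the word and uses a synonym. The looser match catches it, at the cost of a similarity cutoff that someone has to tune.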

In the past, companies tended to deploy search applications in specific departments to retrieve information from single repositories. Now companies are realizing they need a search platform that cuts across an entire company and can search structured and unstructured sources, FAST's Rueter says.

Likewise, corporate mergers and acquisitions are driving search upgrades, Rueter says. "Organizations that are consolidating need to find ways to effectively integrate information, and they're looking at search as a way to integrate all this information quite rapidly," he says.

Gregory Smith, vice president and CIO at the World Wildlife Fund (WWF), recommends trying before you buy. The conservation organization in Washington, D.C., uses Verity's Ultraseek software to power search capabilities across its newly retooled Web site, where traffic increased 63% from 2002 to 2003.

"Our new redesigned site is highly dynamic. So we wanted to capture that dynamic content, and more importantly, have the ability to drill into the detail and the superset of database content to include in searchable content catalogs," Smith says.

Although most search engines can handle dynamic content that's cached, WWF wanted to make sure the search technology it deployed could not only handle a subset of the underlying databases but also delve into the databases themselves, Smith says. He trialed many technologies before settling on Verity's platform.

Beware of Google-mania

Users' elevated expectations also are adding to the challenge of implementing an enterprise search platform. Google did more than just have a healthy IPO; it raised the bar. Now users expect it to be just as easy to query corporate resources as it is to search the Web. "Google did a good job of setting a benchmark. If nothing else, it's heightened user awareness of what search is all about and how easy it should be," Frappaolo says.

Despite the consumer appeal of Google, experts caution companies not to get swept away by brand recognition. Google's dominance in Internet search doesn't automatically secure its place in corporate search.

Distinguishing between intranet and Internet search requirements is critical. Techniques for mining information on the Web aren't always applicable to the corporation. For example, in the Internet search world, document popularity is an important measure of relevancy. That's not always the case in an enterprise intranet, where information is stored in myriad system types with deliberately varying degrees of accessibility. "In an enterprise environment, you need to know who's looking and what they're allowed or not allowed to see," says Andrew Feit, senior vice president of marketing at Verity.
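
A short Python sketch shows what Feit's point can mean in practice: results are filtered by what the searcher may see before they are ranked. The documents, group names and term-count scoring below are assumptions made up for illustration, not a description of Verity's or any other vendor's product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    title: str
    text: str
    allowed_groups: frozenset  # groups permitted to see this document


# Hypothetical intranet content, invented for this example.
DOCS = [
    Doc("Q3 layoffs plan", "restructuring plan budget cuts",
        frozenset({"hr", "exec"})),
    Doc("Travel policy", "travel budget policy for all staff",
        frozenset({"all"})),
    Doc("Board minutes", "board reviewed budget and budget forecast",
        frozenset({"exec"})),
]


def search(query, user_groups):
    """Return titles the user may see, best match first.

    Scoring is a naive count of query terms; the access check runs
    before scoring so restricted documents never reach the results.
    """
    terms = query.lower().split()
    results = []
    for doc in DOCS:
        if not (doc.allowed_groups & user_groups):
            continue  # the searcher is not allowed to see this document
        score = sum(doc.text.lower().count(term) for term in terms)
        if score:
            results.append((score, doc.title))
    ranked = sorted(results, key=lambda r: (-r[0], r[1]))
    return [title for _, title in ranked]


print(search("budget", frozenset({"all"})))
print(search("budget", frozenset({"all", "exec"})))
```

The same query returns one document for a general employee and three for an executive, which is exactly the behavior a popularity-based Web ranking has no reason to provide.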

And searching by keyword isn't always appropriate. AMR Research uses technology from Autonomy to help make its scores of research documents easy to navigate. It's a task made more difficult because AMR's content often contains similar words and phrases - such as "supply-chain management" or "logistics."

In its case, "the same keywords show up in every document we produce," says Scott Lundstrom, CTO at AMR. Instead of relying on keyword matches, the Autonomy software uses pattern-matching techniques to understand the meaning and significance of information, and then determine relevancy.

National Semiconductor has a unique perspective on what works best internally vs. on a Web site. The company uses Google's new enterprise search appliance internally for searches of its intranet, and software from iPhrase for searches of its Web site.

"The Google appliance is very good for massive scope. It's good at crawling an intranet of gargantuan proportions and coming up with very good search results lists. It's got great tonnage. But that's about where it stops," says Phil Gibson, vice president of Web business and salesforce automation at the Santa Clara company.

For its part, iPhrase delivers a strong semantic natural language search engine for handling information in any format, structured and unstructured. "It has a highly tunable semantic dictionary that we customize for our customer base, our industry, our applications, in our own sort of dialect," Gibson says.

Flexibility was a key reason National Semiconductor went with iPhrase for its Web site platform. Visitors to the semiconductor maker's Web site can search from among more than 15,000 devices, each with about 300 parameters. A newcomer might not know the company's naming conventions, but might have very specific electrical characteristics in mind. To let users navigate the way they wish to, National offers up to 10 different search interfaces that let users describe queries in sentences, for example, or type in electrical characteristics, or navigate three-dimensional parts diagrams.
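
Searching by electrical characteristics rather than by part name amounts to filtering a catalog on parameter ranges. The Python sketch below illustrates the idea only; the part names, parameter names and values are hypothetical, and it does not represent iPhrase's software or National Semiconductor's actual catalog.

```python
# Hypothetical parts and parameters, invented for this example.
DEVICES = {
    "PART-A": {"supply_voltage_v": 5.0, "gain_bandwidth_mhz": 1.0, "channels": 2},
    "PART-B": {"supply_voltage_v": 3.3, "gain_bandwidth_mhz": 10.0, "channels": 1},
    "PART-C": {"supply_voltage_v": 3.3, "gain_bandwidth_mhz": 50.0, "channels": 4},
}


def find_devices(**ranges):
    """Return part names whose parameters fall inside every (low, high) range."""
    matches = []
    for name, params in DEVICES.items():
        if all(key in params and low <= params[key] <= high
               for key, (low, high) in ranges.items()):
            matches.append(name)
    return sorted(matches)


# A newcomer who knows the requirements but not the naming conventions:
print(find_devices(supply_voltage_v=(3.0, 3.6), gain_bandwidth_mhz=(5.0, 100.0)))
```

A sentence-style or diagram-based interface would sit on top of the same kind of lookup, translating what the visitor typed or clicked into parameter ranges.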

From the perspective of a Web visitor, all the interfaces look like distinct tools. But the underlying technology platform is all iPhrase - rather, it will be once National Semiconductor finishes the migration from its homegrown search applications to the iPhrase infrastructure.

"We have very different audiences, so we need different search engines that go after the same infrastructure in very different ways," Gibson says. "The iPhrase software is the only thing that's been able to really do that in a highly tunable fashion."
