With the continuing explosion of unstructured Web-based content in the enterprise, a quality search engine is no longer a luxury, but a necessity. Encouraged by reader feedback after our recent Google Search Appliance Clear Choice Test, we tested a similar product, the Thunderstone Search Appliance.
Overall, the Thunderstone Software appliance is a capable, flexible and fast search platform, though at times it is hampered by its lack of polish in the areas of administration and security.
Immediately upon installation, it is clear the Thunderstone appliance does not hide its implementation details well. Packaged in a custom blue case is a fairly stock Red Hat Linux box equipped with the open source Webmin interface for handling system tasks.
To configure the search functionality, we had to use the supplied Web-based interface, which is very rudimentary and simply does not do justice to the power of the search engine behind it. While some users may initially be attracted to what appears to be a simple form-based interface, we found the forms cluttered and confusing, with little or no field grouping, and rife with small annoyances, most notably one-line-high scrolling text areas that do not let you see a field's full contents at once. During testing, we also found that pages occasionally failed to display the requested information.
However, once you get beyond the interface issues, you will see that the system allows for detailed customization of indexing and search results. When building a search index with the Thunderstone appliance, you first indicate the starting URL(s) and the particular file types to include or exclude during the site walk.
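As a rough illustration of what such walk settings amount to, the sketch below filters candidate URLs by host and file extension. It is not Thunderstone's implementation or configuration syntax; the function name `should_walk`, its parameters, and the example host are all hypothetical.

```python
import posixpath
from urllib.parse import urlparse


def should_walk(url, start_hosts, include_exts, exclude_exts):
    """Decide whether a crawler should fetch `url` (hypothetical helper).

    start_hosts  -- hosts taken from the starting URL(s)
    include_exts -- extensions to index, e.g. {".html", ".pdf"}
    exclude_exts -- extensions to skip, e.g. {".zip"}
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    # Stay on the sites named by the starting URLs.
    if parsed.hostname not in start_hosts:
        return False
    ext = posixpath.splitext(parsed.path)[1].lower()
    if ext in exclude_exts:
        return False
    # URLs without an extension (directories, clean URLs) are treated as pages.
    return ext == "" or ext in include_exts


if __name__ == "__main__":
    hosts = {"intranet.example"}
    for candidate in (
        "http://intranet.example/docs/report.pdf",
        "http://intranet.example/files/archive.zip",
        "http://other.example/index.html",
    ):
        print(candidate, should_walk(candidate, hosts, {".html", ".pdf"}, {".zip"}))
```

Checking exclusions before inclusions means an extension listed in both sets is skipped, which is the more conservative choice for an index.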
If you take the time to explore the complete walk settings, you will find many features that may help you handle the special cases you might encounter during a site walk. For example, it is possible to configure the system to remove the contents of certain types of tags or even remove commonly found text in page navigation, headers and footers.
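To make the tag-removal idea concrete, here is a minimal sketch that drops the contents of selected tags before text is indexed. It uses Python's standard `html.parser` and is only an illustration of the general technique; the class `TagStripper`, the function `extract_text`, and the default tag list are assumptions, not the appliance's own settings.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collect page text while ignoring everything inside `skip_tags`."""

    def __init__(self, skip_tags):
        super().__init__(convert_charrefs=True)
        self.skip_tags = set(skip_tags)
        self.depth = 0      # how many skipped tags we are currently inside
        self.parts = []     # text fragments that will be indexed

    def handle_starttag(self, tag, attrs):
        if tag in self.skip_tags:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.skip_tags and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0:
            self.parts.append(data)


def extract_text(html, skip_tags=("script", "style", "nav")):
    """Return whitespace-normalized text with skipped tags removed."""
    parser = TagStripper(skip_tags)
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


if __name__ == "__main__":
    page = "<html><nav>Home | About</nav><p>Quarterly report</p></html>"
    print(extract_text(page))
```

Removing repeated header and footer text that is not wrapped in a distinctive tag would need a different approach, such as comparing text blocks across many pages, which this sketch does not attempt.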
However, you may find indexing sites with form-based logons very difficult, requiring a lot of trial and error if you need anything beyond basic Web authentication.
We were happy to find the Thunderstone crawler (the Texis software the company has offered for years) was able to traverse our test sites fairly easily because it can be configured to execute JavaScript content, including external .js files, or to examine strings within JavaScript for URLs to traverse. While in practice this helped the crawler move around sites, there were situations where it made mistakes with JavaScript content and reported many pages as errors. For example, on one test site that used Google AdSense, the crawler pulled out data that was not a URL to crawl. However, these problems were forgivable given that many crawlers cannot index sites that rely heavily on JavaScript for navigation at all.
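The string-scanning approach can be sketched as follows. This is a simplified heuristic written for illustration, not the method Texis uses; the regular expressions and the function `urls_from_js` are assumptions. It also shows why such a scan can go wrong: any string literal that happens to look like a path gets queued for crawling, whether or not it is a real link.

```python
import re
from urllib.parse import urljoin

# Single- or double-quoted JavaScript string literals (escapes tolerated).
STRING_LITERAL = re.compile(r"""(["'])((?:\\.|(?!\1).)*?)\1""")

# Heuristic: absolute URLs, rooted or relative paths, or names ending in a
# common page extension. Purely a guess at what "looks like" a URL.
LOOKS_LIKE_URL = re.compile(
    r"(?:https?://|\.{0,2}/)\S+|[\w./-]+\.(?:html?|php|aspx?|jsp|pdf)(?:\?\S*)?",
    re.IGNORECASE,
)


def urls_from_js(js_source, base_url):
    """Return absolute URLs guessed from string literals in `js_source`."""
    found = []
    for match in STRING_LITERAL.finditer(js_source):
        literal = match.group(2)
        if LOOKS_LIKE_URL.fullmatch(literal):
            absolute = urljoin(base_url, literal)
            if absolute not in found:   # keep first-seen order, no duplicates
                found.append(absolute)
    return found


if __name__ == "__main__":
    script = 'window.location = "/products/list.html"; var id = "pub-1234";'
    print(urls_from_js(script, "http://example.com/index.html"))
```

A crawler that executes the script instead of scanning it avoids some of these false positives, at the cost of running untrusted code and a slower walk.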