* ht://Dig from the ht://Dig Group

Building effective search features into Web sites is easy when the amount of content is limited, but when you get involved with documentation, lists, or anything else that is voluminous, you need something rather more industrial. Free would be even nicer.

OK, here’s the answer: ht://Dig from the ht://Dig Group (see links below), a powerful medium-scale search engine released under the GNU General Public License.

According to the group: “The ht://Dig system is a complete World Wide Web indexing and searching system for a domain or intranet. This system is not meant to replace the need for powerful Internet-wide search systems like Lycos, Infoseek, Google and AltaVista. Instead it is meant to cover the search needs for a single company, campus, or even a particular sub section of a Web site.”

ht://Dig can span multiple servers as long as they understand HTTP, because the tool builds its index by crawling the sites to be indexed as if it were a Web browser.

To run ht://Dig you’ll need a Unix machine and both a C and a C++ compiler (C++ is needed for ht://Dig itself, while the C compiler is needed to compile some of the GNU libraries).

ht://Dig has been tested on these machines with these compilers:

* Sun Solaris SPARC 2.X (using gcc/g++)
* Sun SunOS 4.1.4 SPARC (using gcc/g++ 2.7.0)
* HP/UX 10.X (using gcc/g++)
* IRIX 5.3 and 6.X (using the SGI C++ compiler)
* Most Linux distributions (using gcc/g++)
* Most BSD platforms, including BSDI and Mac OS X (using gcc/g++)

The tool has also been implemented under IIS on Windows.

Like any search system, ht://Dig requires a lot of disk space.
For example, an index for 13,000 documents with a full word index will require about 150M bytes of storage.

ht://Dig has a long feature list that includes:

* Support for the Robot Exclusion Protocol
* Boolean expression searching
* Configurable search results using HTML templates
* Fuzzy searching
* Multiple search algorithms, including exact, soundex, metaphone, stemming (common word endings), synonyms, accent stripping, and substring and prefix
* Support for searching HTML and text files
* E-mail notification of expired documents
* Indexing and searching of SGML entities like ‘&agrave;’ and of ISO-Latin-1 characters

If you want to see ht://Dig in action, check out the Web site for the National Public Radio station KCRW in Los Angeles (the greatest radio station ever). It makes extensive use of the subsystem, which performs excellently.
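Soundex is one of the fuzzy search algorithms named in ht://Dig's feature list. As a rough illustration of the idea only, here is a minimal Python sketch of the common American Soundex coding, which maps similar-sounding words to the same four-character key. It is not ht://Dig's own code, and ht://Dig's implementation may differ in its details; the function name `soundex` and the table `CODES` are hypothetical names chosen for this example.

```python
# Generic American Soundex sketch (an assumption for illustration,
# not taken from the ht://Dig sources).

# Build the consonant-to-digit table used by Soundex.
CODES = {}
for letters, digit in (
    ("bfpv", "1"),
    ("cgjkqsxz", "2"),
    ("dt", "3"),
    ("l", "4"),
    ("mn", "5"),
    ("r", "6"),
):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the four-character Soundex key for an ASCII word."""
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    key = first.upper()
    prev = CODES.get(first, "")  # the first letter's code also suppresses repeats
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")  # vowels (and y) have no code
        if code and code != prev:
            key += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (key + "000")[:4]  # pad with zeros, truncate to four characters


if __name__ == "__main__":
    for name in ("Robert", "Rupert", "Ashcraft", "Tymczak"):
        print(name, soundex(name))
```

A search engine using this kind of key would store the key for each indexed word and compare it with the key of the query term, so that a query for "Rupert" could also match documents containing "Robert".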