A protocol for clogged intranet arteries
Fresh content is the lifeblood of the World Wide Web, but it also is the cholesterol that clogs the arteries of many intranets.
Every time we download a Web page we've seen before or have an automated spider crawl a familiar Web site, we contribute to the problem. Typically, we retrieve an entire document or a set of Web pages from a particular URL when we only need to fetch the information that has been updated since our last visit.
What results is the downloading of redundant information, which wastes precious bandwidth, disk space and CPU cycles throughout the intranet. Proxy servers and push browsers only partially address this problem. Now Marimba, Netscape, Novell and Sun have proposed a simple, elegant solution and submitted it to the World Wide Web Consortium for standardization.
The proposed Distribution and Replication Protocol (DRP), operating over HTTP connections, would ensure that only new pages or changes to existing pages are downloaded to the requesting browser, application, proxy or Web site. Before accessing the desired pages or documents, the client first would retrieve an index that, when compared with a prior index cached locally, indicates which leaves on the site's data structure tree have been added, removed or changed.
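The index comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each index is modeled as a plain mapping from URL to checksum, and the function name `diff_indices`, the sample URLs and the placeholder checksum strings are hypothetical, not taken from the DRP proposal.

```python
def diff_indices(cached, fresh):
    """Compare two {url: checksum} mappings and report what changed.

    `cached` is the index saved from the last visit; `fresh` is the
    index just retrieved from the server. Both shapes are assumptions
    made for this sketch.
    """
    added = sorted(url for url in fresh if url not in cached)
    removed = sorted(url for url in cached if url not in fresh)
    changed = sorted(
        url for url in fresh
        if url in cached and cached[url] != fresh[url]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical indices: the checksums are placeholders, not real digests.
cached_index = {
    "/index.html": "aaa111",
    "/news.html": "bbb222",
    "/old.html": "ccc333",
}
fresh_index = {
    "/index.html": "aaa111",   # unchanged, so no download is needed
    "/news.html": "ddd444",    # checksum differs, so fetch it again
    "/press.html": "eee555",   # new on the server
}

result = diff_indices(cached_index, fresh_index)
to_download = result["added"] + result["changed"]
```

In this sketch the client would fetch only the two URLs in `to_download` and discard its local copy of `/old.html`; the unchanged page never crosses the network.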
In fact, users rarely would need to download entire indices because DRP allows them to download only incremental index updates, further reducing the network load. Even the most complex incremental downloads from a given site may require no more than a single TCP connection.
The fundamental beauty of DRP is that it can identify updated files efficiently without having to open, convert and catalog the files, generate full-text indices or otherwise muck around in the extensive file-related metadata that operating systems and document management software normally maintain.
DRP-compliant servers would use a hashing algorithm such as RSA Data Security's Message Digest 5 to compute a checksum for each file. The checksum would be based on the string of bits that compose the current version of that file, regardless of whether the file is in HTML, Portable Document Format, Microsoft Word or some other format.
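To make the format independence concrete, here is a minimal Python sketch that computes an MD5 checksum from raw bytes using the standard `hashlib` module. The helper names `md5_checksum` and `md5_of_file` and the sample byte strings are assumptions for illustration; the proposal does not prescribe any particular implementation.

```python
import hashlib


def md5_checksum(data: bytes) -> str:
    """Return the MD5 digest of a byte string as hexadecimal text."""
    return hashlib.md5(data).hexdigest()


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Checksum a file of any format by reading its raw bytes in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Two hypothetical versions of the same document, differing by one word.
version_one = b"<html><body>Quarterly results: up</body></html>"
version_two = b"<html><body>Quarterly results: down</body></html>"

checksum_one = md5_checksum(version_one)
checksum_two = md5_checksum(version_two)
```

Because the function sees only bytes, the same code would serve an HTML page, a PDF or a Word document, and the one-word edit above is enough to yield a different checksum.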
Any modifications to the underlying file would produce a correspondingly different checksum. Indices would consist of Extensible Markup Language-formatted files that associate file URLs with checksums. (You can retrieve the current draft of the DRP proposal from www.w3.org/TR/NOTE-drp.)
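An index of this kind might look something like the output of the following Python sketch. The element and attribute names (`index`, `file`, `url`, `md5`) are hypothetical choices made for illustration and should not be read as the format defined in the DRP draft; the sketch shows only the general idea of pairing URLs with checksums in XML.

```python
import hashlib
import xml.etree.ElementTree as ET


def build_index(files):
    """Build an XML index from a {url: bytes} mapping.

    The <index>/<file> layout is an assumption for this sketch,
    not the schema in the DRP draft.
    """
    root = ET.Element("index")
    for url in sorted(files):
        entry = ET.SubElement(root, "file")
        entry.set("url", url)
        entry.set("md5", hashlib.md5(files[url]).hexdigest())
    return ET.tostring(root, encoding="unicode")


def parse_index(xml_text):
    """Read the sketch's XML index back into a {url: checksum} mapping."""
    root = ET.fromstring(xml_text)
    return {entry.get("url"): entry.get("md5") for entry in root.findall("file")}


# Hypothetical site content keyed by URL.
site_files = {
    "/index.html": b"<html>home</html>",
    "/report.pdf": b"%PDF-1.2 sample bytes",
}

index_xml = build_index(site_files)
index_map = parse_index(index_xml)
```

A client holding an older copy of such an index would need only to compare the two mappings to learn which URLs to fetch.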
Unfortunately, Microsoft has chosen not to support the DRP proposal. Instead, the Redmond, Wash.-based software giant has thrown its weight behind the Internet Engineering Task Force's (IETF) complementary, much more ambitious World Wide Web Distributed Authoring and Versioning (WEBDAV) standards effort and has disparaged DRP as a 'toy solution' to the problem of file replication on the Web. (The current WEBDAV draft specification is available from the IETF.)
Microsoft's abstention from the DRP effort is shortsighted and self-defeating because, as even the company admits, DRP is a subset of the WEBDAV effort, which may take years to slog through the IETF's extensive comment and revision process. We need DRP, which has been market-proven in the context of Marimba's Castanet technology, as soon as Netscape, Microsoft and other intranet software vendors can support it.
Microsoft's refusal to support DRP also is antithetical to the can-do spirit of the Internet community, which propelled TCP/IP to commercial ubiquity back when Open Systems Interconnection proponents were content to duke it out in elite committees. WEBDAV, for all its complexity and sophistication, smells like OSI for the '90s.
It is perhaps a bit too ambitious in its attempts to turn the Web into a sort of distributed file system with its own file authentication, manipulation, locking, check-in/check-out and version-control mechanisms. WEBDAV looks like a tough five-to-10-year sell to a volatile industry in which many vendors will be lucky to survive into the next fiscal year. DRP, on the other hand, is the sort of standard that Web software vendors can implement in their next product upgrade cycle.
The sooner we implement DRP, the sooner we will be able to hold back the deluge of redundant Web downloads that threatens the stability of intranets everywhere.
Intranet managers should throw their united weight behind DRP. If users speak as one, even a vendor as powerful as Microsoft will swing our way.
