For more info:

"The HTTP Distribution and Replication Protocol", World Wide Web Consortium (W3C) Note, www.w3c.org

"A protocol for clogged intranet arteries", Network World, 10/6/97 www.nwfusion.com/forum/

RSA MD5 algorithm, www.rsa.com

NIST SHA algorithm, csrc.nist.gov

"A brief introduction to WEBDAV", www.ics.uci.edu


The World Wide Web Consortium ponders a new standard for a content delivery protocol

By Mark Gibbs
Network World, 4/27/98

Tired of listening to users complain that it takes too long for Web pages to appear and sick of worrying about what impact this might have on intranet use? If so, you'll be happy to hear about the HTTP Distribution and Replication Protocol (DRP) submitted for the World Wide Web Consortium's (W3C) consideration.

DRP, which rides on top of HTTP, is designed to speed Web content retrieval. It does so by supplying the client with an index of the site's contents.

When a browser visits a site, it will obtain and store a copy of that site's DRP index in its cache. Then, for example, when the user returns to the site and clicks on an image link, the browser compares the local index entry for that image against the Web site's current index entry. If the entries match, the browser delivers the local copy of the image to the user. If the entries don't match, the browser will retrieve the updated image from the Web server.
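The cache decision described above comes down to comparing two fingerprints. Here is a minimal sketch in Python; the IndexEntry class and the needs_refetch function are hypothetical names chosen for illustration, not part of DRP itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IndexEntry:
    """One entry of a DRP index (illustrative structure, not from the spec)."""
    path: str  # path relative to the index's base URL
    size: int  # file size in bytes
    id: str    # fingerprint URN, e.g. "urn:md5:..."


def needs_refetch(cached: Optional[IndexEntry], current: IndexEntry) -> bool:
    """Return True when the locally cached copy cannot be reused.

    `cached` is the entry from the locally stored index (None if the
    browser has never seen this file); `current` is the entry from the
    site's current index.
    """
    return cached is None or cached.id != current.id
```

If needs_refetch returns False, the browser can serve the local copy without asking the server for the file at all.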

The DRP index is a hierarchical list containing a fingerprint for each Web site element. Index entries can describe files or virtual content created by server-side scripts or database extracts. The fingerprints for the Web site content files are created using RSA Data Security, Inc.'s Message Digest 5 (MD5) or the National Institute of Standards and Technology's Secure Hash Algorithm (SHA-1) checksum. (Note that I say Web site content files rather than HTML and image files. This is because the DRP index contains the checksums of files, so the file type doesn't matter. DRP can work with HTML pages, images, Java applets and so on.)
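To make the fingerprint idea concrete, here is a small Python sketch that hashes a file's bytes with MD5 and formats the result as a "urn:md5:" identifier. The Base64 encoding is an assumption inferred from the shape of the ids in the sample index, and md5_fingerprint is a hypothetical helper name.

```python
import base64
import hashlib


def md5_fingerprint(data: bytes) -> str:
    """Return an MD5 fingerprint of `data` in the form "urn:md5:<Base64>".

    Assumption: the digest is Base64-encoded, which matches the look of
    the ids in the sample index (22 characters followed by "==").
    """
    digest = hashlib.md5(data).digest()  # 16 raw bytes
    return "urn:md5:" + base64.b64encode(digest).decode("ascii")


# Any bytes work, whatever the file type: HTML, images, applets and so on.
print(md5_fingerprint(b"<html><body>Hello</body></html>"))
```

Because only the bytes are hashed, any change to a file's content produces a different id, while an unchanged file always yields the same one.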

Down to the nitty-gritty

A DRP index's format is based on the Extensible Markup Language (XML) standard recently approved by the W3C. XML is a human-readable system for defining, validating and sharing document formats on the Web. An XML index file for a Web site might look like this:

<?XML VERSION="1.0" RMD="NONE"?>
<index>
  <file path="home.html" size="12345" id="urn:md5:PEFjWBDv/sd9alS9BYuX0w=="/>
  <file path="layer1.js" size="32112" id="urn:md5:W25YCu3toJt3ZsDsHIZmpg=="/>
  <dir path="images">
    <file path="acme.gif" size="4532" id="urn:md5:+hbZN5XfU6QAJB1RFl/KSQ=="/>
    <file path="banner.gif" size="10452" id="urn:md5:tr3X+oN3r9kqvsiyDSSjjg=="/>
  </dir>
  <dir path="java/classes">
    <file path="Scroll.java" size="14323" id="urn:md5:xjBkgWouS6p6FTUMIkx/Zg=="/>
    <file path="gui.jar" size="540321" id="urn:md5:tcUzw0DKut3SiTpmpAsi8g=="/>
  </dir>
</index>

Rather than getting bogged down explaining this file's XML framework, I'll stick to discussing the DRP components.

The tag pair <index> and </index> marks the enclosed content as index data. An entry describing a file looks like this: <file path="home.html" size="12345" id="urn:md5:PEFjWBDv/sd9alS9BYuX0w=="/>. The "path" attribute can specify the explicit file name and path or a fragment thereof, "size" provides the file size in bytes, and "id" holds a Uniform Resource Name (URN) for this file (think of this as an alias), here the MD5 checksum "PEFjWBDv/sd9alS9BYuX0w==". All file paths are relative to the URL from which the DRP index file is retrieved. For example, the above entry might be in the index file http://www.gibbs.com/drpdemo/index.xml.

At present, the DRP index can be stored in any file in any subdirectory on the Web server. Later specifications will deal with defining a name or location for the DRP index.

So, in our example, the path to the index file is considered the base path for all subsequent file references. Thus, the URL referred to in the entry becomes http://www.gibbs.com/drpdemo/home.html. The index file also can explicitly change the base URL. Consider the following from our example:

</dir>
<dir path="java/classes">
...
</dir>
</index>

The <dir> and </dir> tags indicate a change in the base URL. The tag <dir path="java/classes"> specifies that every file path in that section is resolved against the default URL (the context of the index file) with the path "java/classes" appended. That means http://www.gibbs.com/drpdemo/java/classes/ would be the new base.
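The base-URL rules can be sketched in a few lines of Python using the standard library's XML parser and urljoin. This is an illustration under assumptions, not a reference implementation: resolve_entries is a hypothetical helper, the index is a cut-down version of the example, and the old-style <?XML ...?> declaration is left out because a modern XML parser may reject it.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Cut-down version of the example index (declaration omitted on purpose).
INDEX_XML = """<index>
  <file path="home.html" size="12345" id="urn:md5:PEFjWBDv/sd9alS9BYuX0w=="/>
  <dir path="java/classes">
    <file path="gui.jar" size="540321" id="urn:md5:tcUzw0DKut3SiTpmpAsi8g=="/>
  </dir>
</index>"""


def resolve_entries(element, base_url):
    """Yield (absolute_url, size, id) for every <file> beneath `element`."""
    for child in element:
        if child.tag == "file":
            url = urljoin(base_url, child.get("path"))
            yield url, int(child.get("size")), child.get("id")
        elif child.tag == "dir":
            # A <dir> appends its path to the current base. The trailing
            # slash makes urljoin treat the result as a directory.
            new_base = urljoin(base_url, child.get("path").rstrip("/") + "/")
            yield from resolve_entries(child, new_base)


root = ET.fromstring(INDEX_XML)
entries = list(resolve_entries(root, "http://www.gibbs.com/drpdemo/index.xml"))
for url, size, fingerprint in entries:
    print(url, size, fingerprint)
```

Recursing on each <dir> means nested directories simply keep extending the base, which matches the hierarchical layout of the index.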

Frugal and differential

To ensure that even the access to the index is as frugal with bandwidth as possible, DRP supports differential GET requests and differential index retrieval.

When a client that can perform differential GETs (browser modifications are required) determines that a file on the server has changed, it will send a special form of the HTTP GET request. This differential GET supplies the id of content the client has and the id of the content that it wants.

The server computes the differences between the two versions and transfers them using whatever differencing format the client can accept, such as Marimba, Inc.'s Generic Diff Format. The "Accept" HTTP header, which the client supplies as a standard part of Web communications, specifies which differencing formats the client supports.
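As a rough picture of what such a request could look like on the wire, the Python sketch below assembles a differential GET as text. The two id header names ("X-Have-ID" and "X-Want-ID") and the media type "application/x-example-diff" are placeholders invented for this illustration; the real names are defined by the DRP note, and only the use of the Accept header comes from the description above.

```python
def build_differential_get(path, host, have_id, want_id, diff_formats,
                           have_header="X-Have-ID",   # placeholder name
                           want_header="X-Want-ID"):  # placeholder name
    """Return the text of a differential GET request (illustrative only).

    have_id:      fingerprint of the version already in the client's cache
    want_id:      fingerprint of the version listed in the current index
    diff_formats: media types of the differencing formats the client accepts
    """
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"Accept: {', '.join(diff_formats)}",
        f"{have_header}: {have_id}",
        f"{want_header}: {want_id}",
        "",  # blank line that ends the header section
        "",
    ]
    return "\r\n".join(lines)


request = build_differential_get(
    "/drpdemo/home.html",
    "www.gibbs.com",
    have_id="urn:md5:PEFjWBDv/sd9alS9BYuX0w==",
    want_id="urn:md5:W25YCu3toJt3ZsDsHIZmpg==",
    diff_formats=["application/x-example-diff"],  # hypothetical media type
)
print(request)
```

A server that understands the request could answer with just the differences; one that doesn't would presumably fall back to sending the whole file.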

Up to this point, the DRP index files require no server-side processing: the index file content only needs to be updated when files are changed, added or deleted. But differential operations require that the server analyze the difference between versions of the requested index or content file, which makes a DRP implementation more complex.
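To give a feel for the extra server-side work, here is a minimal Python sketch that compares two versions of an index and reports what changed. It assumes each index has already been flattened into a dictionary mapping file path to id; diff_index and the shape of its result are hypothetical, and the sketch says nothing about the actual wire format of a differential index.

```python
def diff_index(old, new):
    """Compare two index versions given as {path: id} dictionaries.

    Returns a dict with sorted lists of added, deleted and changed paths.
    """
    added = sorted(path for path in new if path not in old)
    deleted = sorted(path for path in old if path not in new)
    changed = sorted(path for path in new
                     if path in old and new[path] != old[path])
    return {"added": added, "deleted": deleted, "changed": changed}


old_index = {
    "home.html": "urn:md5:PEFjWBDv/sd9alS9BYuX0w==",
    "images/acme.gif": "urn:md5:+hbZN5XfU6QAJB1RFl/KSQ==",
}
new_index = {
    # home.html now has a different fingerprint, so its content changed.
    "home.html": "urn:md5:W25YCu3toJt3ZsDsHIZmpg==",
    "images/banner.gif": "urn:md5:tr3X+oN3r9kqvsiyDSSjjg==",
}
result = diff_index(old_index, new_index)
print(result)
```

Only the paths in the result would need to be sent to a client that already holds the older index.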

DRP benefits don't stop with clients such as browsers. Caching servers and caching proxies also benefit, because DRP reduces the number of network retrievals needed to keep their cached Web content current.

Proxies that don't support caching but do support DRP will improve performance for whole networks by providing DRP data for commonly accessed content to all clients. DRP-enabled proxies and caching servers will, in effect, provide better response for non-DRP-capable clients by handling DRP for them. But what happens when you have a DRP-enabled client accessing the Web via a non-DRP-enabled proxy or caching server? Well, because DRP is overlaid on HTTP, the DRP index data will be transferred transparently to the client through the non-DRP-enabled system. It looks like just another element transferred using HTTP.

Fly in the protocol

DRP promises to be a useful and practical protocol, but it could be some time before the W3C moves to standardize it. DRP is only in the W3C's Note stage, which effectively means it's in limbo until enough of the organization's members want it elevated to a working draft. Vendors now backing DRP include its creator Marimba, plus Netscape Communications Corp., Sun Microsystems, Inc., Novell, Inc. and At Home Corp.

Interestingly, Microsoft Corp. supports an alternative - the IETF's World Wide Web Authoring and Versioning (WEBDAV) protocol, a far more ambitious and complex undertaking than DRP. WEBDAV is intended to cover the editing of metadata, such as content author and creation date; name space management, such as listing, copying and moving Web pages; multiuser support; and version management. Getting consensus on such a range of technical issues will be difficult, meaning WEBDAV could take years to mature. It won't likely meet short-term needs for a reliable and robust distribution and replication service.

Without doubt, you'd be better served by pushing for DRP's quick development and implementation rather than floundering around waiting for WEBDAV's "jam tomorrow."

