
Script: XML audio primer


Since its debut in 1996, there's been a lot of hype around the Extensible Markup Language's (XML) ability to easily store and share structured data across multiple platforms. In this primer, we'll take a look at the technology, including its history, benefits and drawbacks, and provide examples of how it is being used.

XML's roots actually go back to the early 1980s, when the Standard Generalized Markup Language, or SGML, was introduced. SGML is mainly used to help organize and define large technical manuals and documentation. It uses a rigidly defined and complicated system to help give technical documentation more structure. XML was designed to be more flexible and lightweight than SGML - it can be used to define just about any type of structured data.

The World Wide Web Consortium, the group overseeing the XML standard, defines the language best: XML is a set of rules for designing text formats that let you structure your data. In doing this, you make it easier for a computer to generate, read and interpret that data.

Like HTML, XML uses tags to delimit and define pieces of text and data in a document or file. A key difference, however, is that you define what the tag means. A <title> tag in HTML is always used the same way in all HTML documents, but a similar tag in XML could be used to store the name of a book, a vehicle title number or even somebody's professional status. The specific meaning is determined by the application reading the data, guided by a document type definition (DTD) or an XML schema.

Another key attribute of XML is that the tags do not define how text or data should be displayed. This makes XML potentially valuable for outputting documents to multiple devices or formats, for example, Web browsers, wireless devices and the like. Each is free to use its own specifications to display the information - where a Web browser might start with a giant-sized headline, a wireless phone might display a small headline.
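As a rough illustration of that separation, the sketch below renders one XML document two ways. The <article>, <headline> and <summary> tags and both rendering functions are hypothetical names made up for this example, and the 20-character phone width is an arbitrary assumption.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical document: the tags describe the data, not how it looks.
ARTICLE_XML = """<article>
  <headline>XML primer</headline>
  <summary>A short look at structured data.</summary>
</article>"""


def render_for_browser(xml_text: str) -> str:
    """Render the article as HTML with a large headline."""
    root = ET.fromstring(xml_text)
    headline = escape(root.findtext("headline"))
    summary = escape(root.findtext("summary"))
    return f"<h1>{headline}</h1><p>{summary}</p>"


def render_for_phone(xml_text: str, width: int = 20) -> str:
    """Render only the headline, cut to fit a narrow text screen."""
    root = ET.fromstring(xml_text)
    return root.findtext("headline")[:width]


print(render_for_browser(ARTICLE_XML))
print(render_for_phone(ARTICLE_XML))
```

The same source text feeds both functions; only the presentation code differs per device.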

XML is also very particular about tag construction. Attribute values must be surrounded by quotation marks and tags must always be closed for an XML document to pass muster - or be considered "well formed." Compare this to the looser HTML world, where most browsers will let you get away with considerable sloppiness.
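A quick way to see these rules in action is to hand a parser some good and bad input. This is a minimal sketch using the parser in Python's standard library; the <lead> and <name> tags and the helper name is_well_formed are invented for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


good = '<lead id="1"><name>Pat</name></lead>'
unquoted = '<lead id=1><name>Pat</name></lead>'  # attribute value not quoted
unclosed = '<lead id="1"><name>Pat</lead>'       # <name> is never closed

print(is_well_formed(good))      # True
print(is_well_formed(unquoted))  # False
print(is_well_formed(unclosed))  # False
```

Unlike a forgiving Web browser, the parser rejects the whole document at the first error.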

The tagged XML data is stored in a plain text file, unlike standard databases that use binary and other proprietary formats. The benefit of text is that any type of computer operating system can read a plain text file. On the negative side, the text method can use more disk space and bandwidth when storing and transmitting XML data. However, XML's creators took into consideration that storage is cheap and text can be compressed relatively easily. Transmission is simple as well, using the standard HTTP protocol to exchange XML-based data across a network.
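To get a feel for how well repetitive tagged text compresses, the sketch below gzips a made-up XML file and compares sizes. The record contents and the count of 200 records are arbitrary assumptions, and real-world ratios will vary with the data.

```python
import gzip

# Hypothetical data: 200 copies of one small record, to mimic repetitive tags.
record = "<lead><name>Pat Example</name><company>Example Corp</company></lead>"
xml_text = "<leads>" + record * 200 + "</leads>"

raw = xml_text.encode("utf-8")
compressed = gzip.compress(raw)

print(f"plain text: {len(raw)} bytes")
print(f"gzipped:    {len(compressed)} bytes")

# Decompressing gives back exactly the same bytes.
assert gzip.decompress(compressed) == raw
```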

Creating XML-defined data is relatively easy. For instance, a simple sales lead database would need the tags name, title, company, address, phone and e-mail. A Web form could be used to collect the data and a Perl script to convert the input into an XML format. A parser could then read the collected data and use it to print mailing labels for sending brochures to the prospective client. All of this could happen without human interaction.
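The sales lead pipeline above might look something like the following sketch. It uses Python in place of the Perl script mentioned, a plain dictionary stands in for the Web form input, and the <lead> wrapper tag, the "email" spelling and both function names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Tag names from the sales lead example ("email" stands in for e-mail).
FIELDS = ("name", "title", "company", "address", "phone", "email")


def lead_to_xml(form: dict) -> str:
    """Convert submitted form fields into a <lead> XML record."""
    lead = ET.Element("lead")
    for field in FIELDS:
        ET.SubElement(lead, field).text = form.get(field, "")
    return ET.tostring(lead, encoding="unicode")


def mailing_label(xml_text: str) -> str:
    """Read a <lead> record and lay out the lines of a mailing label."""
    lead = ET.fromstring(xml_text)
    label_tags = ("name", "title", "company", "address")
    return "\n".join(lead.findtext(tag) for tag in label_tags)


form_input = {
    "name": "Pat Example",
    "title": "Buyer",
    "company": "A&B Corp",
    "address": "1 Main St",
    "phone": "555-0100",
    "email": "pat@example.com",
}

xml_record = lead_to_xml(form_input)
print(xml_record)
print(mailing_label(xml_record))
```

The library escapes the ampersand in the company name when writing the record and restores it when reading, so neither step needs special handling.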

But that ease of creation is also one of the language's drawbacks.

For the machine to read the data, a parser needs to be built that knows what each of those tags defines. To a human, it is easy to open the file and understand what is written. But while XML files can be human readable, they're meant for machine-to-machine exchange. It gets even more complex when two companies or Web sites want to share data - each might have its own definition for a given tag or set of data, and the two may describe the same data in different ways.

Company A and Company B may both use XML but may not be able to share information directly. Company A may use the <name> tag to describe the name of a person while Company B could use the same tag to mean the name of a product. The two companies must first agree on how to define the data that will be tagged, so a tag called, say, <name> will mean the same thing at Company A as it does at Company B.

To make data exchange easier between companies in a particular industry, various groups such as the Organization for the Advancement of Structured Information Standards (OASIS) are creating XML Schemas for vertical markets. The schema defines shared vocabularies between the two companies and provides rules for processing data.

One example of an industry-wide specification is ebXML, used in the e-business world. Developed by OASIS and the United Nations' Center for Trade Facilitation and Electronic Business, ebXML is a modular suite of specifications designed to enable companies of any size and in any country to conduct business over the Internet through the exchange of XML-based messages.

Another commerce-related standard is the IXRetail digital receipt schema. IXRetail provides a methodology for obtaining sales transaction information from multiple applications, including point-of-sale, inventory and customer service, and making it available in electronic format. The digital receipt can be used by consumers and suppliers that want to electronically track their in-store purchases and transactions. Eventually, consumers will be able to have receipts beamed to their PDA directly from the point-of-sale terminal. The Association for Retail Technology Standards and ActiveStore, a Microsoft-led retail technology initiative, co-developed the IXRetail schema.

A broader example of an XML-based exchange format is the Resource Description Framework (RDF), used for describing and exchanging metadata electronically. Such data can include library catalogs and directories as well as aggregated news, software and other content across the Internet. For instance, Network World uses RDF files to share news headlines with other Web sites. Using RDF, a Web site can automatically read and publish Network World's headlines without human interaction.
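Here is a sketch of how a site might pull headlines out of such a file. It assumes the feed uses the RSS 1.0 vocabulary layered on RDF, which the text above does not specify; the feed contents and example.com URLs are invented, and the feed is an inline string rather than something fetched over the network.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for RDF and for RSS 1.0 (the assumed headline vocabulary).
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
}

# Hypothetical feed; a real site would download this text over HTTP.
FEED = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <item rdf:about="http://example.com/story1">
    <title>First headline</title>
    <link>http://example.com/story1</link>
  </item>
  <item rdf:about="http://example.com/story2">
    <title>Second headline</title>
    <link>http://example.com/story2</link>
  </item>
</rdf:RDF>"""


def headlines(feed_text: str) -> list:
    """Return (title, link) pairs for every item in the feed."""
    root = ET.fromstring(feed_text)
    return [
        (
            item.findtext("rss:title", namespaces=NS),
            item.findtext("rss:link", namespaces=NS),
        )
        for item in root.findall("rss:item", NS)
    ]


for title, link in headlines(FEED):
    print(f"{title} - {link}")
```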

Slow adoption is another of XML's drawbacks. Not many database vendors are supporting the specification and it can be difficult to migrate existing data stores into an XML-defined format. Also, industry-specific schemas have been slow to mature as companies argue over definitions.

Progress is being made, though. XML is at the heart of many Web services, middleware that connects disparate applications using standard protocols and technology, including two Microsoft-led initiatives: Simple Object Access Protocol (SOAP) and Universal Description, Discovery and Integration (UDDI).

SOAP is used to encode and describe messages sent between applications. SOAP can be used to pass messages between any type of application, but it is mainly used for making remote procedure calls. SOAP can run over any network protocol, but is mainly used with HTTP. Because SOAP is XML-based, it can be used to communicate between two applications running on different platforms and written in different languages.
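To make the remote procedure call idea concrete, this sketch builds the XML for a simple request. It assumes the SOAP 1.1 envelope namespace; the service namespace, the GetQuote method and its symbol parameter are hypothetical, and the message is only printed, not sent.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
SERVICE_NS = "http://example.com/quotes"  # hypothetical service namespace


def build_request(method: str, params: dict) -> str:
    """Wrap a method call and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical call: ask a quote service for one ticker symbol.
message = build_request("GetQuote", {"symbol": "XYZ"})
print(message)
```

In practice the resulting text would typically be sent as the body of an HTTP POST, and the reply would come back as another envelope that any platform with an XML parser can read.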

UDDI is an online yellow pages developed by Microsoft, Ariba and IBM. The UDDI registry promises to make it easier for businesses to provide information about their products and services on the Web as well as locate partners and customers. It provides a standard set of protocols for exchanging the information contained in the registry.

There are many other parts to the XML family that make it more usable. XHTML combines HTML and XML to create a method for creating Web pages that can be viewed on any device. A more rigid version of HTML, XHTML-based pages can be viewed on a high-powered PC as well as a small handheld using the same code.

Extensible Stylesheet Language (XSL) is used to describe how to display an XML document of a given type. XSL is similar to Cascading Style Sheets used for creating uniformly formatted HTML pages. XSL can be used to render a document in print, on the screen or even as audio.

Namespaces in XML are used to define where XML-based information is coming from and to avoid confusion over the source of the data. If two companies are publishing similar information, the Namespaces convention can delineate what information belongs to which company.
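Namespaces offer one way around the <name> collision between Company A and Company B described earlier. In this sketch each company's tags are tied to its own namespace URI; the URIs, the a: and b: prefixes and the sample values are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, one per company.
NS = {
    "a": "http://example.com/company-a",
    "b": "http://example.com/company-b",
}

DOC = """<catalog xmlns:a="http://example.com/company-a"
         xmlns:b="http://example.com/company-b">
  <a:name>Pat Example</a:name>
  <b:name>Widget 3000</b:name>
</catalog>"""

root = ET.fromstring(DOC)

# Both tags are called "name", but the namespace tells them apart.
person = root.findtext("a:name", namespaces=NS)
product = root.findtext("b:name", namespaces=NS)

print(f"Company A name (a person):  {person}")
print(f"Company B name (a product): {product}")
```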

For companies creating a new data structure from scratch or looking to more easily share information electronically with partners and customers, XML may be the way to go.


Copyright, 1994-2006 Network World, Inc. All rights reserved.