
Working out the bugs in XML databases

There's a growing belief that XML-based information needs its own database.



As network executives begin to experiment with Web services, they're likely to find that they need a new kind of data store: the XML database.

These software products are designed to efficiently store and manage the growing numbers of XML documents that users are creating, especially in Web interactions with business partners and customers. Advocates cite several advantages of XML databases compared with traditional databases: simplicity, ease of application development, ability to search and query XML documents, and fast document retrieval.

There's no formal, standard definition of an XML database, although the XML:DB Initiative (www.xmldb.org) describes such a database as one that defines a logical model for an XML document (not for the data in the document), and manages documents based on that model. The key point is the database "thinks and acts" based on XML - XML goes in, and XML comes out, even though these products can physically store the documents in an object or relational database or a proprietary storage model, such as indexed files.

The lack of formal definition is just one issue that raises the hackles of critics. They also point to the immaturity of the products and of XML standards; the absence of a standard, reliable query language to match the SQL used in relational databases; and possible data integrity problems.

Relational vendors are also adding better support for XML. For example, Microsoft is developing the Yukon release of SQL Server, and in December Oracle demonstrated to customers a technology called Project XDB. The goal of both projects is to let the databases treat XML documents as a new data type and manage them the way they now manage relational data and objects.

"If I had an Oracle [relational] database, I'd want to really know what's going [on] in the background to handle XML," says Larry Hanson, data architect for the California Board of Equalization (BOE), a tax authority that handles sales and other taxes for the state. "If you store these documents as objects, for example, can you query them, and tag them?" Oracle claims that these actions will be possible with XDB, but how well the technology performs when processing lots of data or very large data sets remains to be seen.

Hanson's point, echoed by others, is that XML data is fundamentally different from relational data.

"XML data are extremely well-suited to hierarchical storage," says Hanson, who is a former database administrator. "In XML databases, an online tax return can be stored in its entirety. In a relational database, each line of the return would have to be a different table [of data in rows and columns]." Trying to "force fit" an XML document into the rigid relational structure can waste storage space and lead to inefficiencies in queries and retrievals.
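
The contrast Hanson describes can be sketched in a few lines of Python. This is only an illustration, not the BOE's actual schema: the element names, the return ID and the amounts are invented, and the standard library's ElementTree stands in for a real XML database. The same filing is first flattened into relational-style rows, then read in place by path.

```python
import xml.etree.ElementTree as ET

# Hypothetical sales tax return; all element names and values are invented.
RETURN_XML = """
<return id="R-1001" period="2002-Q1">
  <filer><name>Example Hardware</name></filer>
  <lines>
    <line no="1" label="Gross sales">50000.00</line>
    <line no="2" label="Deductions">5000.00</line>
    <line no="3" label="Taxable sales">45000.00</line>
  </lines>
</return>
"""

def shred(xml_text):
    """Flatten the return into (return_id, line_no, label, amount) rows."""
    root = ET.fromstring(xml_text)
    return [
        (root.get("id"), int(line.get("no")), line.get("label"), float(line.text))
        for line in root.iterfind("./lines/line")
    ]

def line_amount(xml_text, line_no):
    """Read one line from the intact document by path, without flattening it."""
    root = ET.fromstring(xml_text)
    node = root.find(f"./lines/line[@no='{line_no}']")
    return float(node.text) if node is not None else None

rows = shred(RETURN_XML)               # relational-style rows
taxable = line_amount(RETURN_XML, 3)   # hierarchical lookup on the whole document
```

In the flattened form, the filer's name and the filing period have to be carried in yet another table or repeated on every row. In the intact document they simply sit alongside the lines.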

Analysts expect these benefits to fuel a fast-growing market. IDC estimates enterprise spending for XML databases will grow by 130% annually, reaching $700 million in 2004. XML databases will complement relational databases, according to IDC analyst Anthony Picardi - the former being better suited for storing and processing XML documents, the latter for numbers and text.

There are plenty of choices for network executives to evaluate, with at least two dozen native XML database products (see XML Database Products).

The key vendors include Software AG and eXcelon, which stores documents in its ObjectStore object-oriented database. A host of smaller vendors, such as NeoCore, IXIA and XYZFind, are working on XML database products. There are also a number of open source projects. One is Xindice, formerly dbXML Core, which is now being handled by The Apache Software Foundation.

Knowing whether and when to use a native XML database hinges on the kind of data you're dealing with, and what you want to do with it.

Companies are finding that new applications such as Web services, which are built on XML, tend to have data models that don't map well to traditional relational structures, says Philippe Gelinas, CEO of software developer Ixiasoft, which developed the TextML Server for XML documents.

The server is designed as a low-cost product - about $10,000, while some rivals cost about $50,000 - that can work with an array of development tools.

"Often customers try to make these applications work first with an existing [relational] database and find it doesn't work," he says. "Then they shop for an XML database."

Some users, like California's Hanson, are early adopters, already convinced of the importance of XML to the corporation. Two years ago, Hanson began designing an alternative to paper tax returns: filing electronically via a Web site. The tax data eventually had to end up in the mainframe database, the venerable Adabas from Software AG.

But each of the two options for getting it there had drawbacks. With the first option, if XML documents were stored in Adabas as huge binary large objects, as images and sometimes text are stored in relational databases, then the documents became opaque. They could not be searched or queried.
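
A minimal sketch shows what "opaque" means here. It uses Python's built-in SQLite module purely as a stand-in for a database that stores documents as binary large objects; it is not the mainframe product in the article, and the table, column and element names are invented. The database can return the blob by key, but any question about what is inside has to be answered by application code after the whole document is fetched and parsed.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical filing, stored as raw bytes.
doc = b'<return id="R-1001"><line no="3">45000.00</line></return>'

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE filings (id TEXT PRIMARY KEY, body BLOB)")
conn.execute("INSERT INTO filings VALUES (?, ?)", ("R-1001", doc))

# The database can hand the blob back by key, but it has no idea
# that the bytes contain <line> elements.
(body,) = conn.execute(
    "SELECT body FROM filings WHERE id = ?", ("R-1001",)
).fetchone()

# Finding the value of line 3 means parsing the document outside the database.
amount = float(ET.fromstring(body).find("./line[@no='3']").text)
conn.close()
```

A report across thousands of filings would repeat that fetch-and-parse step for every stored document, which is the cost the blob approach carries.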

The California BOE was already doing some work with the second option: The documents are picked apart by a parser program, and the data is sent to the mainframe in a form Adabas can use. But this creates more processing overhead, and changes to the documents, such as adding a new line to the sales tax form, would force administrators to make changes to the underlying database structure.
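
The schema-change problem can be illustrated with a small sketch, again using SQLite as a stand-in and invented names throughout (the `sales_tax` table, the `use_tax` line and the `load` helper are all hypothetical). A parser copies each form line into a fixed column. When the form gains a line the table does not know about, the load fails until an administrator alters the table.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
# One column per form line, as a fixed relational layout might have it.
conn.execute(
    "CREATE TABLE sales_tax "
    "(return_id TEXT PRIMARY KEY, gross_sales REAL, deductions REAL)"
)

def known_columns():
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    return {row[1] for row in conn.execute("PRAGMA table_info(sales_tax)")}

def load(xml_text):
    """Parse a filing and insert it, refusing lines the table has no column for."""
    root = ET.fromstring(xml_text)
    values = {"return_id": root.get("id")}
    values.update({child.tag: float(child.text) for child in root})
    unknown = sorted(set(values) - known_columns())
    if unknown:
        raise ValueError(f"no column for: {', '.join(unknown)}")
    names = list(values)
    marks = ", ".join("?" for _ in names)
    conn.execute(
        f"INSERT INTO sales_tax ({', '.join(names)}) VALUES ({marks})",
        [values[n] for n in names],
    )

old_form = ('<return id="R-1"><gross_sales>50000</gross_sales>'
            '<deductions>5000</deductions></return>')
new_form = ('<return id="R-2"><gross_sales>70000</gross_sales>'
            '<deductions>0</deductions><use_tax>120</use_tax></return>')

load(old_form)
try:
    load(new_form)          # the form gained a line the table lacks
    rejected = False
except ValueError:
    rejected = True

# Accepting the new line means changing the database structure first.
conn.execute("ALTER TABLE sales_tax ADD COLUMN use_tax REAL")
load(new_form)
row_count = conn.execute("SELECT COUNT(*) FROM sales_tax").fetchone()[0]
```

A store that keeps the filing as a whole document would accept the new line without any structural change, since the extra element simply travels with the rest of the document.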

Hanson deployed Software AG's Tamino XML database. The XML documents created by tax filers at the Web sites are stored in Tamino.

The subset of data needed by the mainframe is parsed out.

The entire unmodified sales tax filing, and all of its data, is stored in Tamino, where BOE users, working with a Web browser, have begun querying the data and creating management reports.

"Once people move into XML, they'll run into the same thing we did," Hanson predicts.

"If you're getting the XML document instead of paper, where do you put it? How do you store it, and what are you going to do with it?" he adds. In the long term, his goal is to let users have a combined view of all data, in XML and traditional databases, through a Web browser.

Achieving that goal is not easy, because XML databases have numerous weak points. The user interface for new products may be rough. In the case of the California BOE, data administrators had to write extra code to update both Tamino and the mainframe database. Queries are a challenge because there are several different XML query languages, and these are still in flux. Finally, integration between XML and corporate data stores requires still more coding at this early stage.

Contact Senior Editor John Cox