Open source database improvements grow


New users of open source databases seem to blend the fervor of religious converts with the hardheaded realism of IT professionals.

"I needed an inexpensive database that could handle millions of records and generate [query] results in as short a time as possible," says Rich Allen, voice/data traffic coordinator, at Matanuska Telephone, an independent telco in Alaska.

He replaced flat text files and the Filemaker application with an open source version of MySQL.

"In addition to being free, and robust enough, it is also the most stable application I have ever used," Allen says. "MySQL is running on a dozen different Mac OS X servers and has never failed in the three years I've been using it."

The open source software is taking care of the most critical data for the telco: subscriber inventory for each of 52,000 access lines, billable call record data and traffic logging.

Allen's experience is typical. Open source databases are still often used in specialized niches. But they are important, even vital, niches for a growing number of corporations: Web portals, e-commerce applications, high-speed Web searching, content management and, most recently, data warehouse reporting.

Consider what's happened with these databases:

• Use of MySQL grew more than 30% in 2003, according to a database survey by Evans Data. In the same period, use of Microsoft SQL Server and Access grew just 6%.

• PostgreSQL 7.5, due out around June, will run on Win32 platforms for the first time, offer a passel of performance improvements, partition data more efficiently, and might include support for two-phase commit, which is vital for transaction processing.

• MySQL next month will unveil new software to cluster database servers, so applications keep running if one server fails.

• February saw the release of Version 1.5 of Firebird, which is based on Borland's short-lived public release of the venerable Interbase source code in 2000. A key change is shifting the code to C++ in preparation for an array of enterprise-related improvements being hammered out for Firebird 2.0.

A nice mix

The mix of developers, consultants and some vendors in the communities that create and extend these databases is balancing two goals: adding features that make these open source applications more reliable, and avoiding the panoply of elements that make commercial databases such as Oracle or Microsoft SQL Server complex and demanding.

Increasingly, these databases are being seen as part of a package, or stack, of open source software that can create an application infrastructure for corporations. The initial version of the stack was dubbed LAMP, for the Linux operating system, the Apache Web server, the MySQL database, and either PHP, Python or Perl as the development language. PostgreSQL boosters have been promoting what they call a "brighter LAMP," which is Linux, Apache, middleware (such as Java application servers and messaging) and PostgreSQL. The effort reflects the consensus that PostgreSQL is better suited to large-scale, high-volume applications.

"Smaller companies want a simple [application] solution, with no licensing fees, which they can get up and running quickly," says Fred Moyer, a founder with his partner of, a consultancy specializing in open source database applications based on PostgreSQL. The open source stack lets him do all that, and he can deploy ready-to-use application modules, written in Perl, from sites such as

Moyer is working with a few large companies that are evaluating PostgreSQL as a potential replacement for some of the Oracle databases they currently use. "Not everything they need is there yet [in PostgreSQL]," he says. "But it will be during the next six to 24 months."

The Robert Frances Group, a market research firm, recently completed a study on ROI for Linux deployments in corporations. "We found that application 'owners' are more willing to look [farther] up the stack for open source deployments, to consider application servers and databases," says Chad Robinson, senior business analyst with the firm.

It's easier to treat open source databases as part of a software infrastructure because developers are adding the features needed for that role.

MySQL, the U.S. arm of MySQL AB in Sweden, will release details of new database clustering software next month at its annual user conference. The company just acquired the software from Ericsson, which had started the project to let applications running on its cellular hardware shift from a failed database server to a backup without losing data or crashing.
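The failover behavior described here can be illustrated with a minimal Python sketch. It is not MySQL's clustering software or its API. The class and function names (DatabaseServer, ServerDown, query_with_failover) are hypothetical and exist only to show the idea of an application moving from a failed server to a backup.

```python
class ServerDown(Exception):
    """Raised when a (hypothetical) database server cannot be reached."""


class DatabaseServer:
    """Stand-in for one database server in a cluster; not a real driver."""

    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive

    def query(self, sql):
        # A real client would open a network connection here.
        if not self.alive:
            raise ServerDown(self.name)
        return f"{self.name} handled: {sql}"


def query_with_failover(servers, sql):
    """Try each server in order and return the first successful result."""
    for server in servers:
        try:
            return server.query(sql)
        except ServerDown:
            # This server failed; move on to the next one in the list.
            continue
    raise ServerDown("no servers available")
```

In this sketch the application keeps running as long as at least one server in the list answers. Keeping the data on the backup in step with the primary, which is the hard part of real clustering, is outside its scope.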

The clustering software will be an additional product from the company and, like the MySQL database itself, will be available under either an open source license, the GNU General Public License, or a commercial license.

The new software is part of an effort to make the MySQL database indispensable in critical applications, such as the online airfare searches run by users of Sabre Holdings and Travelocity. "It will cause people to look at MySQL in a whole different light," says Zack Urlocker, vice president of marketing for MySQL.

In the past year, the database has added support for transactions and stored procedures and other enterprise features, all of which have been standard on commercial products for years.

PostgreSQL 7.5 is due out this summer, with the major change being a port to Win32-based operating systems, says Josh Berkus, one of five members of the PostgreSQL Core Team, which acts as project administrator for the development work. Currently, the database can run on Windows operating systems only via an emulator, which limits access to a range of operating system features.

PostgreSQL traces its roots to the Postgres project, the successor to the Ingres database, at the University of California at Berkeley in the mid-1980s.

Other changes in 7.5 will include:

• A new memory management algorithm to boost performance for big databases with lots of user activity.

• Table spaces to simplify storing data in specific disk locations, called partitions, which let you create big databases that still have fast performance.

• Two-phase commit, which controls updates to two or more databases at once during an online transaction.
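For readers unfamiliar with two-phase commit, the general idea can be sketched in a few lines of Python. This is an illustration of the textbook protocol, not PostgreSQL's implementation. The Participant class and the two_phase_commit function are hypothetical names invented for the example.

```python
class Participant:
    """Hypothetical stand-in for one database taking part in a transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote on whether the pending update can be made permanent.
        self.state = "prepared" if self.can_commit else "refused"
        return self.can_commit

    def commit(self):
        # Phase 2, success path: make the update permanent.
        self.state = "committed"

    def rollback(self):
        # Phase 2, failure path: discard the pending update.
        self.state = "rolled_back"


def two_phase_commit(participants):
    """Commit on every participant, or on none of them."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if the vote was unanimous, otherwise undo everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

The point of the two phases is that no database makes its update permanent until every database has confirmed it can do so, which keeps the databases consistent with one another. A production implementation would also need durable logging and recovery after a crash, which this sketch omits.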

Firebird 1.5 shifts the source code from C to C++, along with a big cleanup of the code, memory management improvements and numerous bug fixes. Another big change is a set of enhancements to the SQL query optimizer. Users report queries now run 30% to 60% faster, and in some cases faster still.

Speed, simplicity

Users on the's Firebird site and other Internet sites report they like the compact size of the database, its support for Java, its speed, its simplicity and its straightforward installation on Win32 computers.

The new release is the foundation for what are expected to be substantial innovations in Version 2.0, especially in performance and security. Users are pushing for better support for symmetrical multiprocessor servers and expanded SQL operations.

Many of these changes have long been standard features of the commercially licensed databases. A user's application requirements determine which open source database to use, or even whether to use one at all.

Compiere is an open source ERP/CRM suite, which has stayed in the top 10 list of most downloads for a good part of the past two years at, a Web site for open source development projects. There have been more than 630,000 Compiere downloads, according to Jorge Janke, one of the Compiere project administrators.

But the suite is not wedded to an open source database. The Compiere team ran into some limitations in its first effort to make the software work with PostgreSQL, he wrote in an e-mail. MySQL lacked a feature set the developers deemed necessary. The goal now is to make Compiere "database independent," he wrote.

