Linux delivers for U.S. Postal Service

Linux-based scanning machines speed mail sorting in 250 facilities.

While many businesses are just now turning to Linux as a server platform, the technology has delivered for the U.S. Postal Service for several years.

The Postal Service has used penguin power since 1999 to streamline the "snail mail" process. More than 900 Linux machines currently sort in excess of 670 million pieces of mail per day in the Postal Service's 250 mail-sorting sites around the country.

"Linux has been working well for us for some time now," says Jasbir Sandhu, an electronics engineer at the USPS who oversaw the automation project. "It's very stable, and the cost is excellent."

Computers have sorted mail at the Postal Service since the 1980s, when electronic scanning systems were first installed. Those systems were based on proprietary optical character recognition (OCR) hardware that was controlled by a Digital Equipment VAX system.

This improved efficiency at the Postal Service, Sandhu says, but the systems still only handled about half the mail that came into the Postal Service's facilities. By the mid-1990s, "we needed a system that did a better job than that," he says.

Part of the problem was that the old system was difficult to modify and upgrading it was expensive because the computers used many hard-wired components for running OCR algorithms.

"That's why we went looking for a software-based system instead," Sandhu says. "We looked at some Unix-based systems, but they were too expensive in terms of licensing. As for Windows, there weren't any OCR applications that could be ported to that environment."

The Postal Service put out several bids and chose Pacific Northwest Software as its integration contractor. That's where Linux technology came in.

"The decision to use Linux was fairly straightforward," says John Taves, principal with Pacific Northwest, who was involved with the USPS project.

"A big plus with Linux is you don't have to worry about licensing," Taves says. "When programming with a commercial operating system, you usually can't get the source code, and you can't talk to the developers. With Linux, if there are questions or problems, you just get into a chat room, there's usually someone who will help you work things out."

The low licensing costs and the ability to develop Linux code quickly allowed Pacific Northwest to deliver a system that fit the USPS's needs and budget, Taves says.

"I had never heard of it before," Sandhu says. "But when we were introduced to [Linux], it looked perfect for our application."

Pacific Northwest and Sandhu's programmers did some tests and custom programming to fine-tune the OCR software on Linux. Then Sandhu's group began rolling out the machines to the Postal Service's distribution centers.

At the Postal Service's 250 mail processing sites, outgoing mail is sorted and printed with numeric ZIP codes and bar codes. This allows mail to be automatically scanned and sorted at other sites. Mail at the processing sites is fed onto conveyor belts, which run it under the OCR scanners.

The Linux boxes process the scanned images and convert them into characters, which another machine resolves into a ZIP code. A TCP/IP-based message then goes to a device farther down the line, which prints the code and a bar code onto the letter or package. (All equipment on the processing site floor is on an IP/Ethernet-based LAN.) Pieces that the system can't read are inspected by Postal Service personnel and tagged manually.
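The article does not describe what the TCP/IP message to the printing device looks like. As a rough illustration only, the hand-off might resemble the Python sketch below; the one-line text format and the function names `encode_sort_message` and `send_sort_message` are assumptions made for this example, not details of the Postal Service's system.

```python
import socket


def encode_sort_message(piece_id: int, zip_code: str) -> bytes:
    """Build a one-line text message for the printer (hypothetical format)."""
    valid = zip_code.isascii() and zip_code.isdigit() and len(zip_code) in (5, 9)
    if not valid:
        raise ValueError(f"not a 5- or 9-digit ZIP code: {zip_code!r}")
    return f"PIECE {piece_id} ZIP {zip_code}\n".encode("ascii")


def send_sort_message(host: str, port: int, piece_id: int, zip_code: str) -> None:
    """Open a TCP connection to the printing device and send one message."""
    with socket.create_connection((host, port), timeout=2.0) as conn:
        conn.sendall(encode_sort_message(piece_id, zip_code))
```

A piece that fails the ZIP check raises an error instead of being sent, which loosely mirrors the way unreadable pieces are set aside for manual tagging.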

The Linux OCR machines are actually clusters of four or five networked PCs, with one acting as the master and the other three or four as slaves. The PCs are generic, Intel-based machines that consist of a Pentium processor, RAM, a hard disk and network card - no monitor, mouse or keyboard. There are about 900 deployed throughout the system.
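How the master PC divides work among the others is not spelled out in the article. One simple possibility is round-robin assignment, sketched below in Python; the scheduling policy, the function name `assign_round_robin` and the worker labels are all assumptions for illustration.

```python
from itertools import cycle


def assign_round_robin(image_ids, workers):
    """Return {worker: [image ids]} by handing images out in rotation."""
    if not workers:
        raise ValueError("need at least one worker")
    assignments = {worker: [] for worker in workers}
    # cycle() repeats the worker list endlessly; zip() stops when images run out.
    for image_id, worker in zip(image_ids, cycle(workers)):
        assignments[worker].append(image_id)
    return assignments


# Hypothetical cluster: one master (running this code) and three other PCs.
workers = ["worker-1", "worker-2", "worker-3"]
plan = assign_round_robin(range(7), workers)
```

With seven scanned images and three workers, the first worker gets three images and the others get two each.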

The previous system handled only 50% to 60% of the mail that came into the distribution centers, with a ZIP code error rate of 5% to 6%, according to Sandhu. This slowed mail delivery because more mail had to be hand-sorted and tagged, he says. It also cost the Postal Service more money because more employees were required per site.

The new Linux systems can scan 70% to 80% of the mail received at each processing plant. Each cluster can process around 10 to 12 mail pieces per second, with an error rate of less than 2%.
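To put the read rates in perspective, the short Python calculation below estimates how many pieces are left for hand-sorting under the old and new systems. The batch of 1,000 pieces is an illustrative figure, not one from the article, and the sketch assumes every piece the machines cannot read goes to manual handling.

```python
def hand_sorted(pieces: int, read_pct: int) -> int:
    """Pieces left for manual handling when read_pct percent are machine-read."""
    return pieces * (100 - read_pct) // 100


batch = 1000  # illustrative batch size, not a figure from the article

# Old system read 50% to 60% of the mail; new system reads 70% to 80%.
old_range = (hand_sorted(batch, 60), hand_sorted(batch, 50))
new_range = (hand_sorted(batch, 80), hand_sorted(batch, 70))

print(f"old system: {old_range[0]} to {old_range[1]} pieces hand-sorted")
print(f"new system: {new_range[0]} to {new_range[1]} pieces hand-sorted")
```

Under these assumptions, the old system leaves 400 to 500 of every 1,000 pieces for staff, and the new one leaves 200 to 300.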

Not only are the Linux PCs faster and more accurate than the previous system, but the hardware is easy to replace and the software is simple to maintain, Sandhu says.

"We keep a few spares at all of our sites," he says. "It's all commodity-based hardware that's cheap and easy to get."

Sandhu says each Linux PC cluster cost about $10,000. Comparable OCR scanning and sorting systems he looked at cost as much as $50,000 per unit.

With the Linux-based OCR system in place for almost five years, Sandhu says the Postal Service is following Moore's Law as it gradually upgrades to a second generation of Linux machines.

"The systems we installed in '98 are considered pretty slow now," Sandhu says. With the gigahertz-level machines available today, the Postal Service can replace multiple Linux-based CPUs with one or two new PCs as it upgrades, he says.

Sandhu says that because the Linux OCR systems are open-source-based, it is easier to find and train programmers to maintain and tweak the software.

"It's not a closed system, and there's no special programming language to learn," he says. "We can bring anyone up to speed on it pretty quickly."

Copyright © 2004 IDG Communications, Inc.
