Design e-mail archives beyond legal discovery

Corporations are racing against time to create archives that allow retrieval of e-mails in response to increasingly common civil suit discovery motions.

Losing that race can be costly, as Morgan Stanley learned when it lost $1.44 billion -- most of that as penalties assessed by an annoyed judge -- in the Sunbeam case, after it failed to produce e-mails required by the court.

E-mail is becoming a standard part of civil discovery. Opposing counsel is well aware that e-mail archives are rich veins to mine, and last year unstructured data was specifically added to the discovery list in Rule 26 of the U.S. Federal Rules of Civil Procedure. And the threat isn't confined to Global 2000 corporations. Any organization can face a civil suit, and any civil suit can include the requirement to produce relevant e-mail. CIOs in organizations that are as yet unaware of this legal trend should alert their CEOs to the danger.

For most enterprises today, the challenge is to find the requested e-mail at all. In most organizations the archive is little more than a pile of forgotten tapes originally intended as backup in case of a computer crash. How many IT organizations today can even find the tapes from two years ago? In the Sunbeam case, Morgan Stanley annoyed the court by dribbling in tapes as it found them, sometimes in closets and other forgotten corners in various offices nationwide.

However, experts say, even as enterprises scramble to find those lost tapes and create a real archive that will allow them to produce sets of historic e-mails if required, they should think beyond this purely defensive requirement to active use of the archived data.

E-mail contains valuable data

"Most unstructured data has very little metadata associated with it," said co-founder David Floyer at a public Peer Incite meeting held by the organization last week. "E-mail is an exception. E-mail headers contain a great deal of useful information about who is talking to whom and how organizations really work.

"There is a huge requirement for automated data classification for semantic search," he said. "These are essential for organizations to protect themselves."

For example, he says, the human resources department may need to search the archive for early evidence of emotional or sexual harassment, sales may want to search for indications of contract abuse, and purchasing can search for evidence of employee theft. Executives then can intervene early to avoid problems before they reach the stage of civil or criminal action.
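The header metadata Floyer describes can be illustrated with a minimal Python sketch. It uses only the standard library's e-mail parser to count who writes to whom from the From, To and Cc headers. The sample messages, addresses and the function name `communication_pairs` are invented for illustration, and a real archive would need far more than this (deduplication, mailing-list handling, access controls).

```python
from collections import Counter
from email import message_from_string
from email.utils import getaddresses


def communication_pairs(raw_messages):
    """Count (sender, recipient) pairs using only the message headers."""
    pairs = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        # getaddresses() returns (display name, address) tuples.
        senders = [
            addr.lower()
            for _, addr in getaddresses(msg.get_all("From", []))
            if addr
        ]
        recipients = [
            addr.lower()
            for _, addr in getaddresses(
                msg.get_all("To", []) + msg.get_all("Cc", [])
            )
            if addr
        ]
        for sender in senders:
            for recipient in recipients:
                pairs[(sender, recipient)] += 1
    return pairs


# Hypothetical sample messages, invented for illustration only.
SAMPLE_MESSAGES = [
    "From: Alice <alice@example.com>\n"
    "To: bob@example.com\n"
    "Cc: carol@example.com\n"
    "Subject: Q3 plan\n\nBody text.",
    "From: alice@example.com\n"
    "To: Bob <bob@example.com>\n"
    "Subject: Re: Q3 plan\n\nBody text.",
]

if __name__ == "__main__":
    counts = communication_pairs(SAMPLE_MESSAGES)
    for (sender, recipient), count in counts.most_common():
        print(f"{sender} -> {recipient}: {count}")
```

Even this simple tally never reads the message bodies, which is the point of Floyer's remark: the headers alone already describe a communication pattern.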

Martin Tuip, manager of business development at e-mail archiving house Mimosa Systems and a meeting participant, suggested that analysis could go far beyond the initial concern over legal issues to supply insight into how the organization actually works, who influences whom and what the unofficial lines of communications are.

This could provide insights, for example, into why some projects succeed while others fail. He envisions organizations creating a single repository of unstructured data that can then be mined by business applications as needed for information.

This will require a new kind of archive, said consultant and Chief Content Officer Peter Burris. "People think of archives as inactive. We are proposing that in the future, archives will be active systems. Various kinds of applications will run on top of that archive," he says.

Those applications will evolve quickly, Floyer suggests. He envisions a two-tiered architecture, with a long-lived infrastructure consisting of the archive itself along with its associated hardware and data capture and semantic classification software. The business applications will run on top of that as a separate layer to provide the flexibility to support their rapid evolution.

This, Tuip predicts, will result in a split in the archive administrator's function, which Burris compared to the split in the data management function between the data and database managers. In this case, archive administrators will focus on the underlying archive, while a new function -- which co-founder David Vellante called "records management" and said would be a business-facing rather than an IT technical position -- would evolve to manage the business applications layer.
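The two-tiered architecture Floyer envisions could be sketched, under heavy simplification, along the following lines in Python. Everything here is hypothetical: the class names, the keyword rules standing in for semantic classification, and the `sales_review` application are illustrative assumptions, not anything the meeting participants specified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchivedItem:
    item_id: str
    text: str
    labels: frozenset


class Archive:
    """Infrastructure tier: capture, classification and storage."""

    def __init__(self, classifier):
        self._classifier = classifier
        self._items = []

    def capture(self, item_id, text):
        # Classify once at capture time so applications can query by label.
        item = ArchivedItem(item_id, text, frozenset(self._classifier(text)))
        self._items.append(item)
        return item

    def query(self, label):
        return [item for item in self._items if label in item.labels]


def keyword_classifier(text):
    """Toy stand-in for semantic classification (assumed keyword rules)."""
    rules = {"contract": "sales", "invoice": "purchasing"}
    lowered = text.lower()
    return {label for keyword, label in rules.items() if keyword in lowered}


def sales_review(archive):
    """Application tier: depends only on Archive.query, not on storage."""
    return [item.item_id for item in archive.query("sales")]


if __name__ == "__main__":
    archive = Archive(keyword_classifier)
    archive.capture("m1", "Please review the contract terms.")
    archive.capture("m2", "Invoice attached for last month.")
    print(sales_review(archive))
```

Because `sales_review` touches the archive only through `query`, an application like it can be rewritten or replaced without changing the capture and classification layer, which is the flexibility the separate application tier is meant to provide.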

New opportunities

This will become a complex new area of activity, with important legal requirements, fast-evolving business needs and potential for fast expansion beyond e-mail to other forms of unstructured data ranging from word processing documents and spreadsheets to Web pages, Floyer says. And with that likely expansion will come an explosion of metadata associated with those documents, driven in part by a need to satisfy expanding discovery requirements in civil and criminal actions and in part by such concerns as documenting and improving user experiences on corporate Web sites.

This, Floyer says, will create a major new opportunity for outsourcers. "Many organizations will not want to deal with all this complexity internally," he said. "They will be looking for service providers who can guarantee their methods will meet legal requirements and support business users.

"E-mail is the lifeblood of most organizations today," Floyer said. "Archiving the institutional information contained in it and other forms of unstructured data will provide huge potential value beyond what it will save in potential court-mandated fines for failure to produce mandated documents in a timely manner. As IT organizations race to create their initial archives, they should keep the vision of the ultimate active archive in mind and build consciously toward that goal."

Bert Latamore is a journalist with 10 years' experience in daily newspapers and 25 in the computer industry. He has written for several computer industry and consumer publications. He lives in Linden, Va., with his wife, two parrots and a cat.

This story, "Design e-mail archives beyond legal discovery," was originally published by Computerworld.


Copyright © 2007 IDG Communications, Inc.
