
Protecting data throughout its life cycle

Mar 20, 2006 | 11 mins
Encryption, IT Leadership, Microsoft

Data life-cycle protection is becoming just as important in New Data Center architectures as network security, but challenges abound.

After attackers gained access to personal information on 19,000 students at Carnegie Mellon University last April, and network administrators reviewed data policies, the university implemented new security-management controls around its Oracle databases. But when it comes to protecting data extracted from a database, Joe Jackson, system architect at the Pittsburgh school, still faces many issues.

“Controlling the utilization of unstructured data is incredibly challenging, because once that data’s out of the database, controls don’t work,” he says.

Centralized database security management and auditing is a good first step. But organizations should also protect the safety and integrity of data at other points.

“You’ve got to look at the who, what, when, where and hows of data protection: Who’s using it, what they’re doing with it, when and how are they accessing it, how it’s being used, when it comes back, and how it’s securely stored and archived,” explains Gary Clayton, CEO of Privacy Compliance Group, a data privacy consulting firm in Dallas.

No holistic approach exists for protecting information from cradle to grave, analysts and users say: that is, as it traverses desktops, the database and the network, moves on to remote users and business partners, and then rests in backup and storage. Those enterprises tackling the problem of data life-cycle protection are doing so in ways as unique as the organizations themselves.

Examining data life-cycle protections

One of them is Houston-based Halliburton, which started looking at data life-cycle protections in 2003. In light of publicity around data leakage at Microsoft and other Fortune 500 companies, Halliburton executives began asking how to control the organization’s vast information resources. They questioned how much information the company had, where it resided and what it was being used for.

They quickly realized the task’s complexity. “Data goes far beyond the database, particularly when you’re looking at document and content management. Not only does it fall under management for internal users, but how are you separating controls for documents and files accessed from the Web or being sent in e-mail?” asks Mark Johnson, chief information security officer at Halliburton.

The Halliburton team has since investigated a variety of data-protection tools for e-mail, desktops and storage. These include Symantec’s Enterprise Vault e-mail storage archiving software (available because of the Veritas acquisition) and Microsoft’s Rights Management Services, which encrypts protected information on the desktop in Office applications and Exchange, and on file and print servers.

But Halliburton decided not to implement any of the tools it evaluated. It found the tools were not comprehensive enough, required too much intensive custom development to integrate into its enterprise infrastructure, and called for extensive retraining and education of employees, Johnson says.

Instead, Halliburton took the interim step of commissioning an outside firm to monitor the organization’s networks and identify what information needs to be protected, primarily intellectual property and customer and marketing information. The company then isolated those information sources on heavily secured LAN segments, where it monitors traffic coming into and out of them to detect anomalous user behavior. In addition, it is installing a digital forensics tool, EnCase, to help search for intellectual-property violations on user computers.

“It’s a shot in the dark, and there’s a limit on things you can search without violating employee privacy. But we’re trying to be creative in determining if something’s going outside of the controlled workgroups,” says Erin Buxton, global IT strategist and architect at Halliburton.

The where of data

Like Halliburton, most organizations start with data centralization, database monitoring and network auditing to look for potential use violations, says Chris Liebert, senior Yankee Group analyst. “Data security right now is being looked at from the user perspective – Who has access to what? What equipment are they accessing the data from – and then putting controls around that,” she says. “So vendors are using network technologies, including behavioral analysis, packet inspection, traffic-analysis sensors, heuristics and correlation to watch for unauthorized user behaviors.”

Google it

How search helps out with data life-cycle protection

New enterprise search engines should make it easier to locate and tag information sources requiring life-cycle protection. These include products such as Oracle’s Secure Enterprise Search 10g and IBM’s WebSphere Information Integrator, as well as third-party appliances running Google search engines.

“The power of the search engine is incredible,” says Marcus Sachs, who directs Homeland Security’s cyber security research out of SRI International. “If you’ve got keywords that appear in documents prone to information leakage, Google can find these words appearing on the desktop computers that they shouldn’t be appearing on. It also can show me whether or not those keywords have passed through my e-mail server,” he says.

While Google came under fire in a February Gartner report for Google Desktop 3, which dished up corporate search information to Google’s own servers, Sachs says that’s not a concern with third-party products using Google’s search engines to locate sensitive data. Such products prevent this from happening and provide many additional security and reporting features, he adds.

For example, Secure Elements in April will release a beta version of a Google search appliance, C5 Insight, allowing enterprise managers to ask “what if” questions of their infrastructures and system logs to determine patch levels, application vulnerabilities and other network-related risk information. (Sachs sits on the advisory board for Secure Elements.)

Deb Radcliff

The most important benefit of network and database application monitoring is the understanding an enterprise gains about the network, applications and the at-risk data housed within them, consultants and users say.

“Auditing databases on structured data isn’t new, but the issue is when you have such large environments, how do you reasonably figure out what auditable data you care about without the help of correlation,” Carnegie Mellon’s Jackson adds.

With the right network, application and user information, organizations can develop policies and procedures around encryption, access controls and monitoring rules. Then they use monitoring and correlation to look for signs of user misbehavior (for example, an attempt to download large customer files when a computer has not done that before).
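The kind of rule described above, flagging a computer that suddenly pulls far more data than it ever has before, can be sketched in a few lines of Python. This is only an illustration: the class name, the host label and both thresholds are assumptions made for the example, not the behavior of any product mentioned in this article.

```python
from collections import defaultdict


class TransferBaseline:
    """Remembers the largest transfer seen per host and flags sharp departures."""

    def __init__(self, factor=10.0, min_bytes=1_000_000):
        # Both thresholds are illustrative assumptions, not vendor defaults.
        self.factor = factor          # how many times larger than the baseline counts as unusual
        self.min_bytes = min_bytes    # ignore transfers too small to matter
        self.largest_seen = defaultdict(int)

    def observe(self, host, num_bytes):
        """Return True if the transfer is anomalous for this host, then update the baseline."""
        previous = self.largest_seen[host]
        anomalous = (
            num_bytes >= self.min_bytes
            and num_bytes > previous * self.factor
        )
        self.largest_seen[host] = max(previous, num_bytes)
        return anomalous


baseline = TransferBaseline()
baseline.observe("pc-17", 200_000)               # routine traffic, not flagged
baseline.observe("pc-17", 250_000)               # still routine
alert = baseline.observe("pc-17", 50_000_000)    # far above anything seen before
```

A real correlation engine would weigh many more signals (time of day, user role, destination), but the principle is the same: the alert depends on what is normal for that machine, not on a fixed size limit.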

Dave Giambruno, director of engineering and security for Pitney Bowes in Stamford, Conn., says he has found the Holy Grail in IntuitiveLabs’ OPX correlation engine. Running 120 million correlations daily, OPX processes all reports coming from network devices and security applications, including already correlated reports from Application Security’s AppDetective, the engine Pitney Bowes uses to assess application vulnerability.

“Companies run their security in silos, looking at network, application and data security as separate geographies. You need a combined view of all this network data that’s driving your risk,” Giambruno says. “From an information life-cycle point of view, you get clarity and manage risk in the context of all your communities of interest – the infrastructure and application teams, as well as the business-unit owners themselves, who can then understand the risks to their data and set rules around them.”

The who of data

OPX’s red, yellow and green rating system attesting to the state of protected applications is enough to satisfy regulators, Giambruno says. But regulators are revisiting guidelines to look beyond who has authorized access to data, contends Chris Sharp, vice president of security services for Verizon Business’ consulting services. Soon, he says, regulators will need to know all the places where data is stored, who’s interacting with the data and the means taken to protect that data throughout its life cycle. Other consultants and IT executives agree.

“Data protection is more than securing networks and devices,” Clayton adds. “Data integrity and protecting data from unauthorized use and destruction also is part of the information-protection life cycle.”

This level of data integrity and protection happens primarily at the network layer, in two ways: by monitoring network traffic for behaviors indicative of a data-policy breach (such as large files uploaded to a user computer late at night), or by inspecting outbound packets for protected data types (such as SSNs or CAD drawings).
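The second approach, inspecting outbound content for protected data types, can be illustrated with a simplified Python sketch that looks for Social Security numbers in a text payload. The function names and sample text are assumptions for the example, and commercial tools use far richer detection than a single pattern. The validity check relies on the fact that SSNs are never issued with an area number of 000, 666 or 900-999, a group number of 00, or a serial number of 0000.

```python
import re

# Simplified pattern: three digits, two digits, four digits, separated by hyphens.
SSN_PATTERN = re.compile(r"\b(\d{3})-(\d{2})-(\d{4})\b")


def looks_like_ssn(area, group, serial):
    """Reject number ranges that are never issued as SSNs."""
    if area in ("000", "666") or area.startswith("9"):
        return False
    return group != "00" and serial != "0000"


def find_ssns(payload):
    """Return every SSN-shaped string in the payload that passes the validity check."""
    return [
        match.group(0)
        for match in SSN_PATTERN.finditer(payload)
        if looks_like_ssn(*match.groups())
    ]


outbound = "Employee 123-45-6789, order 000-12-3456, part 987-65-4321"
hits = find_ssns(outbound)   # only the first number passes the validity check
```

Checking the number ranges rather than matching any nine digits is one simple way to avoid flagging order or part numbers that merely resemble a protected data type.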

“Say, all of a sudden, we’re seeing a user downloading 10 megabytes of peer-to-peer files over a two-hour period. We know that’s not normal,” says Herb Tong, network manager for the city of San Francisco.

PacketMotion’s PacketSentry real-time traffic analysis tool enables Tong to determine the file types being transferred and the IP addresses from which the download requests emanated. Once Tong has this information, he contacts the department supervisor regarding the situation.

Instead of watching for anomalous behavior, Perpetual Entertainment, a computer gaming start-up in San Francisco, relies on the packet-inspection method to monitor outbound transmissions for protected intellectual property. It uses Tablus’ Content Alarm software, which looks for data types specified as protected and can automatically issue an alert, block the transmission or take other actions based on policy.

False positives, in which legitimate data is tagged and blocked because data sequences might too closely resemble those of a protected data type, can be a problem with packet-inspection technologies, analysts and enterprise executives warn. But Perpetual Entertainment has had no such problems over the past two years using Content Alarm to scan outgoing data in packet streams, says Mark Rizzo, vice president of operations at the company.

The how of data

Nor has Rizzo had any problems with Tablus’ Content Sentinel, which he uses to scan desktops.

“When I ran a distributed scan for at-risk personally identifiable information, I was amazed at what [Content Sentinel] found in recycle bins,” he says. “From that scan, I was able to alter policies on how human resources stores and transmits records.”

Now Rizzo wants to install Content Alarm DT, Tablus’ desktop data-management tool, to prevent the transfer of information marked as private to USB devices, CD burners and print servers (so it can’t be printed and carried out of the organization) and to prohibit the copying of such information into instant messages or e-mail. Network-based technologies cannot protect against these actions. He’s waiting for the software’s next release, which should support his implementation of Microsoft Distributed File System. That release is scheduled for this summer.

Other enterprises are also discovering this new breed of product that helps control how users interact with data at the desktop. For example, Employee Benefit Management Corp. (EBMC), in Dublin, Ohio, uses Liquid Machines’ Email Control to enforce encryption, expiration and use policies (such as print, copy, distribute) on e-mail containing medical data sent between EBMC’s Microsoft Exchange server and its 60 employer partners. Besides Email Control, EBMC uses outbound traffic and database-monitoring tools to secure data, says Renee Haas, vice president of operations at the company.

To protect regulated data for compliance initiatives, medical-related companies, such as EBMC, often begin with the e-mail application, because that’s where most of the controls need to be, says Ed Gaudet, vice president of product management for Liquid Machines. “Equally important is intellectual property stored in spreadsheets and other Office applications,” he says. Liquid Machines also offers enterprise rights-management software for protecting data by enforcing encryption and use policies in applications, including Office, Visio and CAD.

Some start-ups, such as Blue Jungle, with its Compliant Enterprise, use agent technologies installed on endpoints to enforce information-usage rules on selected documents at the desktop, including copy/forward/print and e-mail, according to policy. This enforcement would be especially useful for putting controls around regulated financial data, says Halliburton’s Johnson, who ran a proof of concept on the Blue Jungle product last year. But the system doesn’t scale to fit his outside user population, particularly the thousands of suppliers, consultants and contractors with whom Halliburton shares intellectual property.

In some cases, data may be too critical to be allowed on the desktop at all, Verizon’s Sharp says. Many high-end clients allow only temporary sessions between the database and desktop and scrub the cache each time a user session times out, he explains.

The when of data

The final place data needs protecting is in storage and back-up systems. This means encrypting sensitive information in e-mail archives and in tape, disk and online back-up systems. And to meet data-retention regulations, this also means storing the information for a set amount of time, putting controls around how quickly it can be recovered, then mapping what type of document, access and retention period is associated with the data, says Tom Dwyer, a Yankee Group research director.
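The mapping Dwyer describes, tying each type of document to an access rule and a retention period, might look something like the following Python sketch. Every document type, role and retention period here is hypothetical, chosen only to show the shape of such a policy table; real values come from the regulations that apply to a given organization.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionRule:
    retention_days: int        # how long the data must be kept
    allowed_roles: frozenset   # who may retrieve it from the archive
    encrypt_at_rest: bool      # whether archived copies must be encrypted


# Hypothetical policy table: the types, roles and periods are illustrative only.
POLICY = {
    "hr_record": RetentionRule(7 * 365, frozenset({"hr"}), True),
    "email_archive": RetentionRule(3 * 365, frozenset({"compliance", "legal"}), True),
}


def may_access(doc_type, role):
    """True if the role is permitted to retrieve this type of document."""
    return role in POLICY[doc_type].allowed_roles


def may_purge(doc_type, stored_on, today):
    """True once the retention period for this type of document has elapsed."""
    rule = POLICY[doc_type]
    return today >= stored_on + timedelta(days=rule.retention_days)


expired = may_purge("email_archive", date(2003, 1, 1), date(2006, 3, 20))
```

Keeping access, encryption and retention in one record per document type means a backup or archiving system can answer all three questions from a single lookup.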

Look to see this functionality coming out of partnerships between suppliers of enterprise content management and storage management, he says, pointing to the partnership between content-management vendor Mobius Management Systems and storage-management vendor EMC. Atempo’s LiveBackup, WysDM Software’s WysDM for Backups and others also perform similar data-protection functions in back-up, storage and archiving systems.

Clearly, information life-cycle protection won’t be easy. False positives and information overload are pitfalls if monitoring and correlation technologies aren’t implemented correctly. Encryption can bloat your information systems and burden your users. (Look to the storage-encryption standard IEEE P1619 to lighten the bloat, at least in data storage.) Only one technology vendor, Adobe, tightly controls desktop files, and only for its own file types. Yet early adopters say it’s time to try to protect critical data from a life-cycle standpoint in today’s open enterprises.

Radcliff is a freelance writer in California. She can be reached at
