
The central nervous system of a data center

Aug 17, 2004
3 mins
Data Center

* What's behind the log messages

Almost all components of a data center, such as network devices, storage systems, servers and even electrical systems, generate logs. Like nerve impulses in a central nervous system, the log messages these devices emit via the syslog or SNMP protocols apprise us of activity and events at the hardware level (for example, a data link down or a failed fan), the operating-system level (a full disk, too many open files) or the user level (a user logged in, an incorrect password).
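The syslog protocol encodes both the message source and its urgency in a single priority (PRI) value at the start of each message. As a minimal sketch, the standard decoding looks like this:

```python
# Decode the priority (PRI) field of a syslog message, per RFC 3164:
# PRI = facility * 8 + severity, so e.g. <34> is facility 4, severity 2.

SEVERITIES = ["emergency", "alert", "critical", "error",
              "warning", "notice", "informational", "debug"]

def decode_pri(pri: int) -> tuple[int, str]:
    """Split a syslog PRI value into (facility number, severity name)."""
    facility, severity = divmod(pri, 8)
    return facility, SEVERITIES[severity]

print(decode_pri(34))   # → (4, 'critical')
```

The severity field is what downstream filtering and alerting logic keys on: a "critical" hardware message should reach an operator, while "informational" noise usually should not.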

To take advantage of this flow of information, data center architectures must include the necessary infrastructure to collect, filter, analyze, correlate and archive the log information. Data center managers can put log data to many uses such as troubleshooting, security monitoring, network management and regulatory compliance. The operations and security groups will use logs in slightly different ways, so the infrastructure has to be flexible enough to support these different uses.

For security monitoring, the emphasis is on finding the “needle in the haystack” through filtering and correlation in near real-time. As thousands of messages stream in from various devices, the log management infrastructure must be able to prioritize and alert on the most important messages without overwhelming the operators. The term “Security Information Management” is used to describe a range of products from vendors such as Arcsight, Netforensics, Intellitactics and Micromuse, which provide this type of functionality. These products specialize in intelligent filtering and the management of large volumes of messages to provide selected alerting with a focus on security.
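The core of such filtering can be sketched in a few lines. The message format, severity names and threshold below are illustrative assumptions, not any vendor's schema:

```python
# A minimal severity-threshold filter: only messages at or above a chosen
# severity are surfaced as alerts. Lower numbers are more severe, as in syslog.

SEVERITY_ORDER = {"debug": 7, "informational": 6, "notice": 5, "warning": 4,
                  "error": 3, "critical": 2, "alert": 1, "emergency": 0}

def filter_alerts(messages, threshold="error"):
    """Return only the messages severe enough to alert an operator on."""
    limit = SEVERITY_ORDER[threshold]
    return [m for m in messages if SEVERITY_ORDER[m["severity"]] <= limit]

stream = [
    {"host": "fw1", "severity": "informational", "text": "user logged in"},
    {"host": "sw3", "severity": "critical",      "text": "fan failed"},
    {"host": "db2", "severity": "error",         "text": "disk full"},
]
print(filter_alerts(stream))  # fw1's informational message is dropped
```

Real products layer correlation rules and asset context on top of this, but the principle is the same: reduce thousands of messages to the handful an operator must see.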

Network monitoring and troubleshooting require sophisticated filtering of log data, with the addition of root-cause analysis. HP OpenView, IBM Tivoli and other such products focus on providing an overview of the data center and network to assist the operations group in maintaining availability and performance. Root-cause analysis is the process of correlation that allows operators to identify the underlying event that may have triggered a wave of log messages across the infrastructure.
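One simple correlation heuristic, sketched below under illustrative assumptions (a fixed time window, a flat event format), is to treat the earliest message in a burst as the candidate root cause, since a single failure such as a core link going down tends to trigger a cascade of secondary messages within seconds:

```python
# Root-cause sketch: group time-stamped events into bursts separated by a
# quiet gap, and flag the first event of each burst as the candidate root cause.

def find_root_causes(events, window=5.0):
    """Return the first event of each burst (gap between bursts > window seconds)."""
    events = sorted(events, key=lambda e: e["time"])
    roots, last_time = [], None
    for e in events:
        if last_time is None or e["time"] - last_time > window:
            roots.append(e)          # starts a new burst: candidate root cause
        last_time = e["time"]
    return roots

cascade = [
    {"time": 10.0, "device": "core-router", "text": "link down"},
    {"time": 10.8, "device": "app-server",  "text": "database unreachable"},
    {"time": 11.2, "device": "web-server",  "text": "upstream timeout"},
    {"time": 60.0, "device": "switch-7",    "text": "config saved"},
]
print(find_root_causes(cascade))  # the core-router "link down" heads its burst
```

Production tools add topology awareness, so that a message from a device downstream of a failed link is suppressed rather than merely grouped, but time-window clustering is the starting point.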

Regulatory compliance has become an increasingly onerous responsibility for many enterprises, especially in the health and financial services sectors. Regulations such as Sarbanes-Oxley, the Health Insurance Portability and Accountability Act and the Gramm-Leach-Bliley Act stipulate that enterprises must monitor their infrastructure and retain audit data on the activities of their users and administrators. By securely collecting and archiving logs for long-term storage, enterprises can meet many of these regulatory requirements. To comply with these regulations, the emphasis must be on collecting "raw" log data (without filtering) and on long-term archival. The log archives must be securely stored and searchable for forensic and investigatory purposes. Vendors such as Addamark and Loglogic offer products focused on archival for compliance purposes.
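The essential properties of a compliance archive are that the raw data is preserved unmodified and that tampering is detectable. A minimal sketch, using compression plus a cryptographic digest (the file layout is an illustrative assumption, not any vendor's format):

```python
# Compliance-archival sketch: store raw, unfiltered log lines compressed,
# and record a SHA-256 digest so later tampering can be detected.

import gzip
import hashlib

def archive_logs(lines, path):
    """Compress raw log lines to `path`; return their SHA-256 digest."""
    raw = "\n".join(lines).encode("utf-8")
    with gzip.open(path, "wb") as f:
        f.write(raw)
    return hashlib.sha256(raw).hexdigest()

def verify_archive(path, expected_digest):
    """Re-read the archive and confirm the digest still matches."""
    with gzip.open(path, "rb") as f:
        raw = f.read()
    return hashlib.sha256(raw).hexdigest() == expected_digest
```

In practice the digests themselves must be stored separately from the archives (or signed), since a digest kept alongside the data it protects can be rewritten by the same attacker.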

A comprehensive log management infrastructure will likely include products from different vendors to address each of these uses of the log data. Data center managers building such an infrastructure must take into consideration the various end users of the log data (network operations, security operations, the audit group and so on) and ensure that the infrastructure addresses their needs.