Open source Hyperic offers an alternative for VM management

Project supports deep monitoring for VMware environments

Hyperic HQ is an open source monitoring and management application that's designed to control VMware's ESX servers and the virtual machines hosted on them. The product can be used with VM-hosted applications and operating systems, or on individual operating system or application components; it’s one of the most mature open source projects we’ve tested.

Overall, we have to say that HHQ is a really strong "meta" console for VMware environments. While it duplicates some of the functionality of the Virtual Center application shipped with VMware ESX 3.x, it still provides an often deep view into the inner workings of a distributed VMware installation.

HHQ correlates server system and application events with behavior metrics such as CPU cycles required, memory usage and network connections, all of which are accessible from a browser-based dashboard console. HHQ let us see manageable characteristics of VMware hosts, hosted operating systems and compatible applications, providing a long list of monitorable data with options on how to act on that data. What’s left to the administrator is the task of deciding which of the many useful (and useless) data points to watch, what actions to tie to them and what to do based on the results of those actions.

HHQ is like a construction set – much like the tools provided in the SNMP RMON monitoring/action management scheme. If you know which points to connect together, you can build a useful management platform.

Users and groups can be created within HHQ to establish administrative roles for viewing HHQ data and for creating, modifying, deleting, alerting on and controlling monitored parameters.

The forensic information provided by HHQ for events leading up to a VM crash is good. HHQ offers RSS feeds that are triggered when monitored server conditions change, as well as generous reports that annotate history and chart things such as uptime and percentages of alarm conditions.

There are VM management features HHQ doesn’t tackle. It has no knowledge of VM images or snapshots, and therefore can’t authenticate or verify them. It doesn’t know about users or groups on targeted VM platforms, and we were somewhat shocked that its Active Directory module doesn’t track Active Directory logon failures as potential security risks.

At its core, HHQ is a JBoss-based client/server application. It can be accessed from any modern browser by any user with the proper administrative rights. It also uses screen space efficiently and offers a command map to show a history of the navigation steps used to arrive at the ‘current’ screen.

Screen shot of Hyperic's HQ Management Console

The HHQ server can run on a huge variety of operating systems, and we suggest that any stable spare server should do the trick. We ran it on Windows 2003 Enterprise Server. The HHQ system launches a server daemon, which, in turn, runs an autodiscovery routine that searches the network and maps the host operating systems it finds and the platforms they are running on.

An HHQ agent (the client part of this equation) is installed onto any VMware ESX 3 server and its guest VMs for deep monitoring purposes. Missing are agents for Xen derivatives and Microsoft’s Virtual Server 2005. The product also has what it calls resource agents and services agents. The former tap into resources such as Windows 2003 Server (and its Active Directory), Internet Information Server, Linux/CentOS and Apache 1.3. The latter watch services such as Microsoft Exchange Server 2000 and Postfix, the mail service we use internally. The agents are supplied by an open source ecosystem surrounding the Hyperic development tree; some are more complete than others. For example, the monitored conditions for Apache Web services are rich, but those for some of the Microsoft applications are comparatively less complete.

HHQ employs a mixture of SNMP-based and resource-specific monitoring services to gather its VM data. For example, the VMware ESX 3.x VM Disk service resource agent tracks disk availability, disk reads and writes, disk reads and writes per minute, bytes read and written, and bytes read and written per minute. The agents are sometimes riddled with comparatively useless monitoring items, such as the number of NTLM Authentications Per Minute for the (Microsoft) Active Directory agent, and agents for other applications suffer from the same problem.
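To make the "per minute" disk metrics concrete, here is a minimal Python sketch of how a rate can be derived from two readings of a cumulative counter such as bytes written. It is an illustration only: the `CounterSample` structure, the `per_minute_rate` function and the sample numbers are our own assumptions, not Hyperic's code or API, and we have not verified how the agent computes these figures internally.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """One reading of a cumulative counter (hypothetical structure, not Hyperic's)."""
    timestamp_s: float  # seconds since some fixed reference point
    value: int          # cumulative total so far, e.g. bytes written


def per_minute_rate(earlier: CounterSample, later: CounterSample) -> float:
    """Return the average increase per minute between two samples."""
    elapsed_s = later.timestamp_s - earlier.timestamp_s
    if elapsed_s <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = later.value - earlier.value
    if delta < 0:
        # The counter was reset (for example, after a reboot), so count
        # only what has accumulated since the reset.
        delta = later.value
    return delta * 60.0 / elapsed_s


# Made-up readings taken 30 seconds apart.
first = CounterSample(timestamp_s=0.0, value=1_000_000)
second = CounterSample(timestamp_s=30.0, value=1_600_000)
print(per_minute_rate(first, second))
```

With these made-up readings, 600,000 bytes written in 30 seconds works out to 1,200,000 bytes per minute.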

Once you’ve got all the agents up and running, the HHQ Dashboard gives a quick visual indication of the health of grouped objects and individual resources. HHQ also supports RSS feeds, so a simple news reader can serve up messages from HHQ regarding alerts, inventory changes, control actions undertaken and groups of ‘problem’ resources. The browser-based Dashboard console was very useful overall, but its operation took some getting used to, as it’s not intuitive and requires the user manual for a full understanding.

There is a command-line interface that supports numerous HHQ actions, including listing, controlling and installing agents, running inventory scans, checking action queues and many of the other functions available in the Dashboard. These commands can be scripted together to perform actions sequentially, rather than through the manual methods offered by the HHQ Dashboard.

HHQ moves into the management realm through the grouping of these objects, whereby if something like aggregate bytes written exceeds a threshold, an alarm condition is created. Resources can be restarted after a failure (if the managed resource object permits this) when a condition that’s being monitored triggers the action.
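The group-threshold idea can be sketched in a few lines of Python. This is a hedged illustration of the logic only, assuming per-VM readings have already been collected; the function name `check_group_threshold`, the VM names and the numbers are hypothetical and do not come from HHQ.

```python
from typing import Dict, Optional


def check_group_threshold(readings: Dict[str, float], threshold: float) -> Optional[str]:
    """Sum one metric across a group and return an alarm message if it is too high.

    `readings` maps a resource name to its latest value for the metric,
    for example bytes written per minute. Returns None when no alarm applies.
    """
    total = sum(readings.values())
    if total > threshold:
        return f"ALARM: aggregate {total:.0f} exceeds threshold {threshold:.0f}"
    return None


# Hypothetical group of two VMs and a made-up threshold.
group = {"vm-a": 400.0, "vm-b": 700.0}
print(check_group_threshold(group, threshold=1000.0))
```

The point of aggregating first is that neither VM in this made-up group crosses the limit on its own; only the group total does.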

The devil of HHQ lies in its details. The product provides a huge amount of information, and all of it must be cobbled together into an enterprise monitoring agenda. As an example, one could monitor VMware for excessive CPU consumption, and tie that to watching an Apache agent for too many 404 (Web page unavailable) messages, which together might indicate that a Web service has collapsed. When the service is judged to have collapsed in this way, you can set HHQ to restart the Apache daemon.
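A rule of this kind boils down to combining two conditions and firing an action when both hold. The Python sketch below shows that shape under our own assumptions: the function `evaluate_web_rule`, its default limits and the stand-in restart callback are all hypothetical, and in HHQ itself such a rule would be configured through the product rather than written as code like this.

```python
from typing import Callable, List


def evaluate_web_rule(
    cpu_percent: float,
    not_found_per_minute: int,
    restart: Callable[[], None],
    cpu_limit: float = 90.0,    # made-up default limit
    not_found_limit: int = 50,  # made-up default limit
) -> bool:
    """Call `restart` only when both CPU use and the 404 rate exceed their limits.

    Returns True if the restart action was triggered.
    """
    if cpu_percent > cpu_limit and not_found_per_minute > not_found_limit:
        restart()
        return True
    return False


# Stand-in for a real restart action: just record that it was requested.
actions: List[str] = []
triggered = evaluate_web_rule(97.0, 120, lambda: actions.append("restart apache"))
print(triggered, actions)
```

Requiring both conditions is the design choice that matters here: a CPU spike alone, or a burst of 404s alone, would not trigger a restart.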

One alarming issue is that little security is applied to communication between the agents and the HHQ daemon by default. There are no SSL, SSH or certificate-based communications among the agents, the daemon and the viewer of the HHQ console. The controls used to effect actions on items can be fingerprinted, and therefore spoofed.

Hyperic’s support comes from its user community. Its success rides on the ability of that community’s contributions to keep pace with the changing world of VMs. As long as the community is healthy, Hyperic will be too.

Henderson is principal researcher and Dvorak is a researcher for ExtremeLabs in Indianapolis. They can be reached at thenderson@extremelabs.com.

NW Lab Alliance

Henderson is also a member of the Network World Lab Alliance, a cooperative of the premier reviewers in the network industry each bringing to bear years of practical experience on every review. For more Lab Alliance information, including what it takes to become a member, go to www.networkworld.com/alliance.


Copyright © 2007 IDG Communications, Inc.
