Systems (and operations) management is a journey, not a destination. What does this mean?
First and foremost, it's not something achieved at a point in time. There isn't a big "do it" button (or function key) that magically gets you to systems / operations nirvana.
Not only that, it's not a matter of getting there and your job being done; it's a continual process. Quality and productivity come from a combination of technology (the software you buy), people (the individuals who implement and maintain that software), and process (the steps in place for getting from point A to B to ... Z).
Systems management touches nearly every area of your IT operations. As those operations grow in size, scope, complexity, and business impact, what remains common to all phases is the need for efficiency and automation, both of which are based on repeatable processes that conform to industry best practices.
If we're looking at a Microsoft-oriented shop, how do we accomplish this? The strategy Microsoft uses incorporates a number of areas:
The important thing to remember is that systems management is a process that incorporates technology (tools) and people to achieve quality and productivity. Without all three, you will not accomplish what you hope to. You need to develop a roadmap for your own organization, get sign-off, and implement it using technology, people, and process.
(For an overview of the System Center family, check out http://www.networkworld.com/community/node/21315.)
Kerrie Meyler, a Microsoft MOM MVP, is an independent consultant and trainer with more than 15 years of Information Technology experience. A previous senior technology specialist at Microsoft, she focused on infrastructure and management solutions, presenting at numerous product launches. More recently, she presented on Operations Manager 2007 and gave several podcasts at TechEd 2007.
Kerrie has worked with Microsoft Learning to develop Microsoft Official Curriculum (MOC) for several courses, including the Implementing Microsoft Operations Manager 2000 course, and did the beta teach for that course.
Kerrie is the lead author of Microsoft Operations Manager 2005 Unleashed and Microsoft System Center Operations Manager 2007 Unleashed.
Check out an excerpt from System Center Operations Manager 2007 Unleashed, Chapter 3: Looking Inside OpsMgr.
The opinions expressed in this Weblog are those of the writer and may not represent the opinions of Network World.
What about the applications and business services?
While I completely agree that systems management is a continuously evolving process, what many IT teams don't handle very well is performance management of complex, multi-tier applications and business services. While the individual silos (e.g., web tier, app server tier, db tier) are managed well, it is the complex interdependencies between the silos that are often the cause of brownouts and outages in mission-critical services. Even in Microsoft-oriented shops this problem remains. What most IT teams are missing is holistic, real-time analysis across the silos that can pinpoint the precursors to performance problems. Instead, siloed monitoring tools, which rely on static thresholds, are the only view the team has into the performance of the service. Often the siloed tools don't indicate a problem at all, so troubleshooting turns into finger-pointing and wasted time.
Solutions now exist that allow real-time analysis of performance data across all silos of a service. These solutions can learn the normal behavior of the service across all tiers, establish the "noise floor" of expected abnormalities, and alert only when the service as a whole is experiencing abnormal behavior that exceeds the noise floor. These solutions can even predict that the noise floor will be exceeded, allowing IT Operations to take a proactive approach to performance management. As IT teams continue to evolve their systems management practices, they need to take a more holistic and proactive approach to the mission-critical services that drive their businesses.
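To make the "noise floor" idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how any particular product works: the tier names, the sample response times, the z-score measure, and the threshold of 2.0 are all made up for the example.

```python
from statistics import mean, stdev

# Hypothetical noise floor: the average deviation (in standard deviations)
# that the service as a whole may show before anyone is alerted.
NOISE_FLOOR = 2.0


def learn_baseline(history):
    """Learn normal behavior per tier as (mean, standard deviation)."""
    return {tier: (mean(samples), stdev(samples))
            for tier, samples in history.items()}


def service_score(baseline, current):
    """Average absolute z-score across all tiers of the service."""
    scores = []
    for tier, value in current.items():
        mu, sigma = baseline[tier]
        scores.append(abs(value - mu) / sigma if sigma else 0.0)
    return mean(scores)


def should_alert(baseline, current, noise_floor=NOISE_FLOOR):
    """Alert only when the whole service exceeds the noise floor."""
    return service_score(baseline, current) > noise_floor


# Made-up response times (ms) observed per tier during normal operation.
history = {
    "web": [100, 110, 90, 105, 95],
    "app": [200, 210, 190, 205, 195],
    "db": [50, 55, 45, 52, 48],
}
baseline = learn_baseline(history)

normal = {"web": 102, "app": 198, "db": 51}
degraded = {"web": 120, "app": 225, "db": 62}

print(should_alert(baseline, normal))
print(should_alert(baseline, degraded))
```

In the degraded sample no single tier looks dramatic on its own, which is the point: the score is computed for the service as a whole rather than per silo. A real solution would use far more history and a more careful statistical model than a plain average of z-scores.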
Those are people, process, and technology also!
Good observation. What IT teams need in order to better handle performance management of complex applications is knowledge of how those applications and services work. It's not enough to monitor Server B; what needs to be monitored is Application A, which can have components on 5 servers (application, database, web, file servers, and such), with the understanding that each of those components is critical to the health of the application. For example, it's not enough to know that your database server is down; you also need to know which applications are dependent on that database server.
System Center Operations Manager 2007 has a tool that can help with this, called the Distributed Application Designer, or DAD for short. DAD gives you the capability to monitor distributed applications by defining the silos used in a business application and the interdependencies between the silos. However, that's just the tool/technology part of it. Without the people and process components, the tool is just shelfware.
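As a rough illustration of that last point, here is a small Python sketch that maps applications to the servers their components run on and then answers the reverse question: which applications are affected when a given server goes down? The application names, server names, and function names are all hypothetical, and this is not the format any monitoring product uses.

```python
from collections import defaultdict

# Hypothetical inventory: each application and the server hosting each
# of its components. In practice the application owners supply this.
APPLICATIONS = {
    "OrderEntry": {"web": "WEB01", "application": "APP01", "database": "SQL01"},
    "Reporting": {"web": "WEB02", "database": "SQL01", "file": "FS01"},
}


def build_reverse_index(applications):
    """Map each server to the set of applications that depend on it."""
    index = defaultdict(set)
    for app, components in applications.items():
        for server in components.values():
            index[server].add(app)
    return index


def affected_applications(index, server):
    """Return the applications impacted when the given server is down."""
    return sorted(index.get(server, set()))


index = build_reverse_index(APPLICATIONS)
print(affected_applications(index, "SQL01"))
print(affected_applications(index, "WEB01"))
```

The code is trivial; the hard part is the inventory at the top, which only stays accurate if the people who own the applications keep it current.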
CIOs and Application Development managers need to work with the IT monitoring folks to help define the processes that an application depends on so that appropriate monitoring can take place. Without input from the application owners, a monitoring tool cannot be very effective.
Kerrie Meyler