Best practices for benchmarking SAN performance

By Craig Foster, Network World
December 21, 2009 12:04 AM ET
This vendor-written tech primer has been edited by Network World to eliminate product promotion, but readers should note it will likely favor the submitter's approach.

Storage-area network complexity can mask what might seem to be relatively benign issues that have the potential to build up and cause an outage or brownout. To identify trouble early, you need to create a SAN performance benchmark, an essential first step to setting up metrics to gauge infrastructure performance.

The key is to establish the metrics in advance. Most companies wait until they have a problem before trying to truly understand baseline performance. Ironically, that is the worst time to look because: a) what is found is often overwhelming, b) multiple issues often appear to be the cause, making it difficult to know where to start, and c) many performance optimization opportunities are overlooked.

Here are best practices for benchmarking SAN performance:

1. Baseline when the SAN is healthy. The best time to evaluate an environment is when everything is healthy and before a cost-saving or performance-enhancing project is implemented. This provides a reference against which a later "problem" state can be compared, making it immediately obvious where the problem resides.

Ideally, a company should be proactive with the initial baseline and address the issues that are present. Eliminating existing issues helps reduce the number of problems that can together cause a brownout. Optimization savings can then be planned and measured by comparing both consolidation effectiveness and user impact against the baseline.

A good baseline will often reveal over-provisioned infrastructures, ineffective use of tiers, multi-path issues, uneven load distribution, physical layer problems, minor device incompatibilities, improper configurations (zoning, I/O request size, queue depths), out-of-control applications, unnecessary load or intermittent performance issues.
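The comparison of a "problem" state against a healthy baseline can be sketched in a few lines of Python. This is a minimal illustration, not a tool the article describes: the metric names, the sample values and the 25% tolerance are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deviation:
    metric: str
    baseline: float
    current: float
    change: float  # fractional change relative to the baseline


def compare_to_baseline(baseline, current, tolerance=0.25):
    """Return metrics whose current value differs from the baseline
    by more than `tolerance` (an assumed threshold), largest change first."""
    deviations = []
    for name, base_value in baseline.items():
        if name not in current or base_value == 0:
            continue  # nothing to compare, or no meaningful ratio
        change = (current[name] - base_value) / base_value
        if abs(change) > tolerance:
            deviations.append(Deviation(name, base_value, current[name], change))
    return sorted(deviations, key=lambda d: abs(d.change), reverse=True)


# Hypothetical numbers for illustration only.
healthy = {
    "hba1.read_ect_avg_ms": 4.0,
    "hba1.write_ect_avg_ms": 6.0,
    "hba1.pending_avg": 8.0,
}
problem = {
    "hba1.read_ect_avg_ms": 4.4,
    "hba1.write_ect_avg_ms": 15.0,
    "hba1.pending_avg": 12.0,
}

for d in compare_to_baseline(healthy, problem):
    print(f"{d.metric}: {d.baseline} -> {d.current} ({d.change:+.0%})")
```

With a baseline on hand, the metrics that moved the most surface first, which is the "where to start" answer that is hard to get when the first measurement is taken during an incident.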

2. Measure what matters. The most important goal for an application user is to see their actions complete successfully and accurately in a timely fashion. There are two secondary goals for the IT organization: how to resolve user issues, and how to ensure the solutions use only the resources necessary.

Companies often rely on the most readily available metrics rather than the most useful. One such metric is I/Os per second. This metric only addresses two secondary measures: is the I/O causing a problem, and how optimal is it? It does not get to the heart of the most important questions: how quickly are things getting done, and are they all successful?
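To see why I/Os per second alone can mislead, consider the following sketch. It summarizes two made-up one-second windows that have the same I/O rate but very different completion times and success rates. The `summarize` function and all of the sample values are hypothetical.

```python
def summarize(exchanges, window_seconds):
    """Summarize a window of exchanges.

    exchanges: list of (completion_time_ms, succeeded) tuples.
    Returns (iops, average completion time in ms, success ratio).
    """
    count = len(exchanges)
    if count == 0:
        raise ValueError("no exchanges observed in the window")
    iops = count / window_seconds
    avg_ms = sum(t for t, _ in exchanges) / count
    success_ratio = sum(1 for _, ok in exchanges if ok) / count
    return iops, avg_ms, success_ratio


# Hypothetical one-second windows, both carrying 1,000 exchanges.
quiet = [(2.0, True)] * 1000
struggling = [(2.0, True)] * 900 + [(40.0, True)] * 90 + [(500.0, False)] * 10

for label, window in (("quiet", quiet), ("struggling", struggling)):
    iops, avg_ms, ok = summarize(window, window_seconds=1.0)
    print(f"{label}: {iops:.0f} IOPS, avg {avg_ms:.1f} ms, {ok:.1%} successful")
```

Both windows report the same I/O rate, yet only the completion time and success figures show that the second one is in trouble.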

Rather than looking only at I/Os per second, for effective monitoring you need to consider:

* Minimum, maximum and average read/write/other exchange completion time (ECT), nine metrics in all, for every host bus adaptor (HBA), storage port and logical unit number (LUN).

* Minimum, maximum and average read command to first data for every HBA, storage port and LUN.

* Minimum, maximum and average pending exchanges (queue depth) for every HBA, storage port and LUN.

* Read/write/other I/O size for every HBA, storage port and LUN.
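As a rough sketch of how the first of these measurements might be rolled up, the Python below groups completion-time samples by component and operation type and reports the minimum, maximum and average for each. The sample format, component names and values are assumptions for illustration. How the samples are collected from the SAN is outside the scope of this example.

```python
from collections import defaultdict
from statistics import mean


def summarize_ect(samples):
    """Group ECT samples and compute min, max and average per group.

    samples: iterable of (component, operation, ect_ms), where component is
    an HBA, storage port or LUN label and operation is "read", "write"
    or "other".
    """
    grouped = defaultdict(list)
    for component, operation, ect_ms in samples:
        grouped[(component, operation)].append(ect_ms)
    return {
        key: {"min": min(values), "max": max(values), "avg": mean(values)}
        for key, values in grouped.items()
    }


# Hypothetical samples for illustration only.
samples = [
    ("hba1", "read", 3.0),
    ("hba1", "read", 5.0),
    ("hba1", "read", 10.0),
    ("hba1", "write", 6.0),
    ("hba1", "write", 8.0),
    ("lun7", "read", 4.0),
]

for (component, operation), stats in sorted(summarize_ect(samples).items()):
    print(component, operation, stats)
```

The same grouping could be applied to read command to first data, pending exchanges and I/O size by swapping the measured value.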

Another common mistake is to give a metric more credit than it deserves, for example by relying on a server response time (either from the operating system or an application on the server) to determine the health of the rest of the infrastructure.
