
End-to-end SLAs require cascading

Sep 08, 2004 · 3 mins
Enterprise Applications

* Cascading SLAs needed to stretch agreements across net

Last week, I wrote about raising the bar on service-level management by creating service-level agreements that guarantee more than specific, elemental performance and availability – SLAs based on end-user quality of experience, or QoE. A number of you wrote back with comments that ranged from “I don’t see how the carriers could ever do that” to “What a great idea – but how do you do that?”

One reader wrote:

“I read your article on raising the bar on SLM, and while I agree in theory with you, practically I don’t see how any single vendor can offer such guarantees. There are so many varied components from the end user’s brain to the application on a distant server, one would be hard pressed to find a single vendor willing to commit to the performance level of every component.”

Clearly, one of the greatest challenges in providing QoE SLAs is that no single service provider controls every part of the infrastructure, especially in today’s distributed, Internet-enabled environment. Even in our case, where our provider both hosts our servers and applications in its data center and serves as our ISP, it does not control the “last mile.” Many of our users work outside our headquarters building and reach our infrastructure over the Internet, which is clearly outside the scope of what any provider could guarantee. QoE SLAs must therefore state explicitly that they do not cover last-mile issues.

That said, I think “cascading SLAs” are one way to help provide QoE-based SLAs. In this scenario, a “primary service provider” (like our hosted Citrix provider) depends on downstream providers (like ISPs and carriers) and should have a separate SLA with each of them. That way, if an outage outside the primary provider’s control triggers a penalty (e.g., a carrier-provided link goes down), the penalty the primary provider pays to the end consumer would, in theory, be offset by the remuneration it receives from the downstream provider.
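To make the arithmetic concrete, here is a minimal sketch in Python. The function name `net_penalty` and all dollar figures are hypothetical, chosen only for illustration; real contracts will define penalties and credits in their own terms.

```python
from typing import Sequence


def net_penalty(penalty_to_customer: float,
                downstream_credits: Sequence[float]) -> float:
    """Return the primary provider's out-of-pocket cost for one outage.

    penalty_to_customer: what the primary provider owes the end consumer
        under its QoE SLA (hypothetical figure).
    downstream_credits: what each downstream provider (ISP, carrier) pays
        the primary provider under its own SLA (hypothetical figures).
    """
    recovered = sum(downstream_credits)
    return penalty_to_customer - recovered


# A carrier link fails: the primary provider owes its customer 1000.0
# and recovers 600.0 from the carrier.
print(net_penalty(1000.0, [600.0]))  # 400.0
```

The gap between the two numbers is the exposure the primary provider still carries, which is why the downstream SLAs only mitigate the penalty in theory.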

Another key issue is monitoring end-user QoE so that the service provider knows when it is potentially in breach of an SLA. This has to happen in small steps. The first step for our provider is to monitor more than simple server up/down status from inside the firewall. This morning, for example, I could not reach our Citrix server from the Internet. I could get to it through a VPN connection, so the server was up, but the help desk had no idea there was a problem, because the server still answered pings from inside the firewall.

Adding outside-of-the-firewall availability monitoring is the first step that must be taken, and a number of vendors provide this functionality today, including Keynote Systems and AlertSite. These vendors monitor servers from outside the firewall, using points of presence around the world. Their services are great tools for 1) making sure your service is available from outside the firewall, and 2) eliminating last-mile questions from the equation.
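The Citrix incident described earlier can be sketched in a few lines of Python. This is only an illustration of comparing an inside probe with an outside probe, not how any monitoring vendor works. The function names, the host name and the port are hypothetical, and the probe has to be run separately from a machine inside the firewall and from one outside it.

```python
import socket


def tcp_probe(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False


def classify(inside_ok: bool, outside_ok: bool) -> str:
    """Combine the results of an inside probe and an outside probe."""
    if inside_ok and outside_ok:
        return "available"
    if inside_ok:
        return "up internally, unreachable from the Internet"
    if outside_ok:
        return "reachable from the Internet, internal path failing"
    return "down"


# Hypothetical usage: each vantage point runs the probe and reports in.
#   inside_ok = tcp_probe("citrix.example.com", 443)   # run inside
#   outside_ok = tcp_probe("citrix.example.com", 443)  # run outside
print(classify(inside_ok=True, outside_ok=False))
```

A help desk that watches only the inside result sees a healthy server in the second case, which is exactly the blind spot that outside monitoring removes.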

I plan to continue covering this subject in subsequent articles, and as always I welcome your ideas, suggestions and comments on the subject of outsourcing; my e-mail address is below.