Date: 2017-01-05

Dynamic resource management for cloud computing is at a critical crossroads. The ultimate objective when provisioning software-defined infrastructure, synchronizing inter-cloud resources, or allocating network bandwidth is allowing applications to successfully execute on demand without concern for capacity. While these approaches are effective in supplying applications with additional capacity on demand, the downside is that application performance may not be optimized in the process.

Cloud applications and services have become so complex that the runtime synchronization of resources required to support them drags down overall performance and leaves capacity unused. To tap this unused capacity, and deliver the performance expected, we need to enhance resource management with something like intelligent resource execution.

Let me explain.

Auto-scaling of VMs and virtual private clouds is arguably the most critical type of dynamic resource management. This scaling capability now allows cloud applications to be designed and operated so that when more storage capacity is needed, both that capacity and the resolution of its network address are provided seamlessly. Further, these applications depend on an increasingly complex and automated set of cloud-native services to manage things such as data replication and synchronization, load distribution and VM failover.

Auto-scaling techniques work well for provisioning additional capacity, but they do not address application performance issues. I’m not referring to whether storage capacity is provisioned with the right cost-performance metrics or whether networks have sufficient raw bandwidth to handle surging inter-VM or inter-cloud traffic. I am referring to a number of lesser-known, but increasingly important, issues around how modern-day cloud applications can take advantage of parallel execution across CPU cores, network and I/O resources.

Today’s applications are developed to be agnostic of cloud infrastructure, and for them to execute efficiently, the underlying infrastructure must be autonomic in detecting the SLA an application requires. But how is that done? One commonly used approach is to provide an infrastructure API through which an application can specify the right resource types and SLA. But this goes against the principle of building infrastructure-agnostic cloud applications, and it is unlikely to work for inter-cloud application transportability because clouds differ fundamentally in how they provision resources and enforce SLAs.

Additionally, modern cloud applications have embraced multi-threading and vCPU pinning to exploit scaling across multi-core processors. But this is leading to growing performance inefficiencies due to issues such as kernel contention and layers of virtualization. As a result, the most advanced apps are leveraging software-defined virtual devices to cut through the OS kernel and avoid hypervisor and container overhead.

The implication is that application- and operating system-level resource contention has grown so complex that simply differentiating infrastructure provisioning by an SLA metric will not work.

Dynamic resource execution

The bigger point is that there is a growing gap between cloud applications and cloud infrastructure, and it is not being addressed by today’s dynamic resource management techniques. We need a way to tell the infrastructure and the OS that an application’s performance is degrading due to resource contention. And then we need to communicate what additional resources should be provisioned to alleviate the bottlenecks.

Since resource contention is highly correlated to the way an application’s process threads are executed, we can analyze the data path and apply machine learning techniques to identify contention and bottlenecks experienced by critical software threads. Based on that, new resources can be provisioned instantly, likely without the application having to intervene or even be aware of this action. This new approach can enable dynamic resource management tools to go beyond just provisioning. Perhaps this is something we could call dynamic resource execution.
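To make the idea of spotting contended threads a little more concrete, here is a minimal sketch in Python. It is not a machine learning model: it swaps in a plain threshold on the share of time a thread spends waiting rather than running. The `ThreadSample` record, its field names and the 0.5 threshold are all assumptions made for illustration, and a real system would collect these numbers from the data path rather than hard-code them.

```python
from dataclasses import dataclass


@dataclass
class ThreadSample:
    """Hypothetical per-thread measurement over one sampling window."""
    thread_id: int
    run_ms: float   # time spent executing on a CPU
    wait_ms: float  # time spent waiting for a contended resource


def contention_ratio(sample):
    """Fraction of the window the thread spent waiting (0.0 if idle)."""
    total = sample.run_ms + sample.wait_ms
    return sample.wait_ms / total if total > 0 else 0.0


def find_contended(samples, threshold=0.5):
    """Return IDs of threads whose waiting share meets the threshold."""
    return [s.thread_id for s in samples if contention_ratio(s) >= threshold]


samples = [
    ThreadSample(thread_id=1, run_ms=90.0, wait_ms=10.0),
    ThreadSample(thread_id=2, run_ms=20.0, wait_ms=80.0),
    ThreadSample(thread_id=3, run_ms=0.0, wait_ms=0.0),
]
print(find_contended(samples))  # [2]
```

Only thread 2 is flagged here, because it spends 80 percent of its window waiting. A learned model could replace the fixed threshold without changing how the result is consumed.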

A dynamic resource execution framework must be standards based, leveraging the latest innovations already in place from software-defined networking and software-defined storage. Conceptually, it can be thought of as having a resource abstraction layer followed by a dynamic resource resolution layer. For instance, in Linux applications, sockets are the resource abstraction for a logical network connection from which the kernel or the hypervisor will dynamically resolve the network path and network addresses that need to be translated.
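The socket example can be seen in a few lines of Python using the standard `socket` module. This is only a loopback sketch, and the helper name `loopback_roundtrip` is made up for illustration. The application asks for a connection and never chooses a port or a route: passing port 0 lets the kernel pick the listening port, and the kernel also assigns the client's source port.

```python
import socket


def loopback_roundtrip(payload):
    """Send payload over a loopback TCP connection and return what arrived,
    along with the addresses the kernel resolved for each end."""
    # Port 0 asks the kernel to choose any free port for the listener.
    with socket.create_server(("127.0.0.1", 0)) as server:
        server_addr = server.getsockname()
        # The client names only the destination; the kernel picks the
        # source port and the path through the network stack.
        with socket.create_connection(server_addr) as client:
            conn, _ = server.accept()
            with conn:
                client.sendall(payload)
                received = conn.recv(len(payload))
                client_addr = client.getsockname()
    return received, server_addr, client_addr


received, server_addr, client_addr = loopback_roundtrip(b"ping")
print(received, server_addr, client_addr)
```

The port numbers printed differ from run to run, which is the point: the abstraction stays fixed while the resolution underneath it is dynamic.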

Further, it should be an on-demand resource provisioning framework that is application-specific, with per-process or per-thread granularity. The latter is necessary for modern-day cloud applications running on multi-core processors, so that per-thread resource allocation can distinguish important “elephant” threads from less important “mice” threads and allocate scarce infrastructure resources to the elephant threads that have the highest performance impact.
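As a rough illustration of per-thread allocation, the sketch below labels threads as elephants or mice and hands a limited number of dedicated virtual devices to the heaviest elephants. Everything here is an assumption for the sake of the example: the function names, the use of bytes moved as a stand-in for performance impact, and the 20 percent cutoff.

```python
def classify_threads(bytes_by_thread, elephant_share=0.2):
    """Label a thread 'elephant' if it moves at least elephant_share
    of all traffic in the window, otherwise 'mouse'."""
    total = sum(bytes_by_thread.values())
    labels = {}
    for tid, moved in bytes_by_thread.items():
        is_elephant = total > 0 and moved / total >= elephant_share
        labels[tid] = "elephant" if is_elephant else "mouse"
    return labels


def allocate_devices(bytes_by_thread, labels, device_count):
    """Give dedicated virtual devices to the heaviest elephants;
    every other thread falls back to a shared device."""
    elephants = sorted(
        (tid for tid, label in labels.items() if label == "elephant"),
        key=lambda tid: bytes_by_thread[tid],
        reverse=True,
    )
    dedicated = set(elephants[:device_count])
    return {
        tid: "dedicated" if tid in dedicated else "shared"
        for tid in bytes_by_thread
    }


traffic = {"t1": 700, "t2": 250, "t3": 30, "t4": 20}
labels = classify_threads(traffic)
print(labels)
print(allocate_devices(traffic, labels, device_count=1))
```

With one dedicated device available, only `t1` gets it even though `t2` also qualifies as an elephant, which shows how scarcity forces a ranking among the important threads.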

Fortunately, recent advancements in device virtualization will make it easier to provision resources at the granular level, thanks to their ability to create a large number of virtual devices per physical device.

While the above is quite reflective of where we are in cloud and infrastructure technologies, the ability to extract application contextual intelligence so that the infrastructure can infer application intent remains unresolved.

Closing the gap between applications and infrastructure

Infrastructure and the operating systems that applications run on are not application-aware, and cannot be made so, because both are designed to share scarce resources fairly and to treat application processes and threads equally, differentiated only by their priority or I/O states. The insights into where application contention lies and what resources are critical when contention occurs are beyond their scope and design objectives.

What we ultimately need is a framework in which information about the resources an application requires is automatically deduced from process or thread execution. Then by leveraging the software-defined capability of the infrastructure, the information is passed down to the infrastructure so it can properly respond. Dynamic resource execution can do this without any application intervention and without imposing any changes to the cloud-native application development paradigm.

This article is published as part of the IDG Contributor Network.