The cost of mold in the cloud

How can you know whether the contents of your cloud are becoming dangerous?


If your cloud provider is even mildly successful, it hosts a lot of images. There is no legal onus to keep those images patched/fixed/rev’d. Our industry has no inherent mechanism for giving an image a Good Until Date. The potential poisoning of moldy cloud images scares the hell out of me. No two OS vendors do patch/fix/revision control quite the same way.

I’ve heard various statistics bandied about that claim that the top five cloud providers may host more than 300 million images at any particular time of day. These images are getting moldy. Ever see any of the big cloud hosting companies put a freshness date on their stuff? Me neither.

Some of these images (and internal binaries) are containers or VMs. Their lifecycle might be measured in hours, but it could be days, weeks, or even years. None of them is guaranteed to be up-to-the-minute patched/fixed/rev’d. Inside the images, the binaries that are running sit at a similarly unknown patch level. Black mold is possible. Think of the state of the Thanksgiving leftovers remaining in your fridge.

On top of the unknown OS and app patch levels of the running instances, add the binaries that simply sit inside each image, which are also of unknown patch level. Turning on automatic updates is a good start, of course, but until you’ve verified the levels in the instances, the apps, and the binaries-in-waiting, you can’t be sure. Which of perhaps 43,000 moldy binaries is waiting inside your image to be turned into digital botulism?

Because there is no industry freshness date on images, we’re left to our own devices. Enter communications infrastructures that use in-band query methods, variously called data buses/dbus, communications buses/cbuses, etc. A communications pipeline is connected to an image, container, VM, whatever. The pipeline typically connects through port 443 and uses one of perhaps many ssh-like tools to talk with a daemon/system process that, in turn, does things. For this discussion, one of the important things it does is perform queries and, where decent logic is present, perform actions based on the results of those queries.

These communications bus tools tickle images and binaries. The giggles returned are qualities, like revision numbers of binaries or conf files, or perhaps a grep of the contents of configuration files, and so forth. Connect many images together into cohesive systems, as is the norm for things like webservers, database apps, and so forth, and after a query one gains a fair (yet not necessarily comprehensive) inventory base of ostensible patch/fix/rev levels of images and their components.

Minor fortunes are made with internal query tools, like Puppet, Chef, and more. These buses can query a fleet (yes, a word that’s the name of another product, but we’ll use it as an aggregation) and render patch/version/fix levels, looking for aberrations or the presence of unknown software, or WTFware.
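To make the idea of "looking for aberrations" concrete, here is a minimal sketch in Python. It does not use Puppet, Chef, or any real bus API. The host names, package names, version strings, and the function find_aberrations are all hypothetical, and the rule it applies (the most common version in the fleet is the baseline) is an assumption that can fail if the majority is itself moldy.

```python
from collections import Counter


def find_aberrations(inventory):
    """Return {package: {host: version}} for hosts whose version differs
    from the most common version of that package across the fleet."""
    # Regroup the inventory so each package maps to the version seen per host.
    versions = {}
    for host, packages in inventory.items():
        for name, version in packages.items():
            versions.setdefault(name, {})[host] = version

    aberrations = {}
    for name, by_host in versions.items():
        # Assumption: the most common version is the fleet baseline.
        baseline, _count = Counter(by_host.values()).most_common(1)[0]
        odd = {host: ver for host, ver in by_host.items() if ver != baseline}
        if odd:
            aberrations[name] = odd
    return aberrations


# Hypothetical query results, as a bus might return them for three hosts.
inventory = {
    "web-1": {"openssl": "3.0.13", "nginx": "1.24.0"},
    "web-2": {"openssl": "3.0.13", "nginx": "1.24.0"},
    "web-3": {"openssl": "1.1.1k", "nginx": "1.24.0"},
}

print(find_aberrations(inventory))
```

A real deployment would compare against a vendor advisory feed or a pinned baseline rather than a majority vote; the vote only shows which hosts disagree with their peers.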

Between network management software, syslog tools, and periodic direct inventory queries, the state of a diverse fleet of cloud apps and jobs ought to be known, right?

The problem is that using these platforms is somewhat OS-specific, and the work falls to an IT administrator who has the skills, the tools, and the time to evaluate the returned data and ensure that these items stay safe throughout the lifecycle of cloud use. It seems like a basic skill and job, but there’s a huge problem: inconvenience.

Because of the not-invented-here problem, each OS vendor, who often also controls the patch/fix/version controls, does this job differently. Traditional data processing offers only a few common denominators that yield comparable information: file creation dates, a hash such as an MD5 for binaries, other metadata about the qualities of executables, libraries, and configuration files, and even binary/executable headers with certificate-like qualities.
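Those few common denominators can be gathered the same way on most platforms. The sketch below, using only the Python standard library, collects size, modification time, and two digests for a single file. The function name fingerprint and the field names are hypothetical. Two caveats: it reports modification time because a true creation date is not portable across operating systems, and it adds SHA-256 alongside MD5 because MD5 is no longer considered collision-resistant.

```python
import hashlib
import os
from datetime import datetime, timezone


def fingerprint(path, chunk_size=65536):
    """Return portable metadata for one file: size, mtime, and digests."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    # Read in chunks so large binaries do not have to fit in memory.
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    info = os.stat(path)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "path": path,
        "size": info.st_size,
        "modified_utc": modified.isoformat(),
        "md5": md5.hexdigest(),
        "sha256": sha256.hexdigest(),
    }
```

Calling fingerprint on every executable and configuration file in an image yields records that can be compared across hosts regardless of which vendor's patch tooling produced the files.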

But no one does this the same way twice. I watched with sad amusement when Microsoft tried to get device driver writers to enter their free program and thus help prevent bad driver mixtures from crippling both server and civilian-based systems. Windows would throw up huge cautions—uh oh, we can’t really tell if this is good stuff or not—and would cause endless confusion for IT black belts and grandmothers alike.

The company store distribution method is now upon us, and the ostensibly desirable place to get everything is the OS vendor. Apple wants to take you to the Apple Store, and Microsoft and VMware do likewise; the list grows by the week. Whenever a vendor can point you to one of its sales outfits or patch/fix sources, it will.

My single desire is to have an agreed-upon metadata format that comprises a useful freshness date methodology across Windows, MacOS, Linux, Free/NetBSD, etc., in what might be an industry first—something not-invented-here.
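No such format exists today, so what follows is purely an illustration of how small it could be. The class FreshnessLabel, its field names, and the sample values are all invented for this sketch; a real standard would need far more (signatures, component lists, vendor advisories).

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass
class FreshnessLabel:
    """Hypothetical OS-neutral freshness label for an image."""
    image_id: str
    os_family: str
    last_patched: str  # ISO 8601 date of the last patch pass
    good_until: str    # ISO 8601 date after which the image counts as moldy

    def is_fresh(self, today):
        # Fresh through the end of the good-until date.
        return today <= date.fromisoformat(self.good_until)

    def to_json(self):
        # Plain JSON so any OS or tool can read the label.
        return json.dumps(asdict(self), sort_keys=True)


# Invented example values.
label = FreshnessLabel("img-example-001", "linux", "2024-01-10", "2024-02-10")
print(label.to_json())
print(label.is_fresh(date(2024, 2, 1)))
```

The point of the sketch is that the label itself is trivial; the hard part is getting vendors to agree on who sets the good-until date and on what evidence.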
