Why speeds and feeds don’t fix your data management problems

Beyond speeds, feeds, and the cost of storage capacity, admins are now beginning to look at the missing link in their infrastructure – information about how their data is actually being used.


For a very long time, IT professionals have made storage investments based on a few key metrics – how fast data can be written to a storage medium, how fast it can be read back when an application needs that information, and of course, the reliability and cost of that system.

The critical importance of storage performance led us all to fixate on latency and how to minimize it through intelligent architectures and new technologies.

Given the popularity of flash memory in storage, the significance of latency is not about to fade away, but a number of other metrics are rapidly rising in importance to IT teams. Yes, cost has always been a factor in choosing a storage investment, but with cloud and object storage gaining popularity, the price of storage per gigabyte is no longer just a function of speed and capacity; it also includes the cost of powering and managing that resource. When evaluating whether to archive data on premises or to send it offsite, IT professionals are now looking at a much wider definition of overall cost.

Beyond costs, speeds and feeds, though, IT innovators are beginning to discover a critical new set of attributes they can use to manage their data more strategically than ever before.

Today, architecting application infrastructure based on storage attributes alone is falling short. IT leaders are recognizing the importance of having intelligence about their data, so that they can create informed strategies rather than blindly throwing speeds and feeds into their infrastructure. Metadata analytics can now tell us the importance of a file by analyzing when it was last accessed, when it was last changed, who accessed it and more – giving IT the intelligence needed to determine strategies for meeting data requirements without overprovisioning and overspending.
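To make this concrete, here is a minimal sketch of the kind of metadata a file system already exposes – when a file was last accessed or changed, who owns it, and how large it is – and a trivial hot/cold classification built on it. The function names and the 90-day threshold are illustrative, not from any particular product.

```python
import os
import pwd
import time

def file_metadata(path):
    """Collect the basic access metadata described above for one file."""
    st = os.stat(path)
    return {
        "path": path,
        "last_accessed": st.st_atime,              # when the file was last read
        "last_modified": st.st_mtime,              # when its contents last changed
        "owner": pwd.getpwuid(st.st_uid).pw_name,  # who owns the file
        "size_bytes": st.st_size,
    }

def classify(meta, cold_after_days=90, now=None):
    """Label a file 'cold' if neither read nor changed within the window."""
    now = now or time.time()
    idle = now - max(meta["last_accessed"], meta["last_modified"])
    return "cold" if idle > cold_after_days * 86400 else "hot"
```

Real metadata-intelligence products gather this kind of information at scale and across storage systems, but the raw ingredients are no more exotic than this.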

The great communication breakdown

Applications and storage have long been oblivious to each other’s capabilities and needs. The majority, if not nearly all, of today’s enterprise applications do not know the attributes of the storage where their data resides. Applications cannot tell if the storage is fast or slow, premium or low cost. They are also unaware of storage’s proximity, and of factors like network congestion between storage and the application server, which can significantly impact latency.

Conversely, storage does not know what data is most important to an application. It only knows what was recently accessed, and it uses that information to populate caching tiers, which improve performance only if the same data happens to be accessed again. Unfortunately, caches lack the intelligence needed to reserve capacity for mission-critical applications, so unintended data eviction from a cache can cause serious performance inconsistency or require ever more cache.
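The eviction problem is easy to demonstrate with a toy least-recently-used cache. In this hypothetical sketch, a mission-critical block is pushed out of the cache simply because a batch job streamed through unrelated data – the cache has no way to know one entry matters more than another.

```python
from collections import OrderedDict

class LRUCache:
    """A plain least-recently-used cache with no notion of priority."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)      # mark as recently used
            return self.entries[key]
        return None

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict the least recently used

# A mission-critical block sits in the cache...
cache = LRUCache(capacity=3)
cache.put("critical-block", "hot data")

# ...until a one-time scan streams through and evicts it.
for i in range(3):
    cache.put(f"scan-{i}", "one-time read")

print(cache.get("critical-block"))  # None: the critical block was evicted
```

Production caches are far more sophisticated than this, but without application-level hints about importance they face the same fundamental blindness.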

The enlightened age of data management

With only information about storage performance and costs – and none about data activity – IT is stuck in the dark with only half the insight needed for strategic data management. Metadata fills in the rest of the picture, and metadata management solutions are now bringing an end to the era of educated guesses and costly over-provisioning of resources.

Since metadata-intelligence software enables admins to see when files were last opened, how often, by whom, when they were modified and more, admins can now manage their storage resources more efficiently. Cold data can move to archival tiers automatically and without disruption, while hot data is placed on storage systems that meet business needs for performance and price. It’s even possible to create policies that make use of this data and to automate data movement through scripts or software, freeing IT to focus on more strategic tasks.
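A tiering policy of this kind can be sketched in a few lines. This is a simplified illustration, not a real product’s implementation: the directory names and the 180-day idle threshold are placeholders, and commercial tools apply such policies continuously and non-disruptively rather than as a one-shot script.

```python
import os
import shutil
import time

def archive_cold_files(hot_dir, archive_dir, max_idle_days=180):
    """Move files neither read nor modified within the window to an archive tier.

    Paths and the 180-day threshold are illustrative; a real policy engine
    would evaluate richer metadata and run continuously.
    """
    cutoff = time.time() - max_idle_days * 86400
    moved = []
    os.makedirs(archive_dir, exist_ok=True)
    for name in os.listdir(hot_dir):
        src = os.path.join(hot_dir, name)
        if not os.path.isfile(src):
            continue
        st = os.stat(src)
        if max(st.st_atime, st.st_mtime) < cutoff:   # cold by policy
            shutil.move(src, os.path.join(archive_dir, name))
            moved.append(name)
    return moved
```

The point is less the code than the shift it represents: the decision to move data is driven by how the data is actually used, not by where it happens to sit.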

Give applications intelligent-storage awareness

With real-time metadata intelligence, IT can understand how applications experience storage (for example, latency, IOPS and bandwidth). This is possible thanks to open-standard protocol stacks running natively in the client. The recent release of NFS 4.2 includes enhancements to the Network File System (NFS) Flex Files layout that allow clients to report statistics on how data is being used and on the performance delivered by the storage resources serving that data. With NFS 4.2, data can even be moved while it is live – without application interruption.

These advanced features are already being rapidly adopted: the most recent release of Red Hat Enterprise Linux, 7.3, features Flex Files support to simplify management of NFS clusters.

Beyond speeds, feeds and the cost of storage capacity, admins are now beginning to look at the missing link in their infrastructure – information about how their data is actually being used. It seems hard to believe that we are only now gaining this critical asset in data management. With IT leaders already rushing to see what they’ve been missing, it may be time to examine how you can use metadata insights to enlighten and automate data management at your enterprise.

This article is published as part of the IDG Contributor Network.
