Why speeds and feeds don’t fix your data management problems

Opinion
Nov 07, 2017
Database Administration | Networking | System Management

Beyond speeds, feeds, and the cost of storage capacity, admins are now beginning to look at the missing link in their infrastructure – information about how their data is actually being used.


For a very long time, IT professionals have made storage investments based on a few key metrics: how fast data can be written to a storage medium, how fast it can be read back when an application needs it, and of course, the reliability and cost of that system.

The critical importance of storage performance led us all to fixate on latency and how to minimize it through intelligent architectures and new technologies.

Given the popularity of flash memory in storage, the significance of latency is not about to fade away, but a number of other metrics are rapidly rising in importance to IT teams. Cost has always been a factor in choosing a storage investment, but with cloud and object storage gaining popularity, the true cost of storage is more than a function of speed and capacity per gigabyte; it also includes the opportunity cost of powering and managing that resource. When evaluating whether to archive data on premises or to send it offsite, IT professionals now look at a much wider definition of overall cost.

Beyond costs, speeds, and feeds, though, IT innovators are beginning to discover a critical new set of attributes they can use to manage their data more strategically than ever before.

Today, architecting application infrastructure based on storage attributes alone is falling short. IT leaders are recognizing the importance of having intelligence about their data, so that they can create informed strategies rather than blindly throwing speeds and feeds into their infrastructure. Metadata analytics can now tell us the importance of a file by analyzing when it was last accessed, when it was last changed, who accessed it and more – giving IT the intelligence needed to determine strategies for meeting data requirements without overprovisioning and overspending.
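
To make that concrete, here is a minimal sketch (in Python, for a Unix-like system) of the kind of file-level metadata such analytics draw on: last access time, last modification time, owner and size, with a purely illustrative threshold for calling a file "cold." The directory path and the 90-day cutoff are assumptions for the example, not recommendations, and access times can be coarse on volumes mounted with noatime.

```python
import pwd
import time
from pathlib import Path

COLD_AFTER_DAYS = 90  # illustrative threshold; real policies vary by business need

def describe(path: Path) -> dict:
    """Collect the basic metadata attributes discussed above for one file."""
    st = path.stat()
    now = time.time()
    return {
        "path": str(path),
        # Note: st_atime may be frozen on volumes mounted with noatime/relatime.
        "last_accessed_days": (now - st.st_atime) / 86400,
        "last_modified_days": (now - st.st_mtime) / 86400,
        "owner": pwd.getpwuid(st.st_uid).pw_name,
        "size_bytes": st.st_size,
    }

def find_cold_files(root: str):
    """Yield files that look 'cold' by access age alone."""
    for p in Path(root).rglob("*"):
        if p.is_file():
            info = describe(p)
            if info["last_accessed_days"] > COLD_AFTER_DAYS:
                yield info

if __name__ == "__main__":
    for info in find_cold_files("/srv/projects"):  # hypothetical path
        print(f"{info['path']}: idle {info['last_accessed_days']:.0f} days, owner {info['owner']}")
```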

The great communication breakdown

Applications and storage have long been oblivious to each other’s capabilities and needs. Most, if not nearly all, of today’s enterprise applications do not know the attributes of the storage where their data resides. Applications cannot tell whether the storage is fast or slow, premium or low cost. They are also unaware of the storage’s proximity and of factors like network congestion between storage and the application server, which can significantly impact latency.

Conversely, storage does not know which data is most important to an application. It only knows what was recently accessed, and it uses that information to place data in caching tiers, which improves performance only if that same data happens to be accessed again. Some enterprises try to address these issues with caching tiers, but caches do not have the intelligence needed to reserve capacity for mission-critical applications. Unintended eviction of data from a cache can cause serious performance inconsistency or demand ever more cache capacity.

The enlightened age of data management

With information about storage performance and costs – and none about data activity – IT is stuck in the dark with only half the insight needed for strategic data management. Metadata fills in the rest of the picture, and metadata management solutions are now bringing an end to the era of educated guesses and costly over-provisioning of resources.

Because metadata-intelligence software lets admins see when files were last opened, how often, by whom, when they were modified, and more, they can now manage their storage resources far more efficiently. Cold data can move to archival tiers automatically and without disruption, while hot data is placed on storage systems that meet business needs for performance and price. It is even possible to create policies that act on this information and to automate data movement through scripts or software, freeing IT to focus on more strategic tasks.
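
As a rough illustration of the kind of policy such software automates, the sketch below (Python, with hypothetical tier paths and a 180-day threshold chosen purely for the example) moves files that have not been accessed within the policy window from a fast tier to an archive tier. A real metadata-management product would do this transparently, without changing the paths applications see; a standalone script like this one relocates files outright, so applications would need to follow the new location.

```python
import shutil
import time
from pathlib import Path

# Illustrative policy knobs -- a real product exposes these as managed policies.
HOT_TIER = Path("/mnt/fast-tier/projects")    # hypothetical performance tier
ARCHIVE_TIER = Path("/mnt/archive/projects")  # hypothetical capacity tier
ARCHIVE_AFTER_DAYS = 180

def apply_archive_policy(dry_run: bool = True) -> None:
    """Move files not accessed within the policy window to the archive tier."""
    cutoff = time.time() - ARCHIVE_AFTER_DAYS * 86400
    for src in HOT_TIER.rglob("*"):
        if not src.is_file():
            continue
        if src.stat().st_atime < cutoff:
            dest = ARCHIVE_TIER / src.relative_to(HOT_TIER)
            print(f"{'would move' if dry_run else 'moving'} {src} -> {dest}")
            if not dry_run:
                dest.parent.mkdir(parents=True, exist_ok=True)
                shutil.move(str(src), str(dest))

if __name__ == "__main__":
    apply_archive_policy(dry_run=True)  # preview first; rerun with dry_run=False to act
```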

Give applications intelligent-storage awareness

With real-time metadata intelligence, IT can understand how applications experience storage (for example, latency, IOPS and bandwidth). This is possible thanks to open-standard data and I/O access protocols running natively in the client. The recent NFS 4.2 release includes enhancements to the Network File System (NFS) Flex File layout that allow clients to report statistics on how data is being used and on the performance provided by the storage resources serving that data. With NFS 4.2, data can even be moved while it is live – without application interruption.
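
For a rough sense of the client-side view this enables, Linux already exposes per-mount NFS statistics in /proc/self/mountstats. The sketch below (Python, assuming a Linux client with at least one NFS mount and the common per-op record layout) prints the average round-trip time the client has observed for READ and WRITE operations. It is a simple diagnostic glimpse of client-observed latency, not the Flex Files layout-statistics mechanism itself.

```python
from pathlib import Path

MOUNTSTATS = Path("/proc/self/mountstats")  # Linux-only; no output without NFS mounts

def nfs_op_latency(ops_of_interest=("READ", "WRITE")) -> None:
    """Report average client-observed RTT (ms) per NFS operation, per mount."""
    mount = None
    for line in MOUNTSTATS.read_text().splitlines():
        line = line.strip()
        if line.startswith("device") and " nfs" in line:
            # e.g. "device server:/export mounted on /mnt with fstype nfs4 statvers=1.1"
            mount = line.split(" mounted on ")[1].split(" with ")[0]
        elif mount and any(line.startswith(op + ":") for op in ops_of_interest):
            op, rest = line.split(":", 1)
            fields = rest.split()
            # Assumed field order (cumulative): ops, transmissions, timeouts,
            # bytes sent, bytes received, queue time, RTT, execute time.
            ops, rtt_ms = int(fields[0]), int(fields[6])
            if ops:
                print(f"{mount} {op}: {rtt_ms / ops:.2f} ms avg RTT over {ops} ops")

if __name__ == "__main__":
    nfs_op_latency()
```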

These advanced features are already being adopted rapidly: the most recent release of Red Hat Enterprise Linux, 7.3, includes Flex Files support to simplify management of NFS clusters.

Beyond speeds, feeds and the cost of storage capacity, admins are now beginning to look at the missing link in their infrastructure – information about how their data is actually being used. It seems hard to believe that we are only now gaining this critical asset in data management. With IT leaders already rushing to see what they’ve been missing, it may be time to examine how you can use metadata insights to enlighten and automate data management at your enterprise.

Lance Smith

Primary Data CEO Lance Smith is a strategic industry visionary who has architected and executed growth strategies for disruptive technologies throughout his career. Prior to Primary Data, Lance served as Senior Vice President and General Manager of SanDisk Corporation’s IO Memory Solutions business, following SanDisk’s acquisition of Fusion-io in 2014. He had served as Chief Operating Officer of Fusion-io since April 2010.

Lance has held senior executive positions at companies that transacted for billions of dollars, holds patents in microprocessor bus architectures, and received a Bachelor of Science degree in Electrical Engineering from Santa Clara University.

The opinions expressed in this blog are those of Lance L. Smith and do not necessarily represent those of IDG Communications, Inc., its parent, subsidiary or affiliated companies.