Mass data fragmentation requires a storage rethink

A fresh approach to secondary storage is required to solve the growing mass data fragmentation (MDF) problem.

Companies are experiencing a growing problem of mass data fragmentation (MDF). Data is siloed and scattered all over the organization — on and off premises — and businesses are unable to use the data strategically.

When data is fragmented, only a small portion of it is available to be analyzed. In my last post, I described MDF as a single trend, but it can occur in a number of ways.

Below are the most common forms of MDF:

  • Fragmentation across IT silos: Secondary IT operations such as backups, file sharing/storage, provisioning for test/development, and analytics are typically run in completely separate silos that don’t share data or resources, with no central visibility or control. This results in overprovisioning and waste, and makes it harder to meet service-level agreements (SLAs) or availability targets.
  • Fragmentation within a silo: There are even "silos within silos." Backup is a good example: it is not uncommon to have four to five separate backup solutions from different vendors to handle different workloads such as virtual, physical, database, and cloud. On top of that, each solution needs associated target storage, de-dupe appliances, media servers, etc., which propagates the silo problem.
  • Fragmentation due to copies: It’s been estimated that up to 60 percent of secondary data storage is taken up by copies, which needlessly consume space, add cost, and raise risk. Worse, the data is not re-purposed for other use cases, such as test/development (where frequent copies of data are made for developers to test or stage their apps) or analytics (where data is copied and centralized in a lake or warehouse to run reports against).
  • Fragmentation across different locations: Today’s distributed, mobile organizations and easy access to cloud services mean there are more options than ever for data to be stored in multiple locations – perhaps without IT’s knowledge or control. And with the advent of edge computing and the Internet of Things (IoT), some data will never move from its edge location but will need to be managed in situ, away from conventional infrastructure and control.
  • Operational fragmentation: The specialized and siloed nature of secondary infrastructure and operations means IT is burdened with extra Opex and organizational overhead just to "keep the lights on," as well as extra cycles for coordination across functions to meet SLAs, recover from failures, manage upgrade cycles, troubleshoot support issues, and so on.
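
To make the "fragmentation due to copies" point concrete, here is a minimal Python sketch of how copy overhead across silos could be measured by hashing content. It is an illustration only: the silo names, object names, and the copy_overhead function are hypothetical, the data is made up, and real deduplication products typically work on chunks or blocks rather than whole objects.

```python
import hashlib
from collections import defaultdict


def copy_overhead(silos):
    """Measure how much stored data is a byte-identical copy of other data.

    `silos` maps a (hypothetical) silo name to a dict of object name -> bytes.
    Returns (total_bytes, redundant_bytes, duplicates), where `duplicates`
    maps a SHA-256 digest to every (silo, object, size) holding that content.
    """
    by_digest = defaultdict(list)
    for silo, objects in silos.items():
        for name, data in objects.items():
            digest = hashlib.sha256(data).hexdigest()
            by_digest[digest].append((silo, name, len(data)))

    # Every stored byte counts toward the total.
    total = sum(size for locs in by_digest.values() for _, _, size in locs)
    # The first location of each digest is the "original"; the rest are copies.
    redundant = sum(size for locs in by_digest.values() for _, _, size in locs[1:])
    duplicates = {d: locs for d, locs in by_digest.items() if len(locs) > 1}
    return total, redundant, duplicates


# Made-up example: one database dump copied into test/dev and analytics.
example_silos = {
    "backup": {"db.dump": b"A" * 100, "vm.img": b"B" * 50},
    "test_dev": {"db_copy.dump": b"A" * 100},
    "analytics": {"db_lake.dump": b"A" * 100, "logs": b"C" * 50},
}

total, redundant, duplicates = copy_overhead(example_silos)
print(f"{redundant} of {total} bytes ({redundant / total:.0%}) are copies")
for locations in duplicates.values():
    print("same content in:", [f"{silo}/{name}" for silo, name, _ in locations])
```

Whole-object hashing only catches exact copies, so treat the result as a lower bound on redundancy. The point of the sketch is simply that this kind of measurement requires visibility across all silos at once, which is what fragmented environments lack.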

Legacy storage systems perpetuate the problem of MDF

MDF should be considered a serious issue and an inhibitor to competing in the digital transformation era. Part of the problem is that the majority of the storage industry hasn’t developed a solution. Instead, it has perpetuated the problem with legacy systems that are outdated, flawed, and unsustainable, and that cost companies more and more with no relief in sight. Incremental improvements won’t solve the problem today or ever. It’s time for a storage rethink that is built from the ground up. This will enable businesses of all sizes to become more data-centric and compete with the likes of Amazon and Google.

How to solve the MDF problem

Secondary storage is where the majority of the problem lies. Re-architecting secondary storage can transform it from a liability into a strategic asset. Below are the core components of an MDF solution:

  • Predictive intelligence that analyzes the data and is able to anticipate needs and automate resources.
  • Data and apps that span silos to tap into information that was previously unreachable or largely invisible. This will generate considerably more value from a business’s intelligence and can help companies reduce compliance risks by surfacing sensitive data that may span multiple silos.
  • Consolidated solutions for backup, archiving, file sharing, test/development, and analytics on a single software platform, eliminating the need for the complex and limiting legacy infrastructures previously installed.
  • The ability for companies to manage any aspect of their secondary environment through a single, easy-to-use GUI – from setting protection policies and SLAs to managing data center or cloud environments globally, ensuring optimal use of resources and checking for regulatory compliance. No matter how technically advanced a solution is, the full value can’t be achieved without a well-designed interface.
  • The ability to run multiple applications on the same platform to exploit the value of secondary data. This allows companies to become more effective digital businesses.
  • An open platform to enable applications to be created either by the customer or by an ecosystem of partners and ISVs in an app marketplace.

The failure to connect information buried within silos has proven to be a costly and dangerous problem for enterprises, and the industry has done little to address it. A rethink of secondary storage that is built on the modernized design principles of hyperconvergence, web-scale, distributed file systems and cloud-fluency is required to meet the challenges of MDF.
