Data quality for dummies, Part 1

Opinion
Dec 06, 2005
Data Center

The challenge of maintaining data quality

Garbage in, garbage out: That’s an adage dating back to the dark ages of computing. It can also serve as a call to arms for any organization seeking to make effective use of organizational data to improve customer service, focus product and service development, and respond in real time to rapid changes in a business environment.

The challenge, quite simply, is that while companies will invest tens of millions of dollars in information initiatives such as CRM, supply-chain management (SCM), ERP, and business analytics (BA), they almost universally neglect the one thing that is the cornerstone of all these initiatives: the data.

In a recent Nemertes benchmark on the topic, we found that data-quality management (DQM) is the single biggest challenge under the broad umbrella of “information stewardship” – the art and science of ensuring that data within an enterprise is adequately stored, secured and backed up, within compliance requirements, and accurate. The bottom line? So long as companies fail to focus explicitly on DQM, their expensive and elaborate information initiatives will fall short.

What should organizations do? First, make sure there’s an appropriate process in place to keep data accurate from the moment it enters the system until the instant it’s retired or deleted. Many enterprise applications, including ERP, SCM, and CRM software, have built-in tools to help catch data-entry errors in real time. Third-party software applications also can help define data-entry rules and avoid mistakes (check out products from DataFlux, Firstlogic, IBM, Similarity, and others). For errors that aren’t caught at the point of entry, data profiling, or discovery, is key. Records can be checked against known demographic databases, for instance, so that general personal information (addresses, phone numbers) is verified and updated. Similar applications can look for anomalies in other, less “common” data types, such as product information and financials, and match them against the policies and information supplied by the company.
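To make the idea concrete, here is a minimal sketch, in Python, of the kind of point-of-entry rule checking those tools automate. The field names and rules are hypothetical, and real DQM products ship with far richer rule sets and reference data.

```python
import re

# Hypothetical point-of-entry rules: each field maps to a check that must pass
# before the record is accepted.
RULES = {
    "phone":    lambda v: re.fullmatch(r"\d{3}-\d{3}-\d{4}", v) is not None,
    "zip_code": lambda v: re.fullmatch(r"\d{5}(-\d{4})?", v) is not None,
    "price":    lambda v: v.replace(".", "", 1).isdigit() and float(v) > 0,
}

def validate_record(record):
    """Return the names of any fields that fail their data-entry rule."""
    return [field for field, check in RULES.items()
            if field in record and not check(str(record[field]))]

# Example: a malformed phone number is caught before it ever reaches the database.
print(validate_record({"phone": "55-1234", "zip_code": "10001", "price": "19.99"}))
# -> ['phone']
```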

But data quality goes beyond basic accuracy. Duplicate data (or triplicate, or beyond) can clog databases and applications, slow performance, result in inaccurate reporting, and create a marketer’s nightmare, especially if the dupes don’t match exactly. This is a common issue for companies that merge with or acquire other companies, inheriting their applications and databases along the way. It’s also an issue for companies as they integrate their existing systems – tying, say, their CRM and ERP data together.
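As a rough illustration of why inexact duplicates are hard to catch, here is a small Python sketch that flags near-duplicate customer names using a normalized form and a string-similarity score. The record layout and the 0.7 threshold are assumptions for the example; commercial matching engines compare multiple fields with far more sophisticated algorithms.

```python
from difflib import SequenceMatcher

def normalize(name):
    """Crude normalization: lowercase, drop punctuation, collapse whitespace."""
    cleaned = "".join(c for c in name.lower() if c.isalnum() or c.isspace())
    return " ".join(cleaned.split())

def likely_duplicates(records, threshold=0.7):
    """Flag pairs of records whose normalized names are suspiciously similar."""
    pairs = []
    for i, a in enumerate(records):
        for b in records[i + 1:]:
            score = SequenceMatcher(None, normalize(a["name"]), normalize(b["name"])).ratio()
            if score >= threshold:
                pairs.append((a["name"], b["name"], round(score, 2)))
    return pairs

# Two records from merged systems describe the same customer but don't match exactly.
customers = [{"name": "Acme Corp."}, {"name": "ACME Corporation"}, {"name": "Beta LLC"}]
print(likely_duplicates(customers))
# -> [('Acme Corp.', 'ACME Corporation', 0.72)]
```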

Metadata management, or ensuring that the data within enterprise systems conforms to the rules and policies set by the database managers when they built the system, is also important. Finally, companies with a meaningful DQM plan will perform ongoing data monitoring, continually checking for and correcting errors. This prevents data-entry mistakes from dirtying the database over the long term, and it also helps keep aging data current as people move and their demographic information changes, or as product prices and specs shift with the market.
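A minimal sketch of what ongoing monitoring might look like in practice, assuming a hypothetical customer table and a few made-up conformance rules: a scheduled job re-checks every record against the policies the database designers set and reports violations to whoever owns the data.

```python
from datetime import date, timedelta

# Hypothetical conformance rules, mirroring the policies set when the schema was designed.
POLICIES = [
    ("missing email",         lambda r: not r.get("email")),
    ("not verified in 2 yrs", lambda r: (date.today() - r["last_verified"]).days > 730),
    ("negative price",        lambda r: r.get("price", 0) < 0),
]

def monitor(records):
    """One data-monitoring pass: report every record that violates a policy."""
    report = []
    for r in records:
        violations = [name for name, broken in POLICIES if broken(r)]
        if violations:
            report.append((r["id"], violations))
    return report

# Run on a schedule (nightly, say) and route the report to the data stewards.
rows = [
    {"id": 1, "email": "a@example.com", "last_verified": date.today() - timedelta(days=30),  "price": 19.99},
    {"id": 2, "email": "",              "last_verified": date.today() - timedelta(days=900), "price": -5.00},
]
print(monitor(rows))
# -> [(2, ['missing email', 'not verified in 2 yrs', 'negative price'])]
```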

Sound daunting? It needn’t be. Stay tuned for some insights in our next newsletter on how to put these good ideas into practice.