Traditional Approach to Data Management Only Treats the Symptoms


Enterprise Data Governance Blog Series: No. 2

Winston Chen

In my last post, I argued that although we’ve thrown a huge amount of money at data problems, the results have been unsatisfactory. For poor data quality, we identified the root cause: the lack of transparency and accountability between providers and consumers of data.

In most organizations, because the relationships and rules of engagement between data providers and consumers are not transparent, data consumers naturally assume that the wizardry of IT is responsible for data. When data problems arise, IT gets the blame: IT becomes the de facto data owner. But IT typically doesn’t have the authority to address the root cause by telling data providers to bear the cost of good data for the benefit of the entire organization. So IT has to solve the problem in some other way.

Rather than focusing on data provider and consumer relationships, the traditional approach uses an array of technical tools to fix bad data that has already been created, and builds central repositories to hold the cleansed data. We use metadata tools to figure out where data lives and what it looks like. We use data quality tools to profile and cleanse bad data. We use data integration tools to move it around. And we build enterprise data warehouses and transactional master data hubs to store the end results. This approach suffers from three problems:

First, data cleansing and repository building are almost always carried out on a project-by-project basis. These projects are expensive. And even if a project succeeds and bad data is transformed into good data, the repository immediately starts to degrade. Newly created bad data creeps into the system, and the data already cleansed starts going stale as the real world it represents changes. Data has a shelf life and needs constant care and feeding. Because they do not address how bad data is created, these solutions are costly and unsustainable.

Second, it’s difficult to get the business side fully committed to and involved in these projects. Without a change of mindset, data continues to be seen as IT’s responsibility. To make matters worse, the software tools involved were designed for IT users, which leaves the business without a way to participate directly in the process. Without full and sustained business engagement, these projects often fail to yield the anticipated benefits.

Last, it is very, very hard to fix bad data using technical tools alone. A computer algorithm for data cleansing, no matter how cleverly constructed, can only address a very small subset of data problems. Bad data is often rooted in human behavior, which hard-coded validation rules cannot fix. In the billing address example, what’s preventing the sales rep from simply copying the shipping address as the billing address without verifying it with the customer? A computer algorithm would not even be able to detect that there is a problem, let alone fix it. Properly cleaning up customer data requires knowledge of each individual customer. And IT, no matter how business-oriented, will never have that kind of knowledge.

Given the magnitude of the challenges and constraints, IT has done an extraordinary job in most organizations. Without these actions, data problems today would be far worse. However, by and large these efforts treat the symptoms rather than addressing the root cause. Strictly speaking, these projects are themselves a cost of bad data, on top of the degradation of business performance. The bottom line is that data content shouldn’t be IT’s responsibility. With data volume and complexity exploding, the treadmill is spinning faster than the traditional approach can keep up.

_________________________________________________________

This post is part 2 of a multi-part series on Enterprise Data Governance. The other posts in the series are listed below.

Part 1: What’s the Root Cause of Bad Data?

Part 3: What do Environmental Policy and Data Governance Have in Common?

Part 4: Data Policies are the Instruments of Data Governance

Part 5: Data Governance Should be Formalized as a Business Process

Part 6: Send in the Yellow Jerseys: Organizing for Data Governance

Part 7: How to Set the Right Initial Scope for Data Governance?

Part 8: How to Build a Business Case for Data Governance?
