There are no Magic Beans for Data Quality

The CIO put Jack in charge of an enterprise initiative with a sizable budget to spend on improving data quality.

Jack was sent to a leading industry conference to evaluate data quality vendors.  While his flight was delayed, Jack was passing the time in the airport bar when he was approached by Machiavelli, a salesperson from a data quality software company called Magic Beans.

Machiavelli told Jack that he didn't need to go to the conference to evaluate vendors.  Instead, Jack could simply trade his entire budget for an unlimited license of Magic Beans.

Machiavelli assured Jack that Magic Beans had the following features:

  • Simple to install
  • Remarkably intuitive user interface
  • Processes a gazillion records per nanosecond
  • Clairvoyantly detects and corrects existing data quality problems
  • Prevents all future data quality problems from happening

Jack agreed to the trade and went back to the office with Magic Beans.

Eighteen months later, Jack and the CIO carpooled to Washington, D.C. to ask Congress for a sizable bailout.

What is the moral of this story? 

(Other than never trust a salesperson named Machiavelli.)

There are many data quality vendors to choose from and all of them offer viable solutions driven by impressive technology.

However, technology sometimes carries with it a dangerous conceit – that what works in the laboratory and the engineering department will work in the boardroom and the accounting department, that what is true for the mathematician and the computer scientist will be true for the business analyst and the data steward.

My point is neither to discourage the purchase of data quality software, nor to try to convince you which vendor I think provides the superior solution – especially since these types of opinions are usually biased by the practical limits of your personal experience and motivated by the kind folks who are currently paying your salary.

And I am certainly not a Luddite opposed to the use of technology.  I am first, foremost, and proudly a techno-geek of the highest order.  However, I have seen too many projects fail when a solution to data quality problems was attempted by “throwing technology at it.”  I have seen beautifully architected, wonderfully coded, elegantly implemented technical solutions result in complete and utter failure.  These projects failed neither because using technology was the wrong approach nor because the wrong data quality software was selected.

Data quality solutions require a holistic approach involving people, methodology, and technology.

People

Sometimes, people doubt that data quality problems could be prevalent in their systems.  This “data denial” is not necessarily a matter of blissful ignorance, but is often a natural self-defense mechanism among the data owners on the business side and/or the process owners on the technical side.  No one likes to feel blamed for causing or failing to fix data quality problems.  This is one of the many human dynamics that are missing from the relative clean room of the laboratory where the technology was developed.  You must consider the human factor because it will be the people involved in the project, and not the technology itself, who will truly make the project successful.

Methodology

Data characteristics and their associated quality challenges vary from company to company.  Data quality can be defined differently by different functional areas within the same company.  Business rules can change from project to project.  Decision makers on the same project can have widely varying perspectives.  All of this points to the need for an effective methodology, which will help you make the most of your time and effort as well as the subsequent return on whatever technology you invest in.

Technology

I have used software from most of the vendors in the Gartner Data Quality Magic Quadrant and from many of the so-called niche vendors.  So I speak from experience when I say that all data quality vendors have viable solutions driven by impressive technology.  However, don't let the salesperson “blind you with science” into having unrealistic expectations of the software.  I am not trying to accuse all salespeople of Machiavellian machinations (even though we have all encountered a few who would shamelessly sell their mother’s soul to meet their quota).

Conclusion

As with any complex problem, there is no fast and easy solution.  Although incredible advancements in technology continue, there are no Magic Beans for Data Quality.

And there never will be.

An organization's data quality initiative can only be successful when people take on the challenge united by collaboration, guided by an effective methodology, and of course, implemented with amazing technology.