Data Quality Industry: Problem Solvers or Enablers?
Jim Harris in Data Quality, Debates tagged Best of 2010, Humor, Philosophy
Monday, October 25, 2010 at 10:30AM
This morning I had the following Twitter conversation with Andy Bitterer of Gartner Research and ANALYSTerical, sparked by my previous post about Data Quality Magic, the one and only source of which I posited comes from the people involved:
What Say You?
Although Andy and I were just joking around, there is some truth beneath these tweets. After all, according to Gartner research, “the market for data quality tools was worth approximately $727 million in software-related revenue as of the end of 2009, and is forecast to experience a compound annual growth rate (CAGR) of 12% during the next five years.”
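To put the quoted forecast in concrete terms, here is a back-of-the-envelope sketch of what a 12% CAGR implies. It assumes the roughly $727 million figure is the end-of-2009 base and that growth compounds once per year for five years; the function name `project_revenue` is a hypothetical helper, not anything from the Gartner report.

```python
def project_revenue(base_millions: float, cagr: float, years: int) -> float:
    """Compound a base revenue figure at a constant annual growth rate."""
    return base_millions * (1 + cagr) ** years


# Assumption: 2009 is year 0 and the 12% rate holds for each of the next five years.
for offset in range(6):
    print(2009 + offset, round(project_revenue(727, 0.12, offset)))

# The final line printed is for 2014: about 1281, i.e. roughly $1.28 billion.
```

Under those assumptions the market would grow by a little over three quarters in five years, which gives a sense of the money at stake in the question that follows.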
So I thought I would open this up to a good-natured debate.
Do you think the data quality industry (software vendors, consultants, analysts, and conferences) is working harder to solve the problem of poor data quality or perpetuate the profitability of its continued existence?
All perspectives on this debate are welcome without bias. Therefore, please post a comment below.
(Please Note: Comments advertising your products and services (or bashing your competitors) will NOT be approved.)
Related Posts
Which came first, the Data Quality Tool or the Business Need?
Do you believe in Magic (Quadrants)?
Can Enterprise-Class Solutions Ever Deliver ROI?
The Once and Future Data Quality Expert
Imagining the Future of Data Quality



Reader Comments (10)
Good question Jim. I sometimes compare our profession with that of dentists. Dentists are also believed to advocate for good habits around your teeth, but are making money when these good habits aren’t followed.
So when 4 out of 5 dentists recommend a certain toothpaste, it is probably no good :-)
Seriously though, I take the amount of money spent on data quality tools as a sign that organizations believe there are issues best solved with technology. Of course these tools aren’t magic.
Data quality tools only solve a certain part of your data and information related challenges. On the other hand, the few problems they do solve may be solved very well and cannot be solved by any other line of products or in any practical way by humans in any quantity or quality.
Just my two cents, but most organizations don't need the help of an industry to struggle with their data.
With technology getting faster and faster and more and more all the time, it just gives us the opportunity to capture and create more and more data. Of course, some of it will be bad (and if you're a smart vendor you build that in for follow on work, just like a car manufacturer would . . .).
We now look at everything and try and capture as much information as we can, where before we would only capture what we truly had to. Now we capture everything and try to use analytics and tools, but if the data is bad it doesn't work.
As time goes on bad data is not tolerated as much and with legal requirements and rules, companies are getting serious about it. I do think there is more effort being put into data quality than there used to be because people are recognizing the value of their data AND the cost of bad data.
And yes, this has created more opportunities in the area of data quality that weren't there before. But I think it's a good thing overall that data quality is an issue that companies are spending money on because it means they are starting to care about their data . . . or as @datachick says: "Love Your Data!"
It seems that everyone we talk to tells us they see data quality problems, and very often data that is not in the right form or not structured well enough.
People screw up data in the same way that people screw up their cars, computers, teeth or anything else. If you aren't meticulous, and don't take proper care of things, you will eventually need an expert to fix your problems.
The same goes for data. Therefore, I see us (data quality engineers) as similar to mechanics, technicians, or dentists. It's an absolutely necessary job, it's here to stay, and probably in a big way.
Thanks for your comments, Henrik, Phil, Rob, and Dan.
Your feedback on this debate is greatly appreciated.
@Henrik — I really like the dentist analogy. If we took better care of our teeth, then there would be fewer dentists, at least in part because it would be a less profitable industry.
I agree with your positive spin on the amount of money spent on data quality tools, but we both know that there are those within the data quality/information quality industry that essentially declare these tools to be evil (and that's almost not an exaggeration) because they declare data cleansing (the primary, but not exclusive, use of data quality tools) is a considerable cost with little to no ROI (i.e., Scrap and Rework, over and over again).
Therefore, these "experts" advocate defect prevention as the only data quality practice, which if everyone followed their advice, would ironically make these same experts as obsolete as dentists if everyone took care of their teeth.
These "experts" also criticize data quality tool vendors for only trying to sell software, when they are hypocritically only trying to sell books, seminars, training classes, and consulting.
Sorry, I will stop now before I turn this debate into a rant — too late :-)
@Phil — I agree that most organizations don't need any help to struggle with their data, but I am pondering how much help the data quality industry is providing organizations to help them overcome these struggles, especially since the non-struggling organizations do not make the industry any money, i.e., they buy less software, consultancy, etc.
@Rob — Yes, you raise an excellent point about our increased (and increasing) storage and processing capacity, allowing us to capture virtually unlimited amounts of data, which has increased both the prevalence of data quality issues and the recognition of the importance of having high quality data.
However, I think one of the most important adjustments that the data quality industry needs to be helping organizations face is that we can NOT continue to try to manage all our data — there is simply too much of it for that to be a feasible strategy because then organizations would be managing data for the sake of managing data, not for the sake of supporting improved business decisions, especially the low-latency decisions necessary for optimized business performance.
Not to be too cynical, but I think that the industry, especially the mega-vendors, is more interested in selling great big bags of technology with great big price tags than in helping organizations attempt to solve their data-related business problems.
@Dan — Although I agree that data quality is an absolutely necessary job, and it's here to stay, I can't help but point out that it sounds more like enabling language than problem solving language.
Again, I agree with you and I am not trying to be contentious just for the sake of it. However, although data quality may be a problem that can never be solved in an absolute way, I can't help but wonder if we are defining the problem in that way in order to perpetuate the profitability (ironically, for the "solution providers") of its continued existence.
I agree with these posts and also think that the expectations of clients from their data quality vendors have grown tremendously over the past few years. This is, of course, in line with most everything in the Web 2.0 cloud world that has become point-and-click, on-demand response.
In the olden days of 2002, I remember clients asking for vendors to adjust data only to the point where dashboard statistics could be presented on a clean Java user interface. I have noticed that some clients today want the software to not just run customizable reports, but to extract any form of data from any type of database, to perform advanced ETL and calculations with minimal user effort, and to be easy to use. It's almost like telling your dentist to fix your crooked teeth with no anesthesia, no braces, no pain, during a single office visit.
Of course, the reality today does not match the expectation, but data quality vendors and architects may need to step up their game to remain cutting edge.
Whilst I, and all the other consultants and data quality suppliers that I know, are definitely working to improve data quality, I do note that, in many cases, their marketing, sales and development departments are very happy to sell their products on the back of the latest buzz word or data quality fad. If a tool worked great for data management, and then somebody coins the phrase "master data management", selling the same (or re-branded) software as an MDM tool to the same market is not beyond the moral compass of most vendors.
I do not support those practices (especially as my business suffers because I'm too down to earth and don't have the patience to humor potential customers when they're buzzing around on their flights of fancy), but it is inevitable in an industry where we can't agree on definitions and where having a new buzz word every 6 months is de rigueur.
Shame we can't spend less time contemplating such fripperies and more actually resolving the issues!
From the LinkedIn Group for the IAIDQ Open Community, Javed (Jay) Zaidi commented:
“I'm of the opinion that vendors are in the business of selling products and the buyers have to ensure that they don't get caught up in the hype. As consumers, it is our responsibility to do the due diligence, utilize expert opinions and unbiased third parties to assist in the decision making process - and only purchase products and services that add value and significantly improve the status quo.”
From the LinkedIn Group for Data Governance & Data Quality, Angie Couron commented:
“Let me turn it around to another question: What COULD the Data Quality industry provide that would actually make this an easy question to answer in the positive?
My answer: Stop making companies pay you to learn!
I have been on one too many projects where the company had to pay you to learn that they had multi-byte data that gets goobered up during export/import routines because no one checked a setting. And it happened more than once on the project causing delays! Data that gets goobered up due to a foreign character in the address string (like a # sign) and it stops a load from running or drops all the data in the line after that.
This is not rocket science yet each time I deal with a consultant or a software program, there is no strategic experience offered to add value. Usually the consultant walks away and there is a huge gap in IT leaving no one with knowledge to run the data quality tool.
Data Quality tools are the only application type I'm aware of that companies don't require training or support to be in place before deployment. TCO on these tools MUST include the IP and the lights on support that go with them!”
And I responded:
There are many important considerations when implementing a new data quality tool, but I have also witnessed the tendency to not consider training, mentoring, and knowledge transfer an essential business requirement.
This was the subject of one of my earliest blog posts: Are You Afraid Of Your Data Quality Solution?
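The two failures Angie describes above (multi-byte data garbled by a wrong export/import setting, and a # sign in an address dropping the rest of the line) are easy to reproduce. The following is only a minimal sketch under stated assumptions: the sample name and address are made up, the "wrong setting" is assumed to be a UTF-8 file read back as Latin-1, and `naive_parse` is a hypothetical stand-in for a loader that treats # as a comment marker. It does not represent any particular data quality tool.

```python
import csv
import io

# Case 1: multi-byte text exported as UTF-8 but imported with the wrong codec.
original = "Zoë Müller, 12 Rue de l'Église"   # made-up sample record
exported = original.encode("utf-8")
misread = exported.decode("latin-1")   # assumed wrong setting: each 2-byte character becomes 2 junk characters
correct = exported.decode("utf-8")     # matching setting: round-trips cleanly


# Case 2: a '#' inside an address field.
def naive_parse(line: str) -> list:
    """Hypothetical loader that treats '#' as the start of a comment."""
    return line.split("#", 1)[0].split(",")


def careful_parse(line: str) -> list:
    """Parse with the standard csv module, which gives '#' no special meaning."""
    return next(csv.reader(io.StringIO(line)))


row = "ACME Corp,100 Main St Apt #4,Springfield,IL"   # made-up sample record

print(misread == original, correct == original)
print(naive_parse(row))     # everything after the '#' is silently lost
print(careful_parse(row))   # all four fields survive
```

Checks like these take minutes to run against a sample extract before a load, which is the kind of up-front experience the comment argues clients should not have to pay for twice.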
From the LinkedIn Group for the IAIDQ Open Community, C. Lwanga Yonke commented:
“Seems to me, the more the data quality industry can build and point to a growing list of documented successes (case studies of problems solved), the more demand there will be for its services since data quality problems are so pervasive.
So in other words, for the data quality industry, working hard to solve the problems of data quality may be the best way to ensure its long-term existence and profitability.”
And I responded:
I definitely agree with you that the best way for the data quality industry to ensure both its long-term existence and profitability is to focus on solving business problems caused by poor data quality.
And, like you said, let's hear more about the success stories and the case studies of the problems solved. Nowadays, we seem to hear only about the problems, but not the solutions (not counting the over-hyped sales message that buying the data quality tool is in itself the solution).
And Javed (Jay) Zaidi responded:
“At the end of the day, a tool is just that - it can't solve your problems by itself. So users should go into a tool purchase with their eyes open and not buy into the hype. I do agree that it would help tremendously, if we are able to see and hear of more successes with clearly defined business benefits.”