Redefining Data Quality
Jim Harris in Blogs, Data Quality, Debates, OCDQ Radio, Podcasts tagged Data Governance, Master Data Management, Philosophy
Monday, December 19, 2011 at 3:00AM
OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.
During this episode, I have an occasionally spirited discussion about data quality with Peter Perera, partially precipitated by his provocative post from this past summer, The End of Data Quality...as we know it, which included his proposed redefinition of data quality, as well as his perspective on the relationship of data quality to master data management and data governance.
Peter Perera is a recognized consultant and thought leader with significant experience in Master Data Management, Customer Relationship Management, Data Quality, and Customer Data Integration. For over 20 years, he has been advising and working with Global 5000 organizations and mid-size enterprises to increase the usability and value of their customer information.

Related Posts
You Say Potato and I Say Tater Tot
You only get a Return from something you actually Invest in
Listen to John Ladley discuss why Data and Information are Enterprise Assets on OCDQ Radio
Listen to Daragh O Brien discuss Data and Information Quality on OCDQ Radio
Listen to Gordon Hamilton discuss the Information Product on OCDQ Radio
Listen to Peter Benson discuss Metadata, Data, and Information on the Knights of the Data Roundtable
Data, Information, and Knowledge Management



Reader Comments (4)
Hi Jim,
Very interesting broadcast. One of the things that really stood out was the confusion between the objective and the subjective quality of data.
The woolliness that surrounds the subjective quality of data is easily dispelled when you introduce Function, i.e., what the data is going to be used for.
Your example about recruitment makes this quite clear. Do you need to know the physical address of an applicant for a job posting? When looking at the data on its own, it is not possible to say either way. You need to look at the Business Functions and the Business Rules within these Functions.
For the Business Function, "Confirm Receipt of Application", the physical address might not be required, as a reply to an e-mail address might suffice - if the business rule allowed that.
For the Business Function, "Select Applicants for Interview", the physical address would be of relevance if the business rule restricted invitations for certain categories of job postings to applicants living within, say, a 200km radius of the intended workplace. However, in order to make this selection, the street address might not be required, as the City Name and State would suffice. So here, by knowing the Business Function, we know when data needs to be known and precisely what that data needs to be.
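To make the two examples above concrete, here is a minimal Python sketch, offered only as an illustration. The `BusinessFunction` class, the `missing_fields` helper, the field names, and the sample applicant are all hypothetical; they are not taken from the broadcast or from any real system. The sketch assumes the business rule permits an e-mail reply for confirming receipt, and it leaves out the 200km distance check itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessFunction:
    """A business function and the data fields its rules require (hypothetical)."""
    name: str
    required_fields: frozenset


def missing_fields(record: dict, function: BusinessFunction) -> set:
    """Return the required fields that are absent or blank in the record."""
    return {
        field
        for field in function.required_fields
        if not str(record.get(field) or "").strip()
    }


# Assumed rule: receipt may be confirmed by replying to an e-mail address.
CONFIRM_RECEIPT = BusinessFunction(
    "Confirm Receipt of Application", frozenset({"email"})
)
# Assumed rule: city and state are enough to judge distance from the workplace.
SELECT_FOR_INTERVIEW = BusinessFunction(
    "Select Applicants for Interview", frozenset({"city", "state"})
)

# Illustrative record with no physical address details filled in.
applicant = {
    "name": "A. Applicant",
    "email": "applicant@example.com",
    "street": "",
    "city": "",
    "state": "",
}

for function in (CONFIRM_RECEIPT, SELECT_FOR_INTERVIEW):
    gaps = missing_fields(applicant, function)
    status = "fit for purpose" if not gaps else f"missing {sorted(gaps)}"
    print(f"{function.name}: {status}")
```

The same record is judged fit for one function and deficient for the other, which is the point: the verdict comes from the function's rules, not from the data viewed in isolation.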
The same is true when you talk about the quality of the experience that, say, a customer will have when dealing with the enterprise. Knowing the Business Functions and having the data to support them will enable the Customer to be provided a positive experience.
The major reason that the Data Quality world has such difficulty with the subjective side of data quality is that they have created this subjective element by trying to define and understand data in isolation from Function.
They have forgotten Data Rule 101: The only purpose of data in any enterprise is to support the Business Functions of that enterprise! and Data Rule 102: There are NO exceptions to Data Rule 101!! This applies in the worlds of Data Quality, Data Governance, and Master Data Management.
Regards,
John
From the LinkedIn Group for Data Quality Pro.com, Lisa Marie Martinez commented:
“Awesome! Assume I am respectful and understand the importance of fit for purpose and agree it's subjective or functional.
I am effective; understanding the real world objective (C-Level and Board) domain.
My expertise would be to insert the real world without disrupting the fit for purpose. Because, I have end to end and both business and technology and grew up connected (system) through effective business process design, re-engineer and run through to operate.
I've been struggling with a simple way to say what you both gracefully empowered me with on this radio broadcast.
BRAVO again.”
Our attempts to discuss a “Definition of Data Quality” presume that quality can be defined at all.
I am a casual follower of Robert Pirsig’s thinking on quality, and he would call me “stupid” for trying to “re-define” it, however gallant that attempt may be. According to Pirsig, “There is, in fact, no formal difference between inability to define (Quality) and stupidity. When I say, 'Quality cannot be defined,' I’m really saying formally, 'I’m stupid about Quality.'”
Pirsig further states:
“Because if Quality exists in the object, then you must explain just why scientific instruments are unable to detect it. You must suggest instruments that will detect it, or live with the explanation that instruments don’t detect it because your whole Quality concept, to put it politely, is a large pile of nonsense.”
In speaking about Phædrus, the main character in his cult book Zen and the Art of Motorcycle Maintenance, Pirsig states:
“On the other hand, if Quality is subjective, existing only in the observer, then this Quality that you make so much of is just a fancy name for whatever you like…If he (Phædrus) accepted the premise that Quality was objective, he was impaled on one horn of the dilemma. If he accepted the other premise that Quality was subjective, he was impaled on the other horn. Either Quality is objective or subjective, therefore he was impaled no matter how he answered.
The first horn (Objectivity) of Phædrus’ dilemma was, If Quality exists in the object, why can’t scientific instruments detect it? This horn was the mean one. From the start he saw how deadly it was. If he was going to presume to be some super-scientist who could see in objects Quality that no scientist could detect, he was just proving himself to be a nut or a fool or both. In today’s world, ideas that are incompatible with scientific knowledge don’t get off the ground.”
(Side observation: Objective implies we are talking about an object, which I do not believe data is, and that complicates the whole notion of data quality even further.)
“...no object, scientific or otherwise, is knowable except in terms of its qualities. This irrefutable truth seemed to suggest that the reason scientists cannot detect Quality in objects is because Quality is all they detect. The 'object' is an intellectual construct deduced from the qualities.”
(In the case of data, qualities — not quality — are accuracy, completeness, consistency, relevance, yada, yada, yada.)
Pirsig concludes: “A third rhetorical alternative to the dilemma, and the best one in my opinion, was to refuse to enter the arena. Phædrus could simply have said, 'The attempt to classify Quality as subjective or objective is an attempt to define it. I have already said it is undefinable,' and left it at that...”
Pirsig continues:
“...it (Quality) wasn’t subjective or objective either, it was beyond both of those categories. Actually this whole dilemma of subjectivity-objectivity, of mind-matter, with relationship to Quality was unfair. That mind-matter relationship has been an intellectual hang-up for centuries... And so: he rejected the left horn. Quality is not objective, he said. It doesn’t reside in the material world. Then: he rejected the right horn. Quality is not subjective, he said. It doesn’t reside merely in the mind...And finally: Phædrus, following a path that to his knowledge had never been taken before in the history of Western thought, went straight between the horns of the subjectivity-objectivity dilemma and said Quality is neither a part of mind, nor is it a part of matter. It is a third entity which is independent of the two...
The world now, according to Phædrus, was composed of three things: mind, matter, and Quality...Quality couldn’t be independently...related with either the subject or the object but could be found only in the relationship of the two with each other. It is the point at which subject and object meet...Quality is not a thing. It is an event.”
My proposed re-definition of data quality, “the best usable valued data,” simply seeks to recognize data quality as a perception, in which subjectivity and objectivity are entwined and are only liberated into separate concepts for us to discuss once it is experienced. The quintessential notion of pornography, “I know it when I see it,” applies equally to data quality. To define data quality in subjective and objective terms is irrelevant.
To me, Quality is a radiation, emanating from our thinking and interpretations, which because of our (or at least my) mental limitations, we have to reduce to a debate over which is more important — Subjectivity or Objectivity?
(Thanks to “Quality: A Challenge for Healthcare” by The Telmarc Group for pulling together this coherent collection of Robert Pirsig’s thoughts about quality.)
Peter,
Talking about Data Quality objectively in no way implies that one is suggesting that data is a physical object. An 'Objective View' is an opinion not influenced by personal feelings, interpretations, or prejudice; rather an unbiased opinion based on facts. A 'Subjective View' is, obviously, the opposite of this.
Without reference to the purpose of the data (i.e., the Business Functions it is meant to support) under discussion, all opinions on its quality are, necessarily, subjective.
Because so many (nearly all) discussions on Data Quality are carried out in the vacuum of Functionless worlds, nearly all data quality opinions are subjective.
Quality, in any field, can never be achieved if its definition is expressed in subjective terms. Data quality, expressed without the context of Business Functions, will always be subjective and, consequently, can never be attained.
Regards,
John