Beyond a “Single Version of the Truth”
By Jim Harris, Thursday, November 12, 2009 at 5:15AM
Categories: Blogs, Books, Data Quality, Debates. Tags: Best of 2009, Blog-Bout, Philosophy, Thomas Redman.
This post is involved in a good-natured contest (i.e., a blog-bout) with two additional bloggers: Henrik Liliendahl Sørensen and Charles Blyth. Our contest is a Blogging Olympics of sorts, with the United States, Denmark, and England competing for the Gold, Silver, and Bronze medals in an event we are calling “Three Single Versions of a Shared Version of the Truth.”
Please take the time to read all three posts and then vote for who you think has won the debate (see poll below). Thanks!
The “Point of View” Paradox
In the early 20th century, within his Special Theory of Relativity, Albert Einstein introduced the concept that space and time are interrelated entities forming a single continuum, and therefore the passage of time can be a variable that could change for each individual observer.
One of the many brilliant insights of special relativity was that it could explain why different observers can make validly different observations – it was a scientifically justifiable matter of perspective.
It was Einstein's apprentice, Obi-Wan Kenobi (to whom Albert explained “Gravity will be with you, always”), who stated:
“You're going to find that many of the truths we cling to depend greatly on our own point of view.”
The Data-Information Continuum
In the early 21st century, within his popular blog post The Data-Information Continuum, Jim Harris introduced the concept that data and information are interrelated entities forming a single continuum, and that speaking of oneself in the third person is the path to the dark side.
I use the Dragnet definition for data – it is “just the facts” collected as an abstract description of the real-world entities that the enterprise does business with (e.g., customers, vendors, suppliers).
Although a common definition for data quality is fitness for the purpose of use, the common challenge is that data has multiple uses – each with its own fitness requirements. Viewing each intended use as the information that is derived from data, I define information as data in use or data in action.
Quality within the Data-Information Continuum has both objective and subjective dimensions. Data's quality is objectively measured separate from its many uses, while information's quality is subjectively measured according to its specific use.
Objective Data Quality
Data quality standards provide a highest common denominator to be used by all business units throughout the enterprise as an objective data foundation for their operational, tactical, and strategic initiatives.
In order to lay this foundation, raw data is extracted directly from its sources, profiled, analyzed, transformed, cleansed, documented and monitored by data quality processes designed to provide and maintain universal data sources for the enterprise's information needs.
At this phase of the architecture, the manipulations of raw data must be limited to objective standards and not be customized for any subjective use. From this perspective, data is now fit to serve (as at least the basis for) each and every purpose.
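To make the idea of an objective data foundation a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the CustomerRecord fields, the standardize and profile helpers, and the specific rules (collapsing whitespace, one casing for country codes and email addresses) are hypothetical and are not taken from any particular data quality tool. The point is only that these manipulations do not depend on who will use the data or why.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerRecord:
    # Hypothetical fields describing a real-world entity ("just the facts").
    name: str
    country: str
    email: str


def standardize(raw: dict) -> CustomerRecord:
    """Apply only objective, use-agnostic rules to a raw source record."""
    return CustomerRecord(
        name=" ".join(raw.get("name", "").split()),      # collapse stray whitespace
        country=raw.get("country", "").strip().upper(),  # one casing for country codes
        email=raw.get("email", "").strip().lower(),      # one casing for email addresses
    )


def profile(records: list[CustomerRecord]) -> dict[str, float]:
    """Measure completeness per field, independent of any intended use."""
    total = len(records) or 1  # avoid dividing by zero on an empty source
    return {
        field_name: sum(1 for r in records if getattr(r, field_name)) / total
        for field_name in ("name", "country", "email")
    }
```

Nothing in this sketch decides whether a missing email address is acceptable. That judgment depends on the intended use, so it belongs to the subjective information quality standards rather than to the objective foundation.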
Subjective Information Quality
Information quality standards (starting from the objective data foundation) are customized to meet the subjective needs of each business unit and initiative. This approach leverages a consistent enterprise understanding of data while also providing the information necessary for day-to-day operations.
But please understand: customization should not be performed simply for the sake of it. You must always define your information quality standards by using the enterprise-wide data quality standards as your initial framework.
Whenever possible, enterprise-wide standards should be enforced without customization. The key word within the phrase “subjective information quality standards” is standards, as opposed to subjective, which can quite often be misinterpreted as “you can do whatever you want.” Yes, you can – just as long as you have justifiable business reasons for doing so.
This approach to implementing information quality standards has three primary advantages. First, it reinforces a consistent understanding and usage of data throughout the enterprise. Second, it requires each business unit and initiative to clearly explain exactly how they are using data differently from the rest of your organization, and more important, justify why. Finally, all deviations from enterprise-wide data quality standards will be fully documented.
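One way to picture this approach is the rough Python sketch below. It is an illustration under stated assumptions, not a reference implementation: the ENTERPRISE_STANDARDS rules, the InformationStandard class, and its method names are all hypothetical. It shows the three advantages in miniature: enterprise-wide standards are the default, a deviation is rejected unless it comes with a business justification, and every accepted deviation is recorded.

```python
from dataclasses import dataclass, field

# Hypothetical enterprise-wide data quality standards (rule name -> setting).
ENTERPRISE_STANDARDS = {
    "customer_name_required": True,
    "postal_address_required": True,
    "email_required": False,
}


@dataclass
class InformationStandard:
    """A business unit's standards: enterprise defaults plus justified deviations."""

    business_unit: str
    deviations: dict = field(default_factory=dict)  # rule -> (value, justification)

    def customize(self, rule: str, value, justification: str) -> None:
        # Only existing enterprise rules can be customized.
        if rule not in ENTERPRISE_STANDARDS:
            raise KeyError(f"unknown enterprise rule: {rule}")
        # "You can do whatever you want" is not accepted without a reason.
        if not justification.strip():
            raise ValueError("a deviation requires a business justification")
        self.deviations[rule] = (value, justification)

    def effective(self) -> dict:
        """Enterprise standards with this unit's deviations layered on top."""
        rules = dict(ENTERPRISE_STANDARDS)
        rules.update({rule: value for rule, (value, _) in self.deviations.items()})
        return rules

    def documented_deviations(self) -> list[str]:
        """A human-readable record of how and why this unit differs."""
        return [
            f"{self.business_unit}: {rule}={value!r} because {why}"
            for rule, (value, why) in self.deviations.items()
        ]
```

For example, a hypothetical marketing unit could call customize("email_required", True, "email campaigns need a deliverable address"). Its effective() standards would then differ from the enterprise defaults in exactly one documented place, and the enterprise defaults themselves would remain unchanged.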
The “One Lie Strategy”
A common objection to separating quality standards into objective data quality and subjective information quality is the enterprise's significant interest in creating what is commonly referred to as a “Single Version of the Truth.”
However, in his excellent book Data Driven: Profiting from Your Most Important Business Asset, Thomas Redman explains:
“A fiendishly attractive concept is...'a single version of the truth'...the logic is compelling...unfortunately, there is no single version of the truth.
For all important data, there are...too many uses, too many viewpoints, and too much nuance for a single version to have any hope of success.
This does not imply malfeasance on anyone's part; it is simply a fact of life.
Getting everyone to work from a single version of the truth may be a noble goal, but it is better to call this the 'one lie strategy' than anything resembling truth.”
Beyond a “Single Version of the Truth”
In the classic 1985 film Mad Max Beyond Thunderdome, the title character arrives in Bartertown, ruled by the evil Auntie Entity, where people living in the post-apocalyptic Australian outback go to trade for food, water, weapons, and supplies. Auntie Entity forces Mad Max to fight her rival Master Blaster to the death within a gladiator-like arena known as Thunderdome, which is governed by one simple rule:
“Two men enter, one man leaves.”
I have always struggled with the concept of creating a “Single Version of the Truth.” I imagine all of the key stakeholders from throughout the enterprise arriving in Corporatetown, ruled by the Machiavellian CEO known only as Veritas, where all business units and initiatives must go to request funding, staffing, and continued employment. Veritas forces all of them to fight their Master Data Management rivals within a gladiator-like arena known as Meetingdome, which is governed by one simple rule:
“Many versions of the truth enter, a Single Version of the Truth leaves.”
For any attempted “version of the truth” to truly be successfully implemented within your organization, it must take into account both the objective and subjective dimensions of quality within the Data-Information Continuum.
Both aspects of this shared perspective of quality must be incorporated into a “Shared Version of the Truth” that enforces a consistent enterprise understanding of data, but that also provides the information necessary to support day-to-day operations.
The Data-Information Continuum is governed by one simple rule:
“All validly different points of view must be allowed to enter,
In order for an all-encompassing Shared Version of the Truth to be achieved.”
You are the Judge
This post is involved in a good-natured contest (i.e., a blog-bout) with two additional bloggers: Henrik Liliendahl Sørensen and Charles Blyth. Our contest is a Blogging Olympics of sorts, with the United States, Denmark, and England competing for the Gold, Silver, and Bronze medals in an event we are calling “Three Single Versions of a Shared Version of the Truth.”
Please take the time to read all three posts and then vote for who you think has won the debate. A link to the same poll is provided on all three blogs. Therefore, wherever you choose to cast your vote, you will be able to view an accurate tally of the current totals.
The poll will remain open for one week, closing at midnight on November 19 so that the “medal ceremony” can be conducted via Twitter on Friday, November 20. Additionally, please share your thoughts and perspectives on this debate by posting a comment below. Your comment may be copied (with full attribution) into the comments section of all of the blogs involved in this debate.
Related Posts
The General Theory of Data Quality
The Data-Information Continuum
Reader Comments (11)
Jim, reinstalling your take on the data-information continuum having the dimensions of objective data quality and subjective information quality is a powerful and feared weapon in our blog contest.
Adding your well known combination of Hollywood and Einstein makes it very hard to beat you in the voting.
Excellent post as ever.
Thanks Henrik,
I honestly believe that your entry was the best in the blog-bout:
Sharing data is key to a single version of the truth
I think Charles (Tell me the Truth!) did an excellent job presenting why a "Single Version of the Truth" is such an important and persuasive topic and I presented the opposite viewpoint.
Whereas your blog post struck the right balance between the two.
All in all, I am very pleased with our collective contribution to this discussion.
Best Regards,
Jim
Jim,
I really like your definition of information as data in use or data in action, and have been a fan of your 'Data-Information Continuum' since I first read it. I must admit though, I like continuums (weird I know!!!)
You have a very balanced view here, which is great. As I responded to your comments on my blog, I think we have achieved something here that can be taken further. If you break down what each of us has said in the 'contest', I think that you will have a 'toolbox' of responses for a wide range of audiences, which is something that is critical for getting buy-in on any Data Governance initiative.
I think there is further discussion required on your definition of “subjective information quality standards”, my “contextual single version of the truth” and Henrik's “making data fit for multiple intended uses of shared data in the enterprise”.
I see a lot of similarities here worth discussing.
Cheers,
Charles
Thanks Charles,
Who doesn't like continuums? I wish I had named my blog the Data-Information Continuum!
(However, it doesn't make for a good acronym like Obsessive-Compulsive Data Quality does.)
Between the three of us, I honestly believe that my material was the weakest. I kind of "phoned it in" so to speak since I basically just repackaged existing content that I already had lying around my blog. You and Henrik did a much better job at bringing a fresher and more well thought out approach to this discussion.
Your point about contributing to a 'toolbox' of responses for diverse audiences was what I hoped we would set in motion with this debate. On that front, I consider this joint-venture to already be a success - regardless of who ultimately brings home the gold - which, by the way, I am so totally gonna win ;-)
And I definitely agree with you (and Henrik) that we have much more to discuss - a "Single Version of the Truth" is one of the many "Neverending Stories" in our profession because basically the discussion never truly ends.
Best Regards,
Jim
I would be remiss if I did not acknowledge Bryan Scott Larkin over at the Dirty Data Donkey Blog for his timely and excellent blog post contribution to this discussion:
The Data Quality Blog Olympics
(Please imagine the voice of Shrek saying this and in the most positive way possible):
Well done, Donkey!
P.S. If you couldn't tell, I am a huge fan of the Shrek movies! :-)
My sleep deprived mind is withering, must make it to the weekend...
Okay, great discussion guys, loving the different styles and views coming through, Bryan has gone and commented so I've got to delve into that now, last thing we need is new runners on the track!
I'm a single version of the truth kind of guy. I think we often use the subjective nature argument to denote information that really should be classed as a separate fact. When two different people interpret the same data differently and use it for different purposes - are we not observing the need for a new fact to be defined?
Then again, apologies if this is where you were driving from, I think I've fallen into a sleep deprivation continuum as a result of our recent new family addition.
Let's face it, anyone who can include Mad Max, Einstein, Tom Redman and Obi Wan Kenobi in less than 8 paragraphs surely deserves a vote?
Neck and neck to the post is my prediction...
Thanks Dylan,
You make excellent sense for someone working with a total of six hours of sleep in the past three days!
My concern is the overuse of the "subjective nature argument" to preserve a "Single Version of the Truth" by allowing everyone to define a new fact. As I stated in the post, "subjective" is often misinterpreted as "you can do whatever you want." Yes, you can – just as long as you have justifiable business reasons for doing so.
When "new facts" are created without verifying whether they really are new facts (yes, many people define, for example, Revenue in different ways; however, some people define it the same way but still want to create a new definition so it can be "their definition") AND without providing business justification for why they should be new facts, then the "Single Version of the Truth" just implodes back into the "Siloed Version of Everybody has their own Version of the Truth."
I believe that emphasizing a "Shared Version of the Truth" that allows for a consistent enterprise understanding of data, but that also provides for the necessary flexibility for the enterprise to function day-to-day and adapt more quickly when changes have to be made, is a far more effective framework for success.
Best Regards,
Jim
Over on Henrik's blog, Dean Groves contributed the following great comment:
Henrik, Charles and Jim,
Thanks for a particularly interesting and enjoyable event. This topic will surely generate useful debate for some time.
Here’s what occurred to me as I read your entries:
What if we changed “version” to “vision”?
Would we be more comfortable with a “single vision of the truth” than we are with a “single version”?
In this context, “vision” implies that, whether or not there exists an absolute truth, we acknowledge that we’re human, and that we succeed together to the extent that we agree on what’s true.
What is the truth anyway?
What is proven to be true today will be disproven tomorrow.
The goalposts are always changing. We never really know the whole truth; we just get closer to it as new evidence emerges. By the time we've finished arguing over what the single vision/version is, it will have changed.
Data Quality is dead, long live Data Quality.
Thanks to Phil Allen for an insightful comment with an amusing conclusion (those are my favorite kind).
And special thanks to Rich Murnane for his excellent blog post contribution to this discussion:
Single Version of the Truth? An online battle of wit and knowledge sharing for those interested in Data Quality
Thanks again to everyone who voted - and not just those who voted for me :-)
Please check out these related blog posts:
The Results are in by Charles Blyth
The truth the whole truth and nothing but the truth by Ken O'Connor
Single Version of the Truth VS Single Interpretation of the Truth by Bryan Scott Larkin