This Thursday is Thanksgiving Day, which in the United States is a holiday with a long, varied, and debated history. However, the most consistent themes remain family and friends gathering together to share a large meal and express their gratitude.
This is the eleventh entry in my ongoing series for expressing my gratitude to my readers for their commendable comments on my blog posts. Receiving comments is the most rewarding aspect of my blogging experience because not only do comments greatly improve the quality of my blog, comments also help me better appreciate the difference between what I know and what I only think I know. Which is why, although I am truly grateful to all of my readers, I am most grateful to my commenting readers.
“Recently got to listen in on a ‘cooperate or not’ discussion. (Not my clients.) What struck me was that the people advocating cooperation were big-picture people (from architecture and process) while those who just wanted what they wanted were more concerned about their own short-term gains than about system health. No surprise, right?
But what was interesting was that they were clearly looking after their own careers, and not their silos’ interests. I think we who help focus and frame the Stakeholder’s Dilemma situations need to be better prepared to address the individual people involved, and not just the organizational roles they represent.”
“As always, an intriguing post. Especially where you draw a parallel between Data Governance and Knowledge Management (wisdom management?). We sometimes portray data management (current term) as ‘well managed data administration’ (term from the 70s-80s). As for the debate on ‘data’ and ‘information’, I prefer to see everything written, drawn and/or stored on paper or in digital format as data with various levels of informational value, depending on the amount and quality of metadata surrounding the data item and the accessibility and usefulness (quality) of that item.
For example, 12024561414 is a number with low informational value. I could add metadata, for instance: ‘Phone number’, that makes it potentially known as a phone number. Rather than let you find out whose number it is we could add more information value and add more metadata like: ‘White House Switchboard’. Accessibility could be enhanced by improving formatting like: (1) 202-456-1414.
What I am trying to say with this example is that data items should be placed on a rising scale of informational value rather than be put on steps or firm levels of informational value. So the Information Hierarchy provided by Professor Larson does not work very well for me. It could work only if for all data items the exact information value was determined for every probable context. This model is useful for communication purposes.”
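The phone number example in this comment can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `DataItem` class, the `describe` method, and the idea of counting metadata entries as a stand-in for informational value are hypothetical and do not come from the commenter. The sketch shows the same raw value gaining value as metadata is added, plus the formatting step that improves accessibility.

```python
from dataclasses import dataclass, field


@dataclass
class DataItem:
    """A raw value plus whatever metadata has been attached to it so far."""
    value: str
    metadata: dict = field(default_factory=dict)

    def describe(self, key: str, text: str) -> "DataItem":
        """Return a new item with one more piece of metadata attached."""
        return DataItem(self.value, {**self.metadata, key: text})

    def informational_value(self) -> int:
        # Hypothetical proxy (an assumption for this sketch): each metadata
        # entry moves the item one step up the scale of informational value.
        return len(self.metadata)


def format_phone(digits: str) -> str:
    """Format an 11-digit number as (C) AAA-EEE-NNNN, as in the comment."""
    if len(digits) != 11 or not digits.isdigit():
        raise ValueError("expected exactly 11 digits")
    return f"({digits[0]}) {digits[1:4]}-{digits[4:7]}-{digits[7:]}"


raw = DataItem("12024561414")                            # just a number
typed = raw.describe("type", "Phone number")             # now a phone number
owned = typed.describe("owner", "White House Switchboard")

print(raw.informational_value(), typed.informational_value(), owned.informational_value())
print(format_phone(owned.value))
```

A simple count is obviously too crude for real use, but it makes the commenter's point visible: the value is a sliding scale that depends on the surrounding metadata, not a fixed level the item belongs to.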
“‘erised stra ehru oyt ube cafru oyt on wohsi.’
To all Harry Potter fans this translates to: ‘I show not your face but your heart’s desire.’
It refers to The Mirror of Erised. It does not reflect reality but what you desire. (Erised is Desired spelled backwards.) Often data will cast a reflection of what people want to see.
‘Dumbledore cautions Harry that the mirror gives neither knowledge nor truth and that men have wasted away before it, entranced by what they see.’ How many systems are really Mirrors of Erised?”
“Because the prisoners in the cave are chained and unable to turn their heads to see what goes on behind them, they perceive the shadows as reality. They perceive imperfect reflections of truth and reality.
Bringing the allegory to modern times, this serves as a good reminder that companies MUST embrace data quality for an accurate and REAL view of customers, business initiatives, prospects, and so on. Continuing to view half-truths based on possibly faulty data and information means you are just lost in a dark cave!
I also like the comparison to the Mirror of Erised. One of my favorite movies is the Matrix, in which there are also a lot of parallelisms to Plato’s Cave Allegory. As Morpheus says to Neo: ‘That you are a slave, Neo. Like everyone else you were born into bondage. Into a prison that you cannot taste or see or touch. A prison for your mind.’ Once Neo escapes the Matrix, he discovers that his whole life was based on shadows of the truth.
Plato, Harry Potter, and Morpheus — I’d love to hear a discussion between the three of them in a cave!”
“It is true that data is only a reflection of reality but that is also true of anything that we perceive with our senses. When the prisoners in the cave turn around, what they perceive with their eyes in the visible spectrum is only a very narrow slice of what is actually there. Even the ‘solid’ objects they see, and can indeed touch, are actually composed of 99% empty space.
The questions that need to be asked and answered about the essence of data quality are far less esoteric than many would have us believe. They can be very simple, without being simplistic. Indeed simplicity can be seen as a cornerstone of true data quality. If you cannot identify the underlying simplicity that lies at the heart of data quality you can never achieve it. Simple questions are the most powerful. Questions like, ‘In our world (i.e., the enterprise in question) what is it that we need to know about (for example) a Sale that will enable us to operate successfully and meet all of our goals and objectives?’ If the enterprise cannot answer such simple questions then it is in trouble. Making the questions more complicated will not take the enterprise any closer to where it needs to be. Rather it will completely obscure the goal.
Data quality is rather like a ‘magic trick’ done by a magician. Until you know how it is done it appears to be an unfathomable mystery. Once you find out that it is merely an illusion, the reality is absolutely simple and, in fact, rather mundane. But perhaps that is why so many practitioners perpetuate the illusion. It is not for self gain. They just don’t want to tell the world that, when it comes to data quality, there is no Tooth Fairy, no Easter Bunny, and no Santa Claus. It’s sad, but true. Data quality is boringly simple!”
“Actually I would go substantially further. Data was originally no more than a representation of the real world, and if validation was required the real world was the ‘authoritative source’, but that is clearly no longer the case. Data is in fact the new reality!
Data is now used to track everything; if the data is wrong the real-world item disappears. It may have really been destroyed or it may be simply lost, but it does not matter: if the data does not provide evidence of its existence then it does not exist. If you doubt this, just think of money: how much you have is not based on any physical object but on data.
By the way the theoretical definition I use for data is as follows:
Datum — a disruption in a continuum.
The practical definition I use for data is as follows:
Data — elements into which information is transformed so that it can be stored or moved.”
“We can see that there’s a trench between those who think adjacent means out of scope and those who think it means opportunity. Great leaders know that good stories make for better governance for an organization that needs to adapt and evolve, but stay true to its mission. Built from, but not about, real facts, good fictions are broadly true without being specifically true, and therefore they carry well to adjacent business processes where their truths can be applied to making improvements.
On the other hand, if it weren’t for nonfiction — accounts of real markets and processes — there would be nothing for the POSSIBLE to be adjacent TO. Managers often have trouble with this because they feel called to manage the facts, and call anything else an airy-fairy waste of time.
So a data governance program needs to assert whether its purpose is to fix the status quo only, or to fix the status quo in order to create agility to move into new areas when needed. Each of these should have its own business case and related budgets and thresholds (tolerances) in the project plan. And it needs to choose its sponsorship and data quality players accordingly.”
“I’ve been working on a definitive solution for the data / information / metadata / attributes / properties knot for a while now and I think I have it figured out.
I read your blog entitled The Semantic Future of MDM and we share the same philosophy even while we differ a bit on the details. Here goes. It’s all information. Good, bad, reliable or not, the argument whether data is information or vice versa is not helpful. The reason data seems different than information is because it has too much ambiguity when it is out of context. Data is like a quantum wave: it has many possibilities one of which is ‘collapsed’ into reality when you add context. Metadata is not a type of data, any more than attributes, properties or associations are a type of information. These are simply conventions to indicate the role that information is playing in a given circumstance.
Your Michelle Davis example is a good illustration: Without context, that string could be any number of individuals, so I consider it data. Give it a unique identifier and classify it as a digital representation in the class of Person, however, and we have information. If I then have Michelle add attributes to her personal record (like sex, age, etc.), and assuming that these are likewise identified and classed, now Michelle is part of a set, or relation. Note that it is bad practice (and consequently the cause of many information management headaches) to use data instead of information. Ambiguity kills. Now, if I were to use Michelle’s name in a Subject Matter Expert field as proof of the validity of a digital asset, or in the Author field as an attribute, her information does not *become* metadata or an attribute: it is still information. It is merely being used differently.
In other words, in my world while the terms ‘data’ and ‘information’ are classified as concepts, the terms ‘metadata’, ‘attribute’ and ‘property’ are classified as roles to which instances of those concepts (well, one of them anyway) can be put, i.e., they are fit for purpose. This separation of the identity and class of the string from the purpose to which it is being assigned has produced very solid results for me.”
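The separation described in this comment, identity and class on one side and role on the other, can be pictured with a minimal Python sketch. All of the names here (`Information`, `Role`, `RoleAssignment`, and the identifier `person-001`) are hypothetical choices of mine for illustration, not the commenter's actual model.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    """Roles that a piece of information can play in a given circumstance."""
    ATTRIBUTE = "attribute"
    METADATA = "metadata"
    PROPERTY = "property"


@dataclass(frozen=True)
class Information:
    """A string made unambiguous by a unique identifier and a class."""
    identifier: str
    cls: str
    value: str


@dataclass(frozen=True)
class RoleAssignment:
    """Records how one piece of information is being used in some field."""
    information: Information
    role: Role
    field_name: str


# "person-001" is an invented identifier, used only for this sketch.
michelle = Information(identifier="person-001", cls="Person", value="Michelle Davis")

# The same information is put to two different uses.
as_author = RoleAssignment(michelle, Role.ATTRIBUTE, "Author")
as_expert = RoleAssignment(michelle, Role.METADATA, "Subject Matter Expert")

# The role differs, but the underlying information is the same object.
print(as_author.information is as_expert.information)
```

Because the role lives on the assignment and not on the record, using Michelle's name in a new field never changes what her information is, which is the point the commenter makes about it not becoming metadata.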
Thanks for giving your comments
Thank you very much for giving your comments and sharing your perspectives with our collablogaunity. This entry in the series highlighted commendable comments on OCDQ Blog posts published between July and November of 2011.
Since there have been so many commendable comments, please don’t be offended if one of your comments wasn’t featured.
Please keep on commenting and stay tuned for future entries in the series.
Thank you for reading the Obsessive-Compulsive Data Quality (OCDQ) blog. Your readership is deeply appreciated.