So, absolutely without question, there is no better way to commemorate this milestone than to also make this the 12th entry in my ongoing series expressing my gratitude to my readers for their truly commendable comments on my blog posts.
“I think this helps illustrate that one size does not fit all.
You can’t take a singular approach to how you design for big data. It’s all about identifying relevance and understanding that relevance can change over time.
There are certain situations where it makes sense to leverage all of the data, and now with high-performance computing capabilities that include in-memory, in-DB, and grid, it's possible to build and deploy rich models using all data in a short amount of time. Not only can you leverage rich models, but you can deploy a large number of models that leverage many variables so that you get optimal results.
On the other hand, there are situations where you need to filter out the extraneous information, and the more intelligent you can be about identifying the relevant information, the better.
The traditional approach is to grab the data, cleanse it, and land it somewhere before processing or analyzing the data. We suggest that you leverage analytics up front to determine what data is relevant as it streams in, with relevance based on your organizational knowledge or context. That helps you determine what data should be acted upon immediately, where it should be stored, etc.
And, of course, there are considerations about using visual analytic techniques to help you determine relevance and guide your analysis, but that’s an entire subject just on its own!”
On Data Governance Frameworks are like Jigsaw Puzzles, Gabriel Marcan commented:
“I agree (and like) the jigsaw puzzles metaphor. I would like to make an observation though:
Can you really construct Data Governance one piece at a time?
I would argue you need to put together sets of pieces simultaneously, and to ensure early value, you might want to piece together the interesting / easy pieces first.
Hold on, that sounds like the typical jigsaw strategy anyway . . . :-)”
On Data Governance Frameworks are like Jigsaw Puzzles, Doug Newdick commented:
“I think that there are a number of more general lessons here.
In particular, the description of the issues with data governance sounds very much like the issues with enterprise architecture. In general, there are very few eureka moments in solving the business and IT issues plaguing enterprises. These solutions are usually 10% inspiration, 90% perspiration in my experience. What looks like genius or a sudden breakthrough is usually the result of a lot of hard work.
I also think that there is a wider Myth of the Framework at play too.
The myth is that if we just select the right framework then everything else will fall into place. In reality, the selection of the framework is just the start of the real work that produces the results. Frameworks don’t solve your problems, people solve your problems by the application of brain-power and sweat.
All frameworks do is take care of some of the heavy lifting, i.e., the mundane foundational research and thinking that is not specific to your situation.
Unfortunately, the myth of the framework is why many organizations think that choosing TOGAF will immediately solve their IT issues, and why they are then disappointed when this doesn't happen, when a more sensible approach might have garnered better long-term success.”
“I agree with everything you’ve said, but there’s a much uglier truth about data quality that should also be discussed — the business benefit of NOT having a data quality program.
The unfortunate reality is that in a tight market, the last thing many decision makers want to be made public (internally or externally) is the truth.
In a company with data quality principles ingrained in day-to-day processes, and reporting handled independently, it becomes much harder to hide or reinterpret your falling market share. Without these principles, though, you’ll probably be able to pick your version of the truth from a stack of half a dozen, then spend your strategy meeting discussing which one is right instead of what you’re going to do about it.
What we’re talking about here is the difference between a Politician — who will smile at the camera and proudly announce 0.1% growth was a fantastic result given X, Y, and Z factors — and a Statistician who will endeavor to describe reality with minimal personal bias.
And the larger the organization, the more internal politics plays a part. I believe a lot of the reluctance to invest in data quality initiatives could be traced back to this fear of being held truly accountable, regardless of it being in the best interests of the organization. To build a data quality-centric culture, the change must be driven from the CEO down if it’s to succeed.”
“The question: ‘Is Data Quality a Journey or a Destination?’ suggests that it is one or the other.
I agree with another comment that data quality is neither . . . or, I suppose, it could be both (the journey is the destination, and the destination is the journey; they are one and the same).
The quality of data (or anything for that matter) is something we experience.
Quality only radiates when someone is in the act of experiencing the data, and usually only when it is someone that matters. This radiation decays over time, ranging from seconds or less to years or more.
The only problem with viewing data quality as radiation is that radiation can be measured by an instrument, but there is no such instrument to measure data quality.
We tend to confuse data qualities (which can be measured) and data quality (which cannot).
In the words of someone whose name I cannot recall: ‘Quality is not job one. Being totally %@^#&$*% amazing is job one.’ The only thing I disagree with here is that being amazing is characterized as a ‘job.’
Data quality is not something we ‘do’ to data. It’s not a business initiative or project or job. It’s not a discipline. We need to distinguish between the pursuit (journey) of being amazing and actually being amazing (destination — but certainly not a final one). To be amazing requires someone to be amazed. We want data to be continuously amazing . . . to someone that matters, i.e., someone who uses and values the data a whole lot for an end that makes a material difference.
Come to think of it, the only prerequisite for data quality is being alive because that is the only way to experience it. If you come across some data and have an amazed reaction to it and can make a difference using it, you cannot help but experience great data quality. So if you are amazing people all the time with your data, then you are doing your data quality job very well.”
“Nicely delineated argument, Jim. Successfully starting a data quality program seems to be a balance between getting started somewhere and determining where best to start. The data quality problem is like a two-edged sword without a handle that is inflicting the ‘death of a thousand cuts.’
Data quality is indeed difficult to get ‘a handle on.’”
And since they generated so much great banter, please check out all of the commendable comments received by the blog posts There is No Such Thing as a Root Cause and You only get a Return from something you actually Invest in.
Thank You for Three Awesome Years
You are Awesome — which is why receiving your comments has been the most rewarding aspect of my blogging experience over the last three years. Even if you have never posted a comment, you are still awesome — feel free to tell everyone I said so.
This entry in the series highlighted commendable comments on blog posts published between December 2011 and March 2012.
Since there have been so many commendable comments, please don’t be offended if one of your comments wasn’t featured.
Please continue commenting and stay tuned for future entries in the series.
Thank you for reading the Obsessive-Compulsive Data Quality blog for the last three years. Your readership is deeply appreciated.