Commendable Comments (Part 11)

This Thursday is Thanksgiving Day, which in the United States is a holiday with a long, varied, and debated history.  However, the most consistent themes remain family and friends gathering together to share a large meal and express their gratitude.

This is the eleventh entry in my ongoing series for expressing my gratitude to my readers for their commendable comments on my blog posts.  Receiving comments is the most rewarding aspect of my blogging experience because not only do comments greatly improve the quality of my blog, they also help me better appreciate the difference between what I know and what I only think I know.  That is why, although I am truly grateful to all of my readers, I am most grateful to my commenting readers.

 

Commendable Comments

On The Stakeholder’s Dilemma, Gwen Thomas commented:

“Recently got to listen in on a ‘cooperate or not’ discussion.  (Not my clients.) What struck me was that the people advocating cooperation were big-picture people (from architecture and process) while those who just wanted what they wanted were more concerned about their own short-term gains than about system health.  No surprise, right?

But what was interesting was that they were clearly looking after their own careers, and not their silos’ interests.  I think we who help focus and frame the Stakeholder’s Dilemma situations need to be better prepared to address the individual people involved, and not just the organizational roles they represent.”

On Data, Information, and Knowledge Management, Frank Harland commented:

“As always, an intriguing post, especially where you draw a parallel between Data Governance and Knowledge Management (wisdom management?).  We sometimes portray data management (the current term) as ‘well-managed data administration’ (the term from the 70s-80s).  As for the debate on ‘data’ and ‘information,’ I prefer to see everything written, drawn, and/or stored on paper or in digital format as data with varying levels of informational value, depending on the amount and quality of metadata surrounding the data item and on the accessibility and usefulness (quality) of that item.

For example, 12024561414 is a number with low informational value.  I could add metadata, for instance ‘Phone number,’ which makes it potentially known as a phone number.  Rather than let you find out whose number it is, we could add more informational value with more metadata, like ‘White House Switchboard.’  Accessibility could be enhanced by improving the formatting: (1) 202-456-1414.

What I am trying to say with this example is that data items should be placed on a rising scale of informational value rather than be put on steps or firm levels of informational value.  So the Information Hierarchy provided by Professor Larson does not work very well for me.  It could work only if, for all data items, the exact informational value were determined for every probable context.  This model is useful for communication purposes.”
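To make Frank’s phone number example concrete, here is a minimal sketch in Python (the dictionary fields and the informational value score are purely illustrative assumptions, not anything from Frank’s comment) of how layering metadata and formatting onto the same raw value nudges it up that rising scale of informational value:

```python
# A minimal sketch (hypothetical field names and scoring) of Frank's point:
# the same raw value gains informational value as metadata and formatting
# are layered on top of it.

raw_value = "12024561414"                      # low informational value: just digits

data_item = {
    "value": raw_value,
    "metadata": {},                            # no context yet
}

# Add metadata: now the value is potentially known as a phone number.
data_item["metadata"]["type"] = "phone number"

# Add more metadata: now we also know whose number it is.
data_item["metadata"]["subscriber"] = "White House Switchboard"

# Improve accessibility with formatting.
data_item["formatted"] = "(1) 202-456-1414"

# A purely illustrative rising scale of informational value, rather than
# fixed hierarchy levels: each enrichment nudges the score upward.
informational_value = 1 + len(data_item["metadata"]) + ("formatted" in data_item)
print(data_item["formatted"], "- informational value:", informational_value)
```

The point of the sketch is simply that value accrues gradually with each enrichment, rather than the item jumping from one firm level of an information hierarchy to the next.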

On Plato’s Data, Peter Perera commented:

“‘erised stra ehru oyt ube cafru oyt on wohsi.’

To all Harry Potter fans this translates to: ‘I show not your face but your heart’s desire.’

It refers to The Mirror of Erised.  It does not reflect reality but what you desire. (Erised is Desired spelled backwards.)  Often data will cast a reflection of what people want to see.

‘Dumbledore cautions Harry that the mirror gives neither knowledge nor truth and that men have wasted away before it, entranced by what they see.’  How many systems are really Mirrors of Erised?”

On Plato’s Data, Larisa Bedgood commented:

“Because the prisoners in the cave are chained and unable to turn their heads to see what goes on behind them, they perceive the shadows as reality.  They perceive imperfect reflections of truth and reality.

Bringing the allegory to modern times, this serves as a good reminder that companies MUST embrace data quality for an accurate and REAL view of customers, business initiatives, prospects, and so on.  Continuing to view half-truths based on possibly faulty data and information means you are just lost in a dark cave!

I also like the comparison to the Mirror of Erised.  One of my favorite movies is The Matrix, in which there are also a lot of parallels to Plato’s Cave Allegory.  As Morpheus says to Neo: ‘That you are a slave, Neo.  Like everyone else you were born into bondage.  Into a prison that you cannot taste or see or touch.  A prison for your mind.’  Once Neo escapes the Matrix, he discovers that his whole life was based on shadows of the truth.

Plato, Harry Potter, and Morpheus — I’d love to hear a discussion between the three of them in a cave!”

On Plato’s Data, John Owens commented:

“It is true that data is only a reflection of reality but that is also true of anything that we perceive with our senses.  When the prisoners in the cave turn around, what they perceive with their eyes in the visible spectrum is only a very narrow slice of what is actually there.  Even the ‘solid’ objects they see, and can indeed touch, are actually composed of 99% empty space.

The questions that need to be asked and answered about the essence of data quality are far less esoteric than many would have us believe.  They can be very simple, without being simplistic.  Indeed simplicity can be seen as a cornerstone of true data quality.  If you cannot identify the underlying simplicity that lies at the heart of data quality you can never achieve it.  Simple questions are the most powerful.  Questions like, ‘In our world (i.e., the enterprise in question) what is it that we need to know about (for example) a Sale that will enable us to operate successfully and meet all of our goals and objectives?’  If the enterprise cannot answer such simple questions then it is in trouble.  Making the questions more complicated will not take the enterprise any closer to where it needs to be.  Rather it will completely obscure the goal.

Data quality is rather like a ‘magic trick’ done by a magician.  Until you know how it is done, it appears to be an unfathomable mystery.  Once you find out that it is merely an illusion, the reality is absolutely simple and, in fact, rather mundane.  But perhaps that is why so many practitioners perpetuate the illusion.  It is not for self-gain.  They just don’t want to tell the world that, when it comes to data quality, there is no Tooth Fairy, no Easter Bunny, and no Santa Claus.  It’s sad, but true.  Data quality is boringly simple!”

On Plato’s Data, Peter Benson commented:

“Actually, I would go substantially further.  Data was originally no more than a representation of the real world, and if validation was required, the real world was the ‘authoritative source’ — but that is clearly no longer the case.  Data is in fact the new reality!

Data is now used to track everything, and if the data is wrong, the real-world item disappears.  It may have really been destroyed, or it may simply be lost, but it does not matter: if the data does not provide evidence of its existence, then it does not exist.  If you doubt this, just think of money: how much you have is not based on any physical object but on data.

By the way the theoretical definition I use for data is as follows:

Datum — a disruption in a continuum.

The practical definition I use for data is as follows:

Data — elements into which information is transformed so that it can be stored or moved.”

On Data Governance and the Adjacent Possible, Paul Erb commented:

“We can see that there’s a trench between those who think adjacent means out of scope and those who think it means opportunity.  Great leaders know that good stories make for better governance for an organization that needs to adapt and evolve, but stay true to its mission. Built from, but not about, real facts, good fictions are broadly true without being specifically true, and therefore they carry well to adjacent business processes where their truths can be applied to making improvements.

On the other hand, if it weren’t for nonfiction — accounts of real markets and processes — there would be nothing for the POSSIBLE to be adjacent TO.  Managers often have trouble with this because they feel called to manage the facts, and call anything else an airy-fairy waste of time.

So a data governance program needs to assert whether its purpose is to fix the status quo only, or to fix the status quo in order to create agility to move into new areas when needed.  Each of these should have its own business case and related budgets and thresholds (tolerances) in the project plan.  And it needs to choose its sponsorship and data quality players accordingly.”

On You Say Potato and I Say Tater Tot, John O’Gorman commented:

“I’ve been working on a definitive solution for the data / information / metadata / attributes / properties knot for a while now and I think I have it figured out.

I read your blog entitled The Semantic Future of MDM, and we share the same philosophy even while we differ a bit on the details.  Here goes.  It’s all information.  Good, bad, reliable or not, the argument over whether data is information or vice versa is not helpful.  The reason data seems different from information is that it has too much ambiguity when it is out of context.  Data is like a quantum wave: it has many possibilities, one of which is ‘collapsed’ into reality when you add context.  Metadata is not a type of data, any more than attributes, properties or associations are a type of information.  These are simply conventions to indicate the role that information is playing in a given circumstance.

Your Michelle Davis example is a good illustration: without context, that string could be any number of individuals, so I consider it data.  Give it a unique identifier and classify it as a digital representation in the class of Person, however, and we have information.  If I then have Michelle add attributes to her personal record — like sex, age, etc. — and assuming that these are likewise identified and classed, now Michelle is part of a set, or relation.  Note that it is bad practice — and consequently the cause of many information management headaches — to use data instead of information.  Ambiguity kills.  Now, if I were to use Michelle’s name in a Subject Matter Expert field as proof of the validity of a digital asset, or in the Author field as an attribute, her information does not *become* metadata or an attribute: it is still information.  It is merely being used differently.

In other words, in my world while the terms ‘data’ and ‘information’ are classified as concepts, the terms ‘metadata’, ‘attribute’ and ‘property’ are classified as roles to which instances of those concepts (well, one of them anyway) can be put, i.e., they are fit for purpose.  This separation of the identity and class of the string from the purpose to which it is being assigned has produced very solid results for me.”
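John’s separation of identity and class from role is easy to sketch in code.  Here is a minimal illustration in Python (the class and field names are my own hypothetical choices, not John’s) of an information item whose identity and class stay fixed while the roles it plays vary by context:

```python
# A minimal sketch (hypothetical classes) of separating what a string is
# (its identity and class) from the role it plays in a given context.

from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class InformationItem:
    """A string plus the context that collapses its ambiguity."""
    value: str                                   # e.g., "Michelle Davis"
    concept_class: str                           # e.g., "Person"
    identifier: str = field(default_factory=lambda: str(uuid4()))


@dataclass
class RoleAssignment:
    """The same information item, used in different roles."""
    item: InformationItem
    role: str                                    # e.g., "Author", "Subject Matter Expert"


michelle = InformationItem(value="Michelle Davis", concept_class="Person")

# Using Michelle's name as an author or as a subject matter expert does not
# turn it into "metadata" or an "attribute": it is still the same information,
# merely being used differently.
usages = [
    RoleAssignment(item=michelle, role="Author"),
    RoleAssignment(item=michelle, role="Subject Matter Expert"),
]

for usage in usages:
    print(usage.item.identifier, usage.item.value, "->", usage.role)
```

The identifier and class never change as the roles are assigned, which is the separation John describes producing such solid results.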

Thank You for Your Comments

Thank you very much for giving your comments and sharing your perspectives with our collablogaunity.  This entry in the series highlighted commendable comments on OCDQ Blog posts published between July and November of 2011.

Since there have been so many commendable comments, please don’t be offended if one of your comments wasn’t featured.

Please keep on commenting and stay tuned for future entries in the series.

Thank you for reading the Obsessive-Compulsive Data Quality (OCDQ) blog.  Your readership is deeply appreciated.

 

Related Posts

Commendable Comments (Part 10) – The 300th OCDQ Blog Post

730 Days and 264 Blog Posts Later – The Second Blogiversary of OCDQ Blog

OCDQ Blog Bicentennial – The 200th OCDQ Blog Post

Commendable Comments (Part 9)

Commendable Comments (Part 8)

Commendable Comments (Part 7)

Commendable Comments (Part 6)

Commendable Comments (Part 5) – The 100th OCDQ Blog Post

Commendable Comments (Part 4)

Commendable Comments (Part 3)

Commendable Comments (Part 2)

Commendable Comments (Part 1)

The Speed of Decision

In a previous post, I used the Large Hadron Collider as a metaphor for big data and big analytics, in which the creative destruction caused by high-velocity collisions of large volumes of varying data attempts to reveal the elementary particles of business intelligence.

Since recent scientific experiments have sparked discussion about the possibility of exceeding the speed of light, in this blog post I examine whether it’s possible to exceed the speed of decision (i.e., the constraints that time puts on data-driven decision making).

 

Is Decision Speed more important than Data Quality?

In my blog post Thaler’s Apples and Data Quality Oranges, I explained how time-inconsistent data quality preferences within business intelligence reflect the reality that with the speed at which things change these days, more near-real-time operational business decisions are required, which sometimes makes decision speed more important than data quality.

Even though advancements in computational power, network bandwidth, parallel processing frameworks (e.g., MapReduce), scalable and distributed models (e.g., cloud computing), and other techniques (e.g., in-memory computing) are making real-time data-driven decisions more technologically possible than ever before, as I explained in my blog post Satisficing Data Quality, data-driven decision making often has to contend with the practical trade-offs between correct answers and timely answers.

Although we can’t afford to completely sacrifice data quality for faster business decisions, and although high-quality data is obviously preferable to poor-quality data, less-than-perfect data quality cannot be used as an excuse to delay making a critical decision.
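As a small illustration of that trade-off, here is a minimal Python sketch (the data, time budget, and sample size are all hypothetical) contrasting a correct answer computed over every record with a timely, good-enough answer estimated from a sample within a fixed time budget:

```python
# A minimal sketch (hypothetical data and thresholds) of the trade-off between
# correct answers and timely answers: an exact aggregate over all records
# versus a good-enough estimate computed from a sample within a time budget.

import random
import time

records = [random.gauss(100.0, 15.0) for _ in range(1_000_000)]  # stand-in data


def exact_average(values):
    """The 'correct' answer: scan everything, however long it takes."""
    return sum(values) / len(values)


def timely_average(values, time_budget_seconds=0.05, sample_size=50_000):
    """A 'fast-enough' answer: sample until the time budget runs out."""
    deadline = time.monotonic() + time_budget_seconds
    sample = []
    while time.monotonic() < deadline and len(sample) < sample_size:
        sample.append(random.choice(values))
    return sum(sample) / len(sample)


print("exact :", round(exact_average(records), 2))
print("timely:", round(timely_average(records), 2))
```

The sampled answer is not perfect, but it arrives within the decision window, which is often the constraint that actually matters.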

 

Is Decision Speed more important than Decision Quality?

The increasing demand for real-time data-driven decisions requires us to re-evaluate more than just our data quality thresholds.  In my blog post The Circle of Quality, I explained the connection between data quality and decision quality, and how result quality trumps them both, because an organization’s success is measured by the quality of the business results it produces.

Again, with the speed at which the business world now changes, the reality is that the fear of making a mistake cannot be used as an excuse to delay making a critical decision, which sometimes makes decision speed more important than decision quality.

“Fail faster” has long been hailed as the mantra of business innovation.  It’s not because failure is a laudable business goal, but instead because the faster you can identify your mistakes, the faster you can correct your mistakes.  Of course this requires that you are actually willing to admit you made a mistake.

(As an aside, I often wonder what’s more difficult for an organization to admit: poor data quality or poor decision quality?)

Although good decisions are obviously preferable to bad decisions, we have to acknowledge the fragility of our knowledge and accept that mistake-driven learning is an essential element of efficient and effective data-driven decision making.

Although the speed of decision is not the same type of constant as the speed of light, in our constantly changing business world, the speed of decision represents the constant demand for good-enough data for fast-enough decisions.

 

Related Posts

The Big Data Collider

A Decision Needle in a Data Haystack

The Data-Decision Symphony

Thaler’s Apples and Data Quality Oranges

Satisficing Data Quality

Data Confabulation in Business Intelligence

The Data that Supported the Decision

Data Psychedelicatessen

OCDQ Radio - Big Data and Big Analytics

OCDQ Radio - Good-Enough Data for Fast-Enough Decisions

Data, Information, and Knowledge Management

Data In, Decision Out

The Real Data Value is Business Insight

Is your data complete and accurate, but useless to your business?

The Circle of Quality

The Three Most Important Letters in Data Governance


In his book I Is an Other: The Secret Life of Metaphor and How It Shapes the Way We See the World, James Geary included several examples of the psychological concept of priming.  “Our metaphors prime how we think and act.  This kind of associative priming goes on all the time.  In one study, researchers showed participants pictures of objects characteristic of a business setting: briefcases, boardroom tables, a fountain pen, men’s and women’s suits.  Another group saw pictures of objects—a kite, sheet music, a toothbrush, a telephone—not characteristic of any particular setting.”

“Both groups then had to interpret an ambiguous social situation, which could be described in several different ways.  Those primed by pictures of business-related objects consistently interpreted the situation as more competitive than those who looked at pictures of kites and toothbrushes.”

“This group’s competitive frame of mind asserted itself in a word completion task as well.  Asked to complete fragments such as wa_, _ight, and co_p__tive, the business primes produced words like war, fight, and competitive more often than the control group, eschewing equally plausible alternatives like was, light, and cooperative.”

Communication, collaboration, and change management are arguably the three most critical aspects for implementing a new data governance program successfully.  Since all three aspects are people-centric, we should pay careful attention to how we are priming people to think and act within the context of data governance principles, policies, and procedures.  We could simplify this down to whether we are fostering an environment that primes people for cooperation—or primes people for competition.

Since there are only three letters of difference between the words cooperative and competitive, we could say that these are the three most important letters in data governance.


The Big Data Collider

As I mentioned in a previous post, I am reading the book Where Good Ideas Come From by Steven Johnson, which examines recurring patterns in the history of innovation.  The current chapter that I am reading is dispelling the traditional notion of the eureka effect by explaining that the evolution of ideas, like all evolution, stumbles its way toward the next good idea, which inevitably, and not immediately, leads to a significant breakthrough.

One example is how the encyclopedic book Enquire Within Upon Everything, the first edition of which was published in 1856, influenced a young British scientist, who in his childhood in the 1960s was drawn to the “suggestion of magic in the book’s title, and who spent hours exploring this portal to the world of information, along with the wondrous feeling of exploring an immense trove of data.”  His childhood fascination with data and information influenced a personal project that he started in 1980, which ten years later became a professional project while he was working at the Swiss particle physics lab CERN.

The scientist was Tim Berners-Lee and his now famous project created the World Wide Web.

“Journalists always ask me,” Berners-Lee explained, “what the crucial idea was, or what the singular event was, that allowed the Web to exist one day when it hadn’t the day before.  They are frustrated when I tell them there was no eureka moment.”

“Inventing the World Wide Web involved my growing realization that there was a power in arranging ideas in an unconstrained, web-like way.  And that awareness came to me through precisely that kind of process.”

CERN is famous for its Large Hadron Collider that uses high-velocity particle collisions to explore some of the open questions in physics concerning the basic laws governing the interactions and forces among elementary particles in an attempt to understand the deep structure of space and time, and, in particular, the intersection of quantum mechanics and general relativity.

 

The Big Data Collider

While reading this chapter, I stumbled toward an idea about Big Data, which, as Gartner Research explains, is about more than just data volume, even though the term acknowledges the exponential growth, availability, and use of information in today’s data-rich landscape.  Data variety (i.e., structured, semi-structured, and unstructured data, as well as other types of data such as sensor data) and data velocity (i.e., how fast data is being produced and how fast the data must be processed to meet demand) are also key characteristics of Big Data.

David Loshin’s recent blog post about Hadoop and Big Data provides a straightforward explanation and simple example of using MapReduce for not only processing fast-moving large volumes of various data, but also deriving meaningful insights from it.
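To ground the pattern without requiring Hadoop, here is a minimal single-machine sketch of the MapReduce idea in Python, in which a map step emits key/value pairs, a shuffle groups them by key, and a reduce step aggregates each group (the sample records are hypothetical):

```python
# A minimal, single-machine sketch of the MapReduce pattern (not Hadoop itself):
# a map step emits key/value pairs from each input record, a shuffle groups
# them by key, and a reduce step aggregates each group.

from collections import defaultdict

records = [
    "sensor-7 ok", "sensor-7 ok", "sensor-7 error",
    "sensor-9 ok", "sensor-9 error", "sensor-9 error",
]


def map_step(record):
    """Emit (key, 1) pairs: here, one pair per (sensor, status) observation."""
    sensor, status = record.split()
    yield (sensor, status), 1


def reduce_step(key, values):
    """Aggregate all values emitted for a key: here, a simple count."""
    return key, sum(values)


# Shuffle: group the mapped pairs by key.
groups = defaultdict(list)
for record in records:
    for key, value in map_step(record):
        groups[key].append(value)

# Reduce: collapse each group into a single result.
results = [reduce_step(key, values) for key, values in sorted(groups.items())]
print(results)  # [(('sensor-7', 'error'), 1), (('sensor-7', 'ok'), 2), ...]
```

In a real Hadoop cluster the map and reduce steps run in parallel across many machines, but the shape of the computation is the same.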

My idea was that Big Analytics uses the Big Data Collider to allow large volumes of various data particles to bounce off each other in high-velocity collisions.  Although a common criticism of Big Data is that it contains more noise than signal, smashing data particles together in the Big Data Collider may destroy most of the noise in the collision, allowing the signals that survive that creative destruction to potentially be reduced into an elementary particle of business intelligence.

Admittedly not the greatest metaphor, but as we enquire within data about everything in the Information Age, I thought that it might be useful to share my idea so that it might stumble its way toward the next good idea by colliding with an idea of your own.

 

Related Posts

OCDQ Radio - Big Data and Big Analytics

OCDQ Radio - Good-Enough Data for Fast-Enough Decisions

OCDQ Radio - A Brave New Data World

Data, Information, and Knowledge Management

Thaler’s Apples and Data Quality Oranges

Data Confabulation in Business Intelligence

Data In, Decision Out

The Data-Decision Symphony

The Real Data Value is Business Insight

Is your data complete and accurate, but useless to your business?

Beyond a “Single Version of the Truth”

The General Theory of Data Quality

The Data-Information Continuum

Schrödinger’s Data Quality

Data Governance and the Buttered Cat Paradox

Data Governance and the Adjacent Possible

I am reading the book Where Good Ideas Come From by Steven Johnson, which examines recurring patterns in the history of innovation.  The first pattern Johnson writes about is called the Adjacent Possible, which is a term coined by Stuart Kauffman, and is described as “a kind of shadow future, hovering on the edges of the present state of things, a map of all the ways in which the present can reinvent itself.  Yet it is not an infinite space, or a totally open playing field.  The strange and beautiful truth about the adjacent possible is that its boundaries grow as you explore those boundaries.”

Exploring the adjacent possible is like exploring “a house that magically expands with each door you open.  You begin in a room with four doors, each leading to a new room that you haven’t visited yet.  Those four rooms are the adjacent possible.  But once you open any one of those doors and stroll into that room, three new doors appear, each leading to a brand-new room that you couldn’t have reached from your original starting point.  Keep opening new doors and eventually you’ll have built a palace.”

If it ain’t broke, bricolage it

“If it ain’t broke, don’t fix it” is a common defense of the status quo, which often encourages an environment that stifles innovation and the acceptance of new ideas.  The status quo is like staying in the same familiar and comfortable room and choosing to keep all four of its doors closed.

The change management efforts of data governance often don’t talk about opening one of those existing doors.  Instead, they often broadcast the counterproductive message that everything is so broken we can’t fix it, so we need to destroy our existing house and rebuild it from scratch with brand new rooms — probably with one of those open floor plans without any doors.

Should it really be surprising when this approach to change management is so strongly resisted?

The term bricolage can be defined as making creative and resourceful use of whatever materials are at hand regardless of their original purpose, stringing old parts together to form something radically new, transforming the present into the near future.

“Good ideas are not conjured out of thin air,” explains Johnson, “they are built out of a collection of existing parts.”

The primary reason that the change management efforts of data governance are resisted is that they rely almost exclusively on negative methods—they emphasize broken business and technical processes, as well as bad data-related employee behaviors.

Although these problems exist and are the root cause of some of the organization’s failures, there are also unheralded processes and employees that prevented other problems from happening, which are the root cause of some of the organization’s successes.

It’s important to demonstrate that some data governance policies reflect existing best practices, which helps reduce resistance to change, and so a far more productive change management mantra for data governance is: “If it ain’t broke, bricolage it.”

Data Governance and the Adjacent Possible

As Johnson explains, “in our work lives, in our creative pursuits, in the organizations that employ us, in the communities we inhabit—in all these different environments, we are surrounded by potential new ways of breaking out of our standard routines.”

“The trick is to figure out ways to explore the edges of possibility that surround you.”

Most data governance maturity models describe an organization’s evolution through a series of stages intended to measure its capability and maturity, tendency toward being reactive or proactive, and inclination to be project-oriented or program-oriented.

Johnson suggests that “one way to think about the path of evolution is as a continual exploration of the adjacent possible.”

Perhaps we need to think about the path of data governance evolution as a continual exploration of the adjacent possible, as a never-ending journey which begins by opening that first door, building a palatial data governance program one room at a time.

 

Related Posts

Turning Data Silos into Glass Houses

Although data silos are denounced as inherently bad because they complicate the coordination of enterprise-wide business activities, they are often used to support some of those business activities, so whether data silos are good or bad is a matter of perspective.  For example, data silos are bad when different business units are redundantly storing and maintaining their own private copies of the same data, but data silos are good when they are used to protect sensitive data that should not be shared.

Providing the organization with a single system of record, a single version of the truth, a single view, a golden copy, or a consolidated repository of trusted data has long been the anti-data-silo siren song of enterprise data warehousing (EDW), and more recently, of master data management (MDM).  Although these initiatives can provide significant business value, somewhat ironically, many data silos start with EDW or MDM data that was replicated and customized in order to satisfy the particular needs of an operational project or tactical initiative.  This customized data either becomes obsolete after the conclusion of its project or initiative — or it continues to be used because it is satisfying a business need that EDW and MDM are not.

One of the early goals of a new data governance program should be to provide the organization with a substantially improved view of how it is using its data — including data silos — to support its operational, tactical, and strategic business activities.

Data governance can help the organization catalog existing data sources, build a matrix of data usage and related business processes and technology, identify potential external reference sources to use for data enrichment, as well as help define the metrics that meaningfully measure data quality using business-relevant terminology.
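As a simple illustration of what that transparency might look like, here is a minimal Python sketch (the sources, owners, and business processes are hypothetical) of a data source catalog and a usage matrix that make re-use, redundancy, and unused silos easy to spot:

```python
# A minimal sketch (hypothetical sources and processes) of the transparency
# described above: a small catalog of data sources and a usage matrix relating
# them to business processes, making re-use and redundancy easy to spot.

catalog = {
    "EDW.customer_dim":     {"owner": "BI Team",   "sensitive": False},
    "CRM.contacts":         {"owner": "Sales Ops", "sensitive": False},
    "Finance.customer_cpy": {"owner": "Finance",   "sensitive": True},
}

usage_matrix = {
    "Quarterly revenue reporting": ["EDW.customer_dim", "Finance.customer_cpy"],
    "Campaign targeting":          ["CRM.contacts", "EDW.customer_dim"],
}

# Which sources feed each business process?
for process, sources in usage_matrix.items():
    print(process, "->", ", ".join(sources))

# Which sources are shared across processes (candidates for governed re-use)
# and which are used by only one or none (candidates for review or retirement)?
used = [source for sources in usage_matrix.values() for source in sources]
for source in catalog:
    print(source, "used by", used.count(source), "process(es)")
```

Even a catalog this small turns an opaque silo into a glass house: you can see who owns it, what depends on it, and whether its continued use is business-justified.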

The transparency gained from this combined analysis of the existing data, business, and technology landscape will provide a more comprehensive overview of enterprise data management problems, which will help the organization better evaluate any existing data and technology re-use and redundancies, as well as whether investing in new technology will be necessary.

Data governance can help topple data silos by first turning them into glass houses through transparency, empowering the organization to start throwing stones at those glass houses that must be eliminated.  And when data silos are allowed to persist, they should remain glass houses, clearly illustrating whether or not they have the business-justified reasons for continued use.

 

Related Posts

Data and Process Transparency

The Good Data

The Data Outhouse

Time Silos

Sharing Data

Single Version of the Truth

Beyond a “Single Version of the Truth”

The Quest for the Golden Copy

The Idea of Order in Data

Hell is other people’s data

Information Quality Certified Professional

Information Quality Certified Professional (IQCP) is the new certification program from the IAIDQ.  The application deadline for the next certification exam is October 25, 2011.  For more information about IQCP certification, please refer to the IAIDQ website.

 

Taking the first IQCP exam

A Guest Post written by Gordon Hamilton

I can still remember how galvanized I was by the first email mentions of the IQCP certification and its inaugural examination.  I’d been a member of the IAIDQ for the past year and I saw the first mailings in early February 2011.  It’s funny but my memory of the sequence of events was that I filled out the application for the examination that first night, but going back through my emails I see that I attended several IAIDQ Webinars and followed quite a few discussions on LinkedIn before I finally applied and paid for the exam in mid-March (I still got the early bird discount).

Looking back now, I am wondering why I was so excited about the chance to become certified in data quality.  I know that I had been considering the CBIP and CBAP, from TDWI and IIBA respectively, for more than a year, going so far as to purchase study materials and take some sample exams.  Both the CBIP and CBAP designations fit where my career had been for 20+ years, but the subject areas were now tangential to my focus on information and data quality.

The IQCP certification fit exactly where I hoped my career trajectory was now taking me, so it really did galvanize me to action.

I had been a software and database developer for 20+ years when I caught a bad case of Deming-god worship while contracting at Microsoft in the early 2000s, and it only got worse as I started reading books by Olson, Redman, English, Loshin, John Morris, and Maydanchik on how data quality dovetailed with development methodologies of folks like Kimball and Inmon, which in turn dovetailed with the Lean Six Sigma methods.  I was on the slippery slope to choosing data quality as a career because those gurus of Data Quality, and Quality in general, were explaining, and I was finally starting to understand, why data warehouse projects failed so often, and why the business was often underwhelmed by the information product.

I had 3+ months to study and the resource center on the IAIDQ website had a list of recommended books and articles.  I finally had to live up to my moniker on Twitter of DQStudent.  I already had many of the books recommended by IAIDQ at home but hadn’t read them all yet, so while I waited for Amazon and AbeBooks to send me the books I thought were crucial, I began reading Deming, English, and Loshin.

Of all the books that began arriving on my doorstep, the most memorable was Journey to Data Quality by Richard Wang et al.

That book created a powerful image in my head of the information product “manufactured” by every organization.  That image of the “information product” made the suggestions by the data quality gurus much clearer.  They were showing how to apply quality techniques to the manufacture of Business Intelligence.  The image gave me a framework upon which to hang the other knowledge I was gathering about data quality, so it was easier to keep pushing through the books and articles because each new piece could fit somewhere in that manufacturing process.

I slept well the night before the exam, and gave myself plenty of time to make it to the Castle exam site that afternoon.  I took along several books on data quality, but hardly glanced at them.  Instead I grabbed a quick lunch and then a strong coffee to carry me through the 3-hour exam.  At 50 questions per hour, I was very conscious of how long each question was taking me, and every 10 questions or so I would check to see if I was going to run into time trouble.  It was obvious after 20 questions that I had plenty of time, so I began to get into a groove, finishing the exam 30 minutes early, leaving plenty of time to review any questionable answers.

I found the exam eminently fair, with no tricky question constructions at all, so I didn’t seem to fall into the over-thinking trap that I sometimes do.  Even better, the exam wasn’t the type that drilled deeper and deeper into my knowledge gaps when I missed a question.  Even though I felt confident that I had passed, I’ve got to tell you that the 6 weeks that the IAIDQ took to determine the passing threshold on this inaugural exam and send out passing notifications were the longest 6 weeks I have spent in a long time.  Now that the passing mark is established, they swear that the notifications will be sent out much faster.

I still feel a warm glow as I think back on achieving IQCP certification.  I am proud to say that I am a data quality consultant and I have the certificate proving the depth and breadth of my knowledge.

Gordon Hamilton is a Data Quality, Data Warehouse, and IQCP certified professional whose 30 years of experience in the information business encompass many industries, including government, legal, healthcare, insurance, and financial services.

 

Related Posts

Studying Data Quality

The Blue Box of Information Quality

Data, Information, and Knowledge Management

Are you turning Ugly Data into Cute Information?

The Dichotomy Paradox, Data Quality and Zero Defects

The Data Quality Wager

Aristotle, Data Governance, and Lead Rulers

Data governance requires the coordination of a complex combination of factors, including executive sponsorship, funding, decision rights, arbitration of conflicting priorities, policy definition, policy implementation, data quality remediation, data stewardship, business process optimization, technology enablement, and, perhaps most notably, policy enforcement.

But sometimes this emphasis on enforcing policies makes data governance sound like it’s all about rules.

In their book Practical Wisdom, Barry Schwartz and Kenneth Sharpe use the Nicomachean Ethics of Aristotle as a guide to explain that although rules are important, what is more important is “knowing the proper thing to aim at in any practice, wanting to aim at it, having the skill to figure out how to achieve it in a particular context, and then doing it.”

Aristotle observed the practical wisdom of the craftsmen of his day, including carpenters, shoemakers, blacksmiths, and masons, noting how “their work was not governed by systematically applying rules or following rigid procedures.  The materials they worked with were too irregular, and each task posed new problems.”

“Aristotle was particularly fascinated with how masons used rulers.  A normal straight-edge ruler was of little use to the masons who were carving round columns from slabs of stone and needed to measure the circumference of the columns.”

Unless you bend the ruler.

“Which is exactly what the masons did.  They fashioned a flexible ruler out of lead, a forerunner of today’s tape measure.  For Aristotle, knowing how to bend the rule to fit the circumstance was exactly what practical wisdom was all about.”

Although there’s a tendency to ignore the existing practical wisdom of the organization, successful data governance is not about systematically applying rules or following rigid procedures, precisely because the dynamic challenges faced, and overcome daily, by business analysts, data stewards, technical architects, and others exemplify today’s constantly changing business world.

But this doesn’t mean that effective data governance policies can’t be implemented.  It simply means that instead of focusing on who should lead the way (i.e., top-down or bottom-up), we should focus on what the rules of data governance are made of.

Well-constructed data governance policies are like lead rulers—flexible rules that empower us with an understanding of the principle of the policy, and trust us to figure out how best to enforce the policy in a particular context, how to bend the rule to fit the circumstance.  Aristotle knew this was exactly what practical wisdom was all about—data governance needs practical wisdom.

“Tighter rules and regulations, however necessary, are pale substitutes for wisdom,” concluded Schwartz and Sharpe.  “We need rules to protect us from disaster.  But at the same time, rules without wisdom are blind and at best guarantee mediocrity.”

The Fall Back Recap Show

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

On this episode, I celebrate the autumnal equinox by falling back to look at the Best of OCDQ Radio, including discussions about Data, Information, Business-IT Collaboration, Change Management, Big Analytics, Data Governance, and the Data Revolution.

Thank you for listening to OCDQ Radio.  Your listenership is deeply appreciated.

Special thanks to all OCDQ Radio guests.  If you missed any of their great appearances, check out the full episode list below.

Popular OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • Demystifying Data Science — Guest Melinda Thielbar, a Ph.D. Statistician, discusses what a data scientist does and provides a straightforward explanation of key concepts such as signal-to-noise ratio, uncertainty, and correlation.
  • Data Quality and Big Data — Guest Tom Redman (aka the “Data Doc”) discusses Data Quality and Big Data, including if data quality matters less in larger data sets, and if statistical outliers represent business insights or data quality issues.
  • Demystifying Master Data Management — Guest John Owens explains the three types of data (Transaction, Domain, Master), the four master data entities (Party, Product, Location, Asset), and the Party-Role Relationship, which is where we find many of the terms commonly used to describe the Party master data entity (e.g., Customer, Supplier, Employee).
  • Data Governance Star Wars — Special Guests Rob Karel and Gwen Thomas joined this extended, and Star Wars themed, discussion about how to balance bureaucracy and business agility during the execution of data governance programs.
  • The Johari Window of Data Quality — Guest Martin Doyle discusses helping people better understand their data and assess its business impacts, not just the negative impacts of bad data quality, but also the positive impacts of good data quality.
  • Studying Data Quality — Guest Gordon Hamilton discusses the key concepts from recommended data quality books, including those which he has implemented in his career as a data quality practitioner.

The Blue Box of Information Quality

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

On this episode, Daragh O Brien and I discuss the Blue Box of Information Quality, which is much bigger on the inside, as well as using stories as an analytical tool and change management technique, and why we must never forget that “people are cool.”

Daragh O Brien is one of Ireland’s leading Information Quality and Governance practitioners.  After being born at a young age, Daragh has amassed a wealth of experience in quality information driven business change, from CRM Single View of Customer to Regulatory Compliance, to Governance and the taming of information assets to benefit the bottom line, manage risk, and ensure customer satisfaction.  Daragh O Brien is the Managing Director of Castlebridge Associates, one of Ireland’s leading consulting and training companies in the information quality and information governance space.

Daragh O Brien is a founding member and former Director of Publicity for the IAIDQ, which he is still actively involved with.  He was a member of the team that helped develop the Information Quality Certified Professional (IQCP) certification and he recently became the first person in Ireland to achieve this prestigious certification.

In 2008, Daragh O Brien was awarded a Fellowship of the Irish Computer Society for his work in developing and promoting standards of professionalism in Information Management and Governance.

Daragh O Brien is a regular conference presenter, trainer, blogger, and author with two industry reports published by Ark Group, the most recent of which is The Data Strategy and Governance Toolkit.


Good-Enough Data for Fast-Enough Decisions

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

On this episode, Julie Hunt and I discuss the intersection of data quality and business intelligence, especially the strategy of good-enough data for fast-enough decisions, a necessity for surviving and thriving in the constantly changing business world.

Julie Hunt is an accomplished software industry analyst and business technology strategist, providing market and competitive insights for software vendors.  Julie Hunt has the unique perspective of a hybrid, which means she has extensive experience in the technology, business, and customer/people-oriented aspects of creating, marketing and selling software.  Working in the B2B software industry for more than 25 years, she has hands-on experience for multiple solution spaces including data integration, business intelligence, analytics, content management, and collaboration.  She is also a member of the Boulder BI Brain Trust.

Julie Hunt regularly shares her insights about the software industry on Twitter as well as via her highly recommended blog.


DQ-Tip: “The quality of information is directly related to...”

Data Quality (DQ) Tips is an OCDQ regular segment.  Each DQ-Tip is a clear and concise data quality pearl of wisdom.

“The quality of information is directly related to the value it produces in its application.”

This DQ-Tip is from the excellent book Entity Resolution and Information Quality by John Talburt.

The relationship between data and information, and by extension data quality and information quality, is acknowledged and explored in the book’s second chapter, which includes a brief history of information theory, as well as the origins of many of the phrases frequently used throughout the data/information quality industry, e.g., fitness for use and information product.

Talburt explains that the problem with the fitness-for-use definition for the quality of an information product (IP) is that it “assumes that the expectations of an IP user and the value produced by the IP in its application are both well understood.”

Different users often have different applications for data and information, requiring possibly different versions of the IP, each with a different relative value to the user.  This is why Talburt believes that the quality of information is best defined, not as fitness for use, but instead as the degree to which the information creates value for a user in a particular application.  This allows us to measure the business-driven value of information quality with technology-enabled metrics, which are truly relevant to users.

Talburt believes that casting information quality in terms of business value is essential to gaining management’s endorsement of information quality practices within an organization, and he recommends three keys to success with information quality:

  1. Always relate information quality to business value
  2. Give stakeholders a way to talk about information quality—the vocabulary and concepts
  3. Show them a way to get started on improving information quality—and a vision for sustaining it
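As a small illustration of the first key, here is a minimal Python sketch (the users, applications, and value figures are entirely hypothetical) of measuring information quality as the value an information product creates in each particular application, rather than as a single fitness-for-use score:

```python
# A minimal sketch (hypothetical numbers) of defining information quality as
# the degree to which information creates value for a user in a particular
# application, rather than as a generic "fitness for use" rating.

applications = [
    # (user, application, value produced with the IP, value possible with a perfect IP)
    ("Sales Ops", "territory planning",  78_000, 100_000),
    ("Marketing", "campaign targeting",  45_000,  60_000),
    ("Finance",   "revenue forecasting", 21_000,  70_000),
]

for user, application, realized, potential in applications:
    quality = realized / potential   # value-based quality, relative to this use
    print(f"{user:10s} {application:20s} quality = {quality:.0%}")

# The same information product can score very differently across applications,
# which is why a single fitness-for-use score rarely tells the whole story.
```

Framing the metric in terms of value per application keeps the conversation anchored to business results, which is exactly the endorsement-winning framing Talburt recommends.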

 

Related Posts

The Real Data Value is Business Insight

Is your data complete and accurate, but useless to your business?

The Fourth Law of Data Quality

The Role of Data Quality Monitoring in Data Governance

Data Quality Measurement Matters

Studying Data Quality

DQ-Tip: “Undisputable fact about the value and use of data...”

DQ-Tip: “Data quality tools do not solve data quality problems...”

DQ-Tip: “There is no such thing as data accuracy...”

DQ-Tip: “Data quality is primarily about context not accuracy...”

DQ-Tip: “There is no point in monitoring data quality...”

DQ-Tip: “Don't pass bad data on to the next person...”

DQ-Tip: “...Go talk with the people using the data”

DQ-Tip: “Data quality is about more than just improving your data...”

DQ-Tip: “Start where you are...”