Jim Harris

My name is Jim Harris. I am the Blogger-in-Chief of OCDQ Blog and an independent consultant, speaker, and freelance writer for hire.

Thursday, January 27, 2011

The Asymptote of Data Quality

In analytic geometry (according to Wikipedia), an asymptote of a curve is a line such that the distance between the curve and the line approaches zero as they tend to infinity.  The inspiration for my hand-drawn illustration was a similar one (not related to data quality) in the excellent book Linchpin: Are You Indispensable? by Seth Godin, which describes an asymptote as:

“A line that gets closer and closer and closer to perfection, but never quite touches.”

“As you get closer to perfection,” Godin explains, “it gets more and more difficult to improve, and the market values the improvements a little bit less.  Increasing your free-throw percentage from 98 to 99 percent may rank you better in the record books, but it won’t win any more games, and the last 1 percent takes almost as long to achieve as the first 98 percent did.”
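To make the shape of that curve concrete, here is a minimal sketch of diminishing returns. It assumes, purely for illustration, that quality follows an exponential curve, 1 - e^(-rate × effort); the function names, the `rate` value, and the curve itself are hypothetical and come from neither Godin's book nor any measured data quality effort.

```python
import math


def quality(effort_days: float, rate: float = 0.05) -> float:
    """Hypothetical quality level (0.0 to 1.0) after a given amount of effort.

    Assumed model: quality = 1 - exp(-rate * effort). Mathematically the
    curve approaches 1.0 (perfection) without ever touching it.
    """
    return 1.0 - math.exp(-rate * effort_days)


def days_to_reach(target: float, rate: float = 0.05) -> float:
    """Effort needed to reach a target quality level under the assumed model."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1); the asymptote is never reached")
    return -math.log(1.0 - target) / rate


if __name__ == "__main__":
    # Each additional step toward 100% costs more effort than the last.
    for target in (0.90, 0.98, 0.99, 0.999):
        print(f"{target:.1%} quality needs about {days_to_reach(target):.0f} days")
```

Under this assumed model, every halving of the remaining error costs the same extra effort, so the effort per point of improvement keeps growing as quality rises. A real data quality program would need its own measured curve.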

The pursuit of data perfection is a common debate in data quality circles, where it is usually known by the motto:

“The data will always be entered right, the first time, every time.”

However, Henrik Liliendahl Sørensen has cautioned that even when this ideal can be achieved, we must still acknowledge the inconvenient truth that things change. Evan Levy has reminded us that data quality isn’t the same as data perfection, and David Loshin has used the Pareto principle to describe the point of diminishing returns in data quality improvements.

Chasing data perfection can be a powerful motivation, but it can also undermine the best of intentions.  Not only is it important to accept that the Asymptote of Data Quality can never be reached, but we must realize that data perfection was never the goal.

The goal is data-driven solutions for business problems—and these dynamic problems rarely have (or require) a perfect solution.

Data quality practitioners must strive for continuous data quality improvement, but always within the business context of data, and without losing themselves in the pursuit of a data-myopic ideal such as data perfection.

 

Related Posts

To Our Data Perfectionists

The Data-Decision Symphony

Is your data complete and accurate, but useless to your business?

Finding Data Quality

MacGyver: Data Governance and Duct Tape

You Can’t Always Get the Data You Want

What going to the dentist taught me about data quality

A Tale of Two Q’s

Data Quality and The Middle Way

Hyperactive Data Quality (Second Edition)

Missed It By That Much

The Data Quality Goldilocks Zone


Reader Comments (6)

Agreed...I always picture a glowing Grail floating atop the next castle...and I'm always running into rude Frenchmen...:-D

January 26, 2011 | Unregistered Commenter Steve Putman

Too funny, Jim. I was going to write a similar post for the Data Roundtable after reading Daniel Pink's book Drive.

I had a déjà vu moment reading this post.

January 27, 2011 | Unregistered Commenter Phil Simon

Thanks for your comments, Steve and Phil.

@Steve — Yes, we must resist the Grail-shaped beacon that would lead us to Castle Asymptote and be as vigilant as the Knights who say No to Data Perfection :-)

@Phil — Yes, I was tempted to also include a reference to Daniel Pink’s book since he explains the asymptote dilemma well:

“This is the nature of mastery: Mastery is an asymptote.

You can approach it. You can home in on it. You can get really, really, really close to it. But you can never touch it. Mastery is impossible to realize fully.

The mastery asymptote is a source of frustration. Why reach for something you can never fully attain? But it’s also a source of allure. Why not reach for it? The joy is in the pursuit more than the realization. In the end, mastery attracts precisely because mastery eludes.”

January 27, 2011 | Registered Commenter Jim Harris

From the LinkedIn Group for the IAIDQ Open Community, Richard Ordowich commented:

“What are missing on the chart are the relative scales.

The days of effort should be months and perhaps years in order to fit on a chart.

The data quality scale should have the following data points:

Low hanging fruit
Reactive Data Quality
Business process root causes
System and application root causes
Business policy root causes

The volume of errors discovered and corrected will be larger at the outset due to the low hanging fruit syndrome and a reactive approach to quality. Once these errors are addressed, the data quality effort will reach a point of diminishing returns. Once it is discovered that the remaining errors require changes to business policies, processes and systems, the time scale gets larger and data quality improvement results plateau.

It’s not data quality perfection we are trying to achieve but business impact. With data quality, or for that matter any organizational change initiative, the greater the impact on the business the more difficult it is to achieve.”


And I responded:

Excellent points, Richard.

Days was definitely a poor choice for the chart's time increment, and your recommended data points would have been much better than my generic ones.

We are definitely in agreement that business impact (and not data perfection) is the goal we are trying to achieve.

Best Regards,

Jim

January 27, 2011 | Registered Commenter Jim Harris

From the LinkedIn Group for the Data Cleansing User Group, Martin Doyle commented:

“Isn't that why data quality is about fitness for purpose, not perfection?

Correcting 100 things by 1% will generally be better than correcting one thing by 100%.”

And I responded:

Yes, I definitely agree that data quality is about fitness for (business) purpose, and not (data) perfection.

However, I frequently witness data myopia among data quality practitioners, meaning that they are hyper-focused on improving the data in isolation from any business use, as if they believe not only that perfect data is somehow possible, but also that perfect data will somehow be able to satisfy any possible business requirement.

January 28, 2011 | Registered Commenter Jim Harris

From the LinkedIn Group for Data Governance & Stewardship, Mike Wheeler commented:

“In my blog post Don’t Get Caught in Data Governance Traps, I wrote about the classic missteps of chasing quality for the sake of quality. Perhaps Lexus can afford the "relentless pursuit of perfection", but the rest of us need a healthy dose of reality to ensure that we are maximizing the benefit-cost ratio for our business. Without considering the process within which data is being used, you cannot accurately assess its value and therefore unable to set proper quality thresholds.”

February 3, 2011 | Registered Commenter Jim Harris
