The Best Data Quality Blog Posts of 2010

This year-end review provides summaries of and links to The Best Data Quality Blog Posts of 2010. Please note the following:

For simplicity, “Data Quality” also includes Data Governance, Master Data Management, and Business Intelligence
Intentionally excluded from consideration were my best blog posts of the year — not counting that shameless plug :-)
The Data Roundtable was also excluded since I already published a series about its best 2010 blog posts (see links below)
Selection was based on a pseudo-scientific, quasi-statistical, and proprietary algorithm (i.e., I just picked the ones I liked)
Ordering is based on a pseudo-scientific, quasi-statistical, and proprietary algorithm (i.e., no particular order whatsoever)

The Best Data Quality Blog Posts of 2010

The Quality Gap: Why Being On-Time Isn’t Enough by Jill Dyché – Discusses the all-too-common tendency to emphasize efficiency over effectiveness in enterprise project management, where everything is date-driven and not quality-driven.

What is Data Quality anyway? by Henrik Liliendahl Sørensen – Asks two excellent questions about data quality (which received great comments): “Is data quality an independent discipline?” and “Is data quality an independent technology?”

Data Quality is a DATA issue by Graham Rhind – Expounds on the common discussion about whether data quality is a business issue or a technical issue by explaining that although it can sometimes be either or both, it’s always a data issue.

What’s the Root Cause of Bad Data? by Winston Chen of Kalido – Begins an eight-part series about data governance by examining the lack of transparency and accountability, which is quite often the root cause of poor enterprise data quality.

Who should be accountable for data quality? by Peter Thomas – Discusses how improving data quality will require a cross-functional approach to achieve its goals. The post also generated an excellent comment discussion about accountability.

Bad word?: Data Owner by Henrik Liliendahl Sørensen – Examines how the common data quality terms “data owner” and “data ownership” are used and whether they are truly useful. The post also generated an excellent comment discussion about ownership.

The Data Zoo by Julian Schwarzenbach – Concludes a five-part series about how user behaviors affect information quality. The series is also the basis for an excellent whitepaper available for download, no registration required: The Data Zoo White Paper. After the whitepaper was published, two additional species were added to the Data Zoo: Not Bovvered and The PoD.

Data quality challenges: behavioral inertia and its evil opposite by James Standen – Explains how we need to work with our organization to foster an environment that values process and consistency, but also understands that steady, relentless change to optimize is both needed and valuable, even without perfect cooperation at all times from everyone.

Predictably Poor MetaData Quality by Beth Breidenbach – Examines whether data quality and metadata quality issues stem from the same root source: human behavior. Human behavior is also the solution to these issues, since technology neither causes nor solves these challenges; rather, it is a tool that exacerbates or aids human behavior in either direction.

How Are You Creating a Pull for Data Quality in Your Organization? by Dylan Jones – Provides two brief case studies contrasting the “push” and “pull” approaches to getting the organization engaged in a data quality improvement initiative.

Managing Data Quality in the Face of Organizational Change by David Loshin – Examines the need to plan for the contingency that there will be changes to the organization that may have unwanted impacts on a data quality program.

Change Management and Data Governance by Steve Sarsfield – Discusses the application of the ADKAR (Awareness, Desire, Knowledge, Ability, Reinforcement) model to the change management efforts of a data governance program.

Do your Data Quality Heroes know who they are? by Julian Schwarzenbach – Discusses the importance of recognizing your heroes—the people within your organization who have a more beneficial impact on data quality than their peers.

WANTED: Data Quality Change Agents by Dylan Jones – Explains the key traits required of all data quality change agents, including a positive attitude, a willingness to ask questions, advocacy for innovation, and persuasive evangelism.

The tragedy of anti-data leadership and dataphobia by James Standen – Explains that companies whose leaders understand the importance of data (and data analysis) are going to be running circles around the companies that don’t.

Profound Profiling by Daragh O Brien – Discusses the profound business benefits of data profiling for organizations seeking to manage risk and ensure compliance, including the sage data and information quality advice: “Profile early, profile often.”

A balanced approach to scoring data quality by Phil Wright – Begins an excellent six-part series about building an effective data quality scorecard, which can be used as a vehicle to promote continuous data quality improvement.

When it comes to data quality, it’s the last mile that counts by Michele Goetz on Trillium Software Insights – Explains that data quality is not just about satisfying executive-level reporting and analysis; operational data quality (ODQ) services must also be deployed at the point of data entry, and within and between business applications, to facilitate effective data usage.

The Importance of Scope in Data Quality Efforts by Jill Dyché – Illustrates five levels of delivery that can help you quickly establish the boundaries of your initial data quality project, enabling an incremental approach for your sustained data quality program that builds momentum toward larger success over time.

One small step for IT, one giant step for Data Quality by Jill Wanless – Shares a great diagram useful for getting buy-in from IT resources to begin data management activities at the start of every project, helping make data quality a top priority.

The Myth about a Myth by Henrik Liliendahl Sørensen – Debunks the myth that data quality (and a lot of other things) is all about technology — and it’s certainly no myth that this blog post generated a lengthy discussion in the comments section.

New ERP Hated by All . . . Because No One Cleaned the Data by Peter Aynsley-Hartwell on the Utopia Blog – Discusses the problems caused when no one cleaned the data from the old system(s) before it was loaded into the shiny new ERP system.

Does Enterprise Data Quality need to push the boundary more? by William Sharp – Examines the need for data matching validation of true negative results, i.e., validating that those records not identified as duplicates are truly not duplicates.

Referential Treatment - The Open Source Reference Data Trend by Steve Sarsfield – Examines the growing industry trend of open source reference data and its positive impact on the future of data quality and data enrichment processes.

Definition drift by Graham Rhind – Examines the persistent problems facing attempts to define a consistent terminology within the data quality industry for concepts such as validity versus accuracy, and currency versus timeliness.

Solvency II Standards for Data Quality – Common sense standards for all businesses by Ken O’Connor – Begins a great four-part series which explains how the Solvency II standards are common sense data quality standards, which can enable all organizations, regardless of their industry or region, to achieve complete, appropriate, and accurate data.

Data Quality: A Philosophical Approach to Truth by Beth Breidenbach – Examines how the background, history, and perceptions we bring to a situation, any situation, will impact what we perceive as “truth” in that moment. We don’t have to agree with another’s point of view, but we should at least make an attempt to understand the logic behind it.

Falsehoods Programmers Believe About Names by Patrick McKenzie – Provides an entertaining list of 40 false assumptions about person names—a data domain that has always been (and likely always will be) a very daunting data quality challenge.

Knowledge is the Key to International Data Quality by Holger Wandt of Human Inference – Provides some great examples of a few of the challenges facing data quality tools and business processes when dealing with international data.

What Are Master Data? by Marty Moseley of IBM Initiate – Defines the differences between reference data and master data, providing examples of each. Not surprisingly, this blog post also sparked an excellent discussion within its comments.

Start with the Need, Pain or Problem (Not “The Solution”) by Dan Power – Begins an excellent ten-part series about some of the best practices for successfully implementing master data management (MDM) and data governance.

Is There a Single Version of the Truth? by Robin Bloor on Data Integration Blog – Explores that business intelligence cliché and how master data management (MDM) really seeks to deliver “an unambiguous and data rich version of the truth.”

The Relationship Between MDM and Data Governance by Marty Moseley of IBM Initiate – Examines the relationship between the two most blogged about data industry topics of 2010: master data management (MDM) and data governance.

Data Governance Remains Immature by Rob Karel – Examines the results of several data governance surveys and explains how there is a growing recognition that data governance is not — and should never have been — about the data.

Business Process Management meets Data Quality Management by Ed Wrazen on Trillium Software Insights – Examines whether data governance is the by-product of the colliding worlds of business process management (BPM) and data quality.

Business Rules and Data Rules: What's the Difference? by David Loshin – Discusses an important aspect of data quality management, especially with regard to data governance, examining the relationship between business rules and data rules.

Back to the Future of Data Governance by Marty Moseley of IBM Initiate – Provides links to a superb seven-part series about data governance, which covers critically important concepts such as principles, policies, business rules, and metrics.

The Future – Agile Data-Driven Enterprises by John Schmidt on Informatica Perspectives – Concludes a seven-part series about data as an asset, which examines how successful organizations manage their data as a strategic asset, ensuring that relevant, trusted data can be delivered quickly when, where and how needed to support the changing needs of the business.

A Contrarian’s View of Agile BI and Standing Up for Agile BI (Sort of) by Jill Dyché – These two blog posts are required reading about the utilization of agile methodologies in business intelligence, since, as always, Jill tells it like it really is.

The role of a Business Analyst in the Management of Information by Jill Wanless – Examines how business analysts can help the organization with the management of its information as a corporate asset, e.g., linking data to business objectives.

Data as an Asset by David Pratt – The one where a new guy in the data blogosphere (his blog launched in November 2010) explains that treating data as an asset is all about actively doing things to improve both the quality and usefulness of the data.

Putting your best foot forward: Data Quality Best Practices by William Sharp – Lists 10 essential best practices, which are rightly more focused on establishing a data quality program than on the remediation efforts within the program.

PLEASE NOTE: No offense is intended to any of the great 2010 data quality blog posts not listed above. However, if you feel that I have made a glaring omission, then please feel free to post a comment below and add it to the list. Thanks!

I hope that everyone had a great 2010 and I look forward to seeing all of you around the data quality blogosphere in 2011.

The 2010 Data Quality Blogging All-Stars

Recently Read: May 15, 2010

Recently Read: March 22, 2010

Recently Read: March 6, 2010

Recently Read: January 23, 2010

Additional Resources

From the IAIDQ, read the 2010 issues of the Blog Carnival for Information/Data Quality:

From the Data Roundtable, read the 2010 quarterly review blog series:
