Photo via Flickr by: Leo Reynolds
Like truth, beauty, and singing ability, data quality is in the eyes of the beholder.
Data’s quality is determined by evaluating its fitness for the purpose of use. However, in the vast majority of cases, data has multiple uses, and data of sufficient quality for one use may not be of sufficient quality for other uses.
Therefore, to be more accurate, data quality is in the eyes of the user.
The perspective of the user provides a relative context for data quality. Many argue that an absolute context for data quality also exists, one that is independent of the often conflicting perspectives of different users.
This absolute context is often referred to as a “Single Version of the Truth.”
As one example of the challenges inherent in this key data quality concept, let’s consider whether there is a “Single Version of the Time.”
Single Version of the Time
I am writing this blog post at 10:00 AM. I am using time in a relative context, meaning that from my perspective it is 10 o’clock in the morning. I live in the Central Standard time zone (CST) of the United States.
My friend in Europe would say that I am writing this blog post at 5:00 PM. He is also using time in a relative context, meaning that from his perspective it is 5 o’clock in the afternoon. My friend lives in the Central European time zone (CET).
We could argue that an absolute time exists, as defined by Coordinated Universal Time (UTC). Local times around the world can be expressed as a relative time using positive or negative offsets from UTC. For example, my relative time is UTC-6 and my friend’s relative time is UTC+1. Alternatively, we could use absolute time and say that I am writing this blog post at 16:00 UTC.
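The offset arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the calendar date is an arbitrary placeholder (only the time of day comes from this post), the variable names are made up for the example, and the fixed offsets UTC-6 and UTC+1 ignore daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets taken from the text; daylight saving time is deliberately ignored.
CST = timezone(timedelta(hours=-6), "CST")
CET = timezone(timedelta(hours=1), "CET")

# Assumption: the date is a placeholder; only 10:00 AM comes from the post.
written_local = datetime(2024, 1, 15, 10, 0, tzinfo=CST)

# Express the same instant in the absolute context (UTC)
# and in my friend's relative context (CET).
written_utc = written_local.astimezone(timezone.utc)
written_cet = written_local.astimezone(CET)

print(written_local.strftime("%H:%M %Z"))  # 10:00 CST
print(written_utc.strftime("%H:%M %Z"))    # 16:00 UTC
print(written_cet.strftime("%H:%M %Z"))    # 17:00 CET

# Timezone-aware datetimes compare by instant, so all three are equal.
print(written_local == written_utc == written_cet)  # True
```

The three printed values look different, yet the final comparison treats them as one and the same moment: each is a relative view of a single shared instant.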
Although using an absolute time is an absolute necessity if, for example, my friend and I want to schedule a telephone (or Skype) discussion, it would be confusing to use UTC when referring to events relative to our own local time zones.
In other words, the relative context of the user’s perspective is valid and an absolute context independent of the perspectives of different users is also valid—especially whenever a shared perspective is necessary in order to facilitate dialogue and discussion.
Therefore, instead of calling UTC a Single Version of the Time, we could call it a Shared Version of the Time. And when it comes to the data quality concept of a Single Version of the Truth, perhaps it’s time we started calling it a Shared Version of the Truth.