One thread of the comment discussion on my blog post The Metadata Continuum raised the excellent point that the border between data and metadata is important, but sometimes difficult to discern. By extension, we can say the same thing about the border between data and information.
So, in this blog post, I will try to explain the importance of these demarcations using potatoes.
You Say Potato and I Say Potahto
Let’s Call the Whole Thing Off is a song written by George Gershwin and Ira Gershwin, which became famous for its playful lyrics that poke fun at differences in the pronunciation of words, such as “potato” and “potahto.”
Spelling and pronunciation are included in the dictionary definition of a word, which is a good example of one of the many uses of metadata, namely as a label that provides a definition, description, and context for data. Essentially, metadata describes data, and since data is attempting to describe a real-world object, such as a potato, metadata is a further abstraction from reality.
And as we saw with the example of white horses in my blog post The Metadata Crisis, these abstract definitions can also include additional classifications (e.g., there are over 4,000 varieties of potato), which also have to be well defined in order to facilitate clear communication and effective discussion. These levels of abstraction, definition, and classification are essential to our attempts to understand, and do business with, the real world. And this challenge continues even further with information.
You Say Potato and I Say Tater Tot
The difference, and relationship, between data and information is a common debate. Not only do these two terms have varying definitions, but they are often used interchangeably. Common examples include the pairs data quality and information quality, data management and information management, and data governance and information governance.
Some consider this an esoteric debate between data geeks and information nerds, but what is not debated is the importance of understanding how organizations use data and/or information to support their business activities.
Extending my analogy, data is like a potato and information is like a tater tot. In other words, information is one of the many possible specific things that we can make using data, which is why information quality professionals often speak about the information product.
So it’s important to remember that we can’t have a tater tot (information) without a potato (data), and that we can’t have either a tater tot or a potato without having a working definition (metadata) of what a potato is.
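For readers who think better in code, here is a minimal sketch of the analogy in Python. It is only an illustration under my own assumptions: the field names, the `conforms` and `tater_tot_report` helpers, and the figure of 10 grams per tater tot are all hypothetical, not taken from any real system or standard. The metadata is a working definition of a potato record, the data is a list of records checked against that definition, and the information is one specific product made from the data.

```python
# Metadata: a working definition of what a "potato" record is.
# The field names and descriptions are illustrative assumptions.
POTATO_METADATA = {
    "variety": {"type": str, "description": "Name of the potato variety"},
    "weight_grams": {"type": float, "description": "Weight of one potato in grams"},
}


def conforms(record: dict, metadata: dict) -> bool:
    """Return True if the record has exactly the defined fields, each with the defined type."""
    return record.keys() == metadata.keys() and all(
        isinstance(record[name], spec["type"]) for name, spec in metadata.items()
    )


# Data: raw descriptions of real-world potatoes.
potatoes = [
    {"variety": "Russet", "weight_grams": 300.0},
    {"variety": "Russet", "weight_grams": 250.0},
    {"variety": "Yukon Gold", "weight_grams": 150.0},
]


def tater_tot_report(records: list, metadata: dict, grams_per_tot: float = 10.0) -> dict:
    """Information: one of many possible products made from the data.

    grams_per_tot is a made-up figure used only for illustration.
    """
    # Without the definition (metadata) we cannot tell which records are potatoes.
    valid = [r for r in records if conforms(r, metadata)]
    total_grams = sum(r["weight_grams"] for r in valid)
    return {
        "potatoes_used": len(valid),
        "estimated_tots": int(total_grams // grams_per_tot),
    }


report = tater_tot_report(potatoes, POTATO_METADATA)
print(report)
```

Notice that the report cannot be produced without the records, and the records cannot be trusted without the definition, which is the same dependency the analogy describes.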
Let’s Not Call the Whole Thing Data
David Corrigan recently blogged about the importance of the metadata that tracks the lineage of information presented to an end user, and how the root causes of data quality and data governance issues are impossible to discover without this metadata.
Therefore, the lines of demarcation separating metadata, data, and information are not just an esoteric technical debate. These demarcations are foundational to the efficiency and effectiveness of business operations. So, let’s not call the whole thing data.
Let’s acknowledge the continuum formed by the separate, but deeply interrelated, disciplines of metadata, data, and information.