DQ-BE: Data Quality Airlines
Data Quality By Example (DQ-BE) is a new OCDQ segment that will provide examples of data quality key concepts.
“Good morning sir!” said the smiling gentleman behind the counter—and a little too cheerily for 5 o’clock in the morning. “Welcome to the check-in counter for Data Quality Airlines. My name is Edward. How may I help you today?”
“Good morning Edward,” I replied. “My name is John Smith. I am traveling to Boston today on flight number 221.”
“Thank you for choosing Data Quality Airlines!” responded Edward. “May I please see your driver’s license, passport, or other government issued photo identification so that I can verify your data accuracy?”
As I handed Edward my driver’s license, I explained “it’s an old photograph in which I was clean-shaven, wearing contact lenses, and ten pounds lighter” since I now had a full beard, was wearing glasses, and, to be honest, was actually thirty pounds heavier.
“Oh,” said Edward, his plastic smile morphing into a more believable and stern frown. “I am afraid you are on the No Fly List.”
“Oh, that’s right—because of my name being so common!” I replied while fumbling through my backpack, frantically searching for the piece of paper, which I then handed to Edward. “I’m supposed to give you my Redress Control Number.”
“Actually, you’re supposed to use your Redress Control Number when making your reservation,” Edward retorted.
“In other words,” I replied, while sporting my best plastic smile, “although you couldn’t verify the accuracy of my customer data when I made my reservation on-line last month, you were able to verify the authorization to immediately charge my credit card for the full price of purchasing a non-refundable plane ticket to fly on Data Quality Airlines.”
“I don’t appreciate your sense of humor,” replied Edward. “Everyone at Data Quality Airlines takes accuracy very seriously.”
Edward printed my boarding pass, wrote BCS on it in big letters, handed it to me, and with an even more plastic smile cheerily returning to his face, said: “Please proceed to the security checkpoint. Thank you again for choosing Data Quality Airlines!”
“Boarding pass?” asked the not-at-all smiling woman at the security checkpoint. After I handed her my boarding pass, she said, “And your driver’s license, passport, or other government issued photo identification so that I can verify your data accuracy.”
“I guess my verified data accuracy at the Data Quality Airlines check-in counter must have already expired,” I joked as I handed her my driver’s license. “It’s an old photograph in which I was clean-shaven, wearing contact lenses, and ten pounds lighter.”
The woman silently examined my boarding pass and driver’s license, circled BCS with a magic marker, and then shouted over her shoulder to a group of not-at-all smiling security personnel standing behind her: “Randomly selected security screening!”
One of them, a very large man, stepped toward me as the sound from the snap of the fresh latex glove he had just placed on his very large hand echoed down the long hallway that he was now pointing me toward. “Right this way sir,” he said with a smile.
Ten minutes later, as I slowly walked to the gate for Data Quality Airlines Flight Number 221 to Boston, the thought echoing through my mind was that there is no such thing as data accuracy—there are only verifiable assertions of data accuracy . . .
Related Posts
DQ-Tip: “There is no such thing as data accuracy...”
Why isn’t our data quality worse?
The Real Data Value is Business Insight
Is your data complete and accurate, but useless to your business?
Data Quality and the Cupertino Effect
DQ-Tip: “Data quality is primarily about context not accuracy...”



Jim Harris
Reader Comments (6)
Jim,
It's me again!
Nice that we can continue this discussion in a new blog post :-)
Reading this fairly carefully, I note that none of the data shown is actually shown to be inaccurate. Its use (to get onto a plane) is the issue - and isn't that information quality/use?
What you appear to be illustrating is whether the data belongs to the passenger checking in. The data itself is accurate (in this case) for the person who made the reservation. Or so one would assert.
But my main concern is that the airline is assuming that certain methods of checking data quality (a piece of paper containing a redress control number, whatever that is, and a full body cavity search) can verify data, whilst other resources (an old photograph) do not. If we accept your assertion about asserted data accuracy (which I don't, as you know), it will always be asserted, regardless of how many sources you can verify it against, because each source (bar one) is also using data which is only assertably accurate ....
But actually the real John Smith is in full ownership of the knowledge required to tell Data Quality Airlines which data is accurate, and any which is not. In that case accuracy is actual if John Smith chooses to provide accurate data. That data is no longer asserted.
Or am I missing something? After all, it's pretty early here :-)
Additional:
What this seems to be is an illustration of each company being able to set a level at which it accepts data as being accurate (and therefore of use) by setting validation tests.
This is not a measure of the implicit accuracy of the data, only of a level of the believability accepted within that company.
I don't think we can define "data quality" on this basis of believability (though I know many do) because it means a different definition of quality within every organization and business everywhere. I vote for more meaningful and descriptive definitions and less lumping together of such measures under an all encompassing title "data quality."
You don't mind me using your blog post as a brain storming white board, do you Jim? :-)
After 45 minutes of sadism, er, squash with my trainer, my mind must revert back to assertions of data verity. How about this:
As so often happens, a man is marked as dead through a slopping of coffee on a clerk's keyboard somewhere, and this news cascades down the systems. John Smith, for it is he, only discovers this assertion of his demise himself when he receives a letter from the Social Security department saying "so sorry you're dead. Please pop in at your earliest convenience to discuss the immediate cessation of all payments to you."
John presents himself at the office. "Look, you've got a data quality problem. I'm alive."
Clerk checks computer. "No you're not sir. You died last week. This is verified in 17 systems."
"But I'm alive! Look at me! Touch me! I've even bought some latex gloves in case an internal cavity search is part of your verification procedure!"
Click click click. "Computer says no. Did you have a nice funeral sir ... or madam? Did it stay dry for you?" Clerk smiles, as he learned at customer service classes ...
Now, although one assertion is patently true and one is patently false, they are both assertions. If I turn off my common sense gene I can live with that. But behind both assertions is a fact, and that fact in and of itself has an accurate value.
Sometimes that is asserted correctly, sometimes incorrectly and often with a degree of correctness and/or incorrectness. But how do you verify which assertion is closest to accuracy? Are we to believe John Smith or 17 systems that state he's dead?
And if we can never be sure that an assertion represents a fact accurately, how can it ever be verified?
So if every rendering of a fact or occurrence as data is only an assertion of the accuracy of that fact or occurrence, and as that data is unverifiable, then data may have accuracy - we'll just never know it.
But if I turn my common sense gene back on, we know that there are sources where data can be verified, and once verified, I assert that that data is accurate and not just an assertion of accuracy.
For my own peace of mind I probably shouldn't read your blog posts, Jim ;-)
Hi Graham,
First of all, I definitely do not mind you using my blog post as a brain storming white board—in fact, I encourage it!
Although, I have to ask, when you are using it, does OCDQ become Obsessive-Compulsive Information Quality (OCIQ)?
;-)
Yes, from the airline's perspective, the use of the data describing John Smith could be considered information.
However, in my previous blog post(s), you have stated that accurate data is always fit for any purpose (i.e., for any use), and Paul Boal offered real-world alignment as an alternative definition of data quality.
As you noted, none of the data shown throughout the example is inaccurate.
However, to be obsessive-compulsive about data quality (and not just accuracy), the outdated photograph and missing Redress Control Number—an assigned record identifier from the United States Department of Homeland Security making it easier for people whose names are a false positive on the No Fly List to confirm their identity with airlines at check-in—could be considered data quality issues.
To Paul's point, I cannot think of a better example of real-world alignment than when the real-world entity (in this case, an airline passenger) and the airline's data describing that passenger literally meet in the real world.
John Smith has provided accurate data to Data Quality Airlines, which they have maintained in their databases independent of any purposes/uses of that data.
However, all that accurate data isn't automatically allowing John Smith to board that plane.
So, I guess my question is this: Is Accuracy only a dimension of Information Quality?
Best Regards,
Jim
Is Accuracy only a dimension of Information Quality?
I refer the right honourable gentleman to the graphic on this page: http://grcdi.blogspot.com/2010/07/definition-drift.html
Great post Jim - I chuckled!
I flew last weekend and was actually thinking about this while standing in the molasses-moving TSA line. How DO they verify accuracy? We know there's some sort of SecureFlight list, but where does that data come from? How do we know it's trustworthy? What happens with false positives (or false negatives)?
Also, why do we have to show driver's license/boarding pass both at the entrance to the security line and 20 feet later in front of the x-ray scanners? I always want to ask WHY that's necessary - really, has anyone new hopped into line - but I fear the TSA doesn't tolerate any sort of common sense.