Over the past week, an excellent meme has been making its way around the data quality blogosphere. It all started, as many of the best data quality blogging memes do, with a post written by Henrik Liliendahl Sørensen.
In Turning a Blind Eye to Data Quality, Henrik blogged about how, as data quality practitioners, we are often amazed by the inconvenient truth that our organizations are capable of growing into successful businesses even while turning a blind eye to data quality: ignoring data quality issues and not following the best practices that we advocate.
“The evidence about how poor data quality is costing enterprises huge sums of money has been out there for a long time,” Henrik explained. “But business successes are made over and over again despite bad data. There may be casualties, but the business goals are met anyway. So, poor data quality is just something that makes the fight harder, not impossible.”
As data quality practitioners, we often fail to sell the business benefits of data quality effectively, talking instead only about the negative consequences of not investing in it, which, as Henrik explained, is usually why business leaders turn a blind eye to data quality challenges. Henrik concluded with the recommendation that when we talk with business leaders, we need to focus on “smaller, but tangible, wins where data quality improvement and business efficiency goes hand in hand.”
Is Data Quality a Journey or a Destination?
Henrik’s blog post received excellent comments, which included a debate about whether data quality is a journey or a destination.
Garry Ure responded with his blog post Destination Unknown, in which he explained how “historically the quest for data quality was likened to a journey to convey the concept that you need to continue to work in order to maintain quality.” But Garry also noted that sometimes when an organization does successfully ingrain data quality practices into day-to-day business operations, it can make it seem like data quality is a destination that the organization has finally reached.
Garry concluded that data quality is “just one destination of many on a long and somewhat recursive journey. I think the point is that there is no final destination, instead the journey becomes smoother, quicker, and more pleasant for those traveling.”
Bryan Larkin responded to Garry with the blog post Data Quality: Destinations Known, in which Bryan explained, “data quality should be a series of destinations where short journeys occur on the way to those destinations. The reason is simple. If we make it about one big destination or one big journey, we are not aligning our efforts with business goals.”
In order to do this, Bryan recommends that “we must identify specific projects that have tangible business benefits (directly to the bottom line — at least to begin with) that are quickly realized. This means we are looking at less of a smooth journey and more of a sprint to a destination — to tackle a specific problem and show results in a short amount of time. Most likely we’ll have a series of these sprints to destinations with little time to enjoy the journey.”
“While comprehensive data quality initiatives,” Bryan concluded, “are things we as practitioners want to see — in fact we build our world view around such — most enterprises (not all, mind you) are less interested in big initiatives and more interested in finite, specific, short projects that show results. If we can get a series of these lined up, we can think of them more in terms of an overall comprehensive plan if we like — even a journey. But most functional business staff will think of them in terms of the specific projects that affect them.”
The Latin phrase Quo Vadimus? translates into English as “Where are we going?” When I ponder where data quality is going, and whether data quality is a journey or a destination, I am reminded of the words of T.S. Eliot:
“We must not cease from exploration and the end of all our exploring will be to arrive where we began and to know the place for the first time.”
We must not cease from exploring new ways to continuously improve our data quality and continuously put into practice our data governance principles, policies, and procedures, and the end of all our exploring will be to arrive where we began and to know, perhaps for the first time, the value of high-quality data to our enterprise’s continuing journey toward business success.
In my previous post, I took a slightly controversial stance on a popular three-word phrase — Root Cause Analysis. In this post, it’s another popular three-word phrase — Return on Investment (most commonly abbreviated as the acronym ROI).
Zero. Zip. Zilch. Intet. Ingenting. Rien. Nada. Nothing. Nichts. Niets. Null. Niente. Bupkis.
There is No Such Thing as the ROI of purchasing a data quality tool or launching a data governance program.
Before you hire “The Butcher” to eliminate me for being The Man Who Knew Too Little about ROI, please allow me to explain.
Returns only come from Investments
Although the reason that you likely purchased a data quality tool is because you have business-critical data quality problems, simply purchasing a tool is not an investment (unless you believe in Magic Beans) since the tool itself is not a solution.
You use tools to build, test, implement, and maintain solutions. For example, I spent several hundred dollars on new power tools last year for a home improvement project. However, I haven’t received any return on my home improvement investment for a simple reason — I still haven’t even taken most of the tools out of their packaging yet. In other words, I barely even started my home improvement project. It is precisely because I haven’t invested any time and effort that I haven’t seen any returns. And it certainly isn’t going to help me (although it would help Home Depot) if I believed buying even more new tools was the answer.
Although the reason that you likely launched a data governance program is because you have complex issues involving the intersection of data, business processes, technology, and people, simply launching a data governance program is not an investment since it does not conjure the three most important letters.
Data is only an Asset if Data is a Currency
In his book UnMarketing, Scott Stratten discusses this within the context of the ROI of social media (a commonly misunderstood aspect of social media strategy), but his insight is just as applicable to any discussion of ROI. “Think of it this way: You wouldn’t open a business bank account and ask to withdraw $5,000 before depositing anything. The banker would think you are a loony.”
Yet, as Stratten explained, people do this all the time in social media by failing to build up what is known as social currency. “You’ve got to invest in something before withdrawing. Investing your social currency means giving your time, your knowledge, and your efforts to that channel before trying to withdraw monetary currency.”
The same logic applies perfectly to data quality and data governance, where we could say it’s the failure to build up what I will call data currency. You’ve got to invest in data before you can ever consider data an asset to your organization. Investing your data currency means giving your time, your knowledge, and your efforts to data quality and data governance before trying to withdraw monetary currency (i.e., before trying to calculate the ROI of a data quality tool or a data governance program).
If you actually want to get a return on your investment, then actually invest in your data. Invest in doing the hard daily work of continuously improving your data quality and putting into practice your data governance principles, policies, and procedures.
Data is only an asset if data is a currency. Invest in your data currency, and you will eventually get a return on your investment.
You only get a return from something you actually invest in.
Gordon Hamilton emailed me with an excellent recommended topic for a data quality blog post:
“It always seems crazy to me that few executives base their ‘corporate wagers’ on the statistical research touted by data quality authors such as Tom Redman, Jack Olson and Larry English that shows that 15-45% of the operating expense of virtually all organizations is WASTED due to data quality issues.
So, if every organization is leaving 15-45% on the table each year, why don’t they do something about it? Philip Crosby says that quality is free, so why do the executives allow the waste to go on and on and on? It seems that if the shareholders actually think about the Data Quality Wager they might wonder why their executives are wasting their shares’ value. A large portion of that 15-45% could all go to the bottom line without a capital investment.
I’m maybe sounding a little vitriolic because I’ve been re-reading Deming’s Out of the Crisis and he has a low regard for North American industry because they won’t move beyond their short-term goals to build a quality organization, let alone implement Deming’s 14 principles or Larry English’s paraphrasing of them in a data quality context.”
The Data Quality Wager
Gordon Hamilton explained in his email that his reference to the Data Quality Wager was an allusion to Pascal’s Wager, but what follows is my rendering of it in a data quality context (i.e., if you don’t like what follows, please yell at me, not Gordon).
Although I agree with Gordon, I also acknowledge that convincing your organization to invest in data quality initiatives can be a hard sell. A common mistake is not framing the investment in data quality initiatives using business language such as mitigated risks, reduced costs, or increased revenue. I also acknowledge the reality of the fiscal calendar effect and how most initiatives increase short-term costs based on the long-term potential of eventually mitigating risks, reducing costs, or increasing revenue.
Short-term increased costs of a data quality initiative can include the purchase of data quality software and its maintenance fees, as well as the professional services needed for training and consulting for installation, configuration, application development, testing, and production implementation. And there are often additional short-term increased costs, both external and internal.
Please note that I am talking about the costs of proactively investing in a data quality initiative before any data quality issues have manifested that would prompt reactively investing in a data cleansing project. Although, either way, the short-term increased costs are the same, I am simply acknowledging the reality that it is always easier for a reactive project to get funding than it is for a proactive program to get funding—and this is obviously not only true for data quality initiatives.
Therefore, the organization has to evaluate the possible outcomes of proactively investing in data quality initiatives while also considering the possible existence of data quality issues (i.e., the existence of tangible business-impacting data quality issues):
- Invest in data quality initiatives + Data quality issues exist = Decreased risks and (eventually) decreased costs
- Invest in data quality initiatives + Data quality issues do not exist = Only increased costs — No ROI
- Do not invest in data quality initiatives + Data quality issues exist = Increased risks and (eventually) increased costs
- Do not invest in data quality initiatives + Data quality issues do not exist = No increased costs and no increased risks
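The four possible outcomes above form a simple 2x2 matrix, much like Pascal’s Wager. Here is a minimal sketch of that matrix in Python; the function name and outcome strings are merely illustrative paraphrases of the four scenarios, not anything from the original discussion:

```python
# A minimal sketch of the Data Quality Wager as a 2x2 outcome matrix.
# The outcome strings paraphrase the four scenarios listed above.

def data_quality_wager(invest: bool, issues_exist: bool) -> str:
    """Return the expected outcome for one cell of the wager matrix."""
    if invest and issues_exist:
        return "decreased risks and (eventually) decreased costs"
    if invest and not issues_exist:
        return "only increased costs -- no ROI"
    if not invest and issues_exist:
        return "increased risks and (eventually) increased costs"
    return "no increased costs and no increased risks"

# Enumerate all four cells of the matrix:
for invest in (True, False):
    for issues_exist in (True, False):
        print(f"invest={invest}, issues exist={issues_exist}: "
              f"{data_quality_wager(invest, issues_exist)}")
```

Laying the wager out this way makes it easy to see why the fourth cell is so tempting: it is the only one that promises neither increased costs nor increased risks, provided the gamble that no data quality issues exist actually pays off.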
Data quality professionals, vendors, and industry analysts all strongly advocate #1 — and all strongly criticize #3. (Additionally, since we believe data quality issues exist, most “orthodox” data quality folks generally refuse to even acknowledge #2 and #4.)
Unfortunately, when advocating #1, we often don’t effectively sell the business benefits of data quality, and when criticizing #3, we often focus too much on the negative aspects of not investing in data quality.
Only #4 “guarantees” neither increased costs nor increased risks by gambling on not investing in data quality initiatives based on the belief that data quality issues do not exist—and, by default, this is how many organizations make the Data Quality Wager.
How is your organization making the Data Quality Wager?
This recent tweet by Andy Bitterer of Gartner Research (and ANALYSTerical) sparked an interesting online discussion, which was vaguely reminiscent of the classic causality dilemma that is commonly stated as “which came first, the chicken or the egg?”
An E-mail from the Edge
On the same day I saw Andy’s tweet, I received an e-mail from a friend and fellow data quality consultant, who had just finished a master data management (MDM) and enterprise data warehouse (EDW) project, which had over 20 customer data sources.
Although he was brought onto the project specifically for data cleansing, he was told from the day of his arrival that because of time constraints, they decided against performing any data cleansing with their recently purchased data quality tool. Instead, they decided to use their data integration tool to simply perform the massive initial load into their new MDM hub and EDW.
But wait—the story gets even better. The very first decision this client made was to purchase a consolidated enterprise application development platform with seamlessly integrated components for data quality, data integration, and master data management.
So long before this client had determined their business need, they decided that they needed to build a new MDM hub and EDW, made a huge investment in an entire platform of technology, then decided to use only the basic data integration functionality.
However, this client was planning to use the real-time data quality and MDM services provided by their very powerful enterprise application development platform to prevent duplicates and any other bad data from entering the system after the initial load.
But, of course, no one on the project team was actually working on configuring any of those services, or even, for that matter, determining the business rules those services would enforce. Maybe the salesperson told them it was as easy as flipping a switch?
My friend, especially after looking at the data, preached that data quality was a critical business need, but he couldn’t convince them, despite taking the initiative to present the results of some quick data profiling, standardization, and data matching used to identify duplicate records within and across their primary data sources, which clearly demonstrated the extent of the poor data quality.
Although this client agreed that they definitely had some serious data issues, they still decided against doing any data cleansing and wanted to just get the data loaded. Maybe they thought they were loading the data into one of those self-healing databases?
The punchline—this client is a financial services institution with a business need to better identify their most valuable customers.
As my friend lamented at the end of his e-mail, why do clients often later ask why these types of projects fail?
Blind Vendor Allegiance
In his recent blog post Blind Vendor Allegiance Trumps Utility, Evan Levy examined this bizarrely common phenomenon of selecting a technology vendor without gathering requirements, reviewing product features, and then determining what tool(s) could best help build solutions for specific business problems—another example of the tool coming before the business need.
Evan was recounting his experiences at a major industry conference on MDM, where people were asking his advice on what MDM vendor to choose, despite admitting “we know we need MDM, but our company hasn’t really decided what MDM is.”
Furthermore, these prospective clients had decided to default their purchasing decision to the technology vendor they already do business with. In other words, “since we’re already a [you can just randomly insert the name of a large technology vendor here] shop, we just thought we’d buy their product—so what do you think of their product?”
“I find this type of question interesting and puzzling,” wrote Evan. “Why would anyone blindly purchase a product because of the vendor, rather than focusing on needs, priorities, and cost metrics? Unless a decision has absolutely no risk or cost, I’m not clear how identifying a vendor before identifying the requirements could possibly have a successful outcome.”
SaaS-y Data Quality on a Cloudy Business Day?
Emerging industry trends like open source, cloud computing, and software as a service (SaaS) are often touted as less expensive than traditional technology, and I have heard some use this angle to justify buying the tool before identifying the business need.
In his recent blog post Cloud Application versus On Premise, Myths and Realities, Michael Fauscette examined the return on investment (ROI) versus total cost of ownership (TCO) argument quite prevalent in the SaaS versus on premise software debate.
“Buying and implementing software to generate some necessary business value is a business decision, not a technology decision,” Michael concluded. “The type of technology needed to meet the business requirements comes after defining the business needs. Each delivery model has advantages and disadvantages financially, technically, and in the context of your business.”
So which came first, the Data Quality Tool or the Business Need?
This question is, of course, absurd because, in every rational theory, the business need should always come first. However, in predictably irrational real-world practice, it remains a classic causality dilemma for data quality related enterprise information initiatives such as data integration, master data management, data warehousing, business intelligence, and data governance.
But sometimes the data quality tool was purchased for an earlier project, and despite what some vendor salespeople may tell you, you don’t always need to buy new technology at the beginning of every new enterprise information initiative.
Whenever you already have the technology in-house before defining your business need (or you have previously decided, often due to financial constraints, that you will need to build a bespoke solution), you still need to avoid technology bias.
Knowing how the technology works can sometimes cause a framing effect where your business need is defined in terms of the technology’s specific functionality, thereby framing the objective as a technical problem instead of a business problem.
Bottom line—your business problem should always be well-defined before any potential technology solution is evaluated.
Data Quality (DQ) View is an OCDQ regular segment. Each DQ-View is a brief video discussion of a data quality key concept.
When you present the business case for your data quality initiative to executive management and other corporate stakeholders, you need to demonstrate that poor data quality is not a myth—it is a real business problem that negatively impacts the quality of decision-critical enterprise information.
But a common mistake when selling the business benefits of data quality is focusing too much on the negative aspects of not investing in data quality. Although you would be telling the truth, nobody may want to believe things are as bad as you claim.
DQ-View: The Cassandra Effect
If you are having trouble viewing this video, then you can watch it on Vimeo by clicking on this link: DQ-View on Vimeo
In his book Purple Cow: Transform Your Business by Being Remarkable, Seth Godin used many interesting case studies of effective marketing. One of them was the United States Postal Service.
“Very few organizations have as timid an audience as the United States Postal Service,” explained Godin. “Dominated by conservative big customers, the Postal Service has a very hard time innovating. The big direct marketers are successful because they’ve figured out how to thrive under the current system. Most individuals are in no hurry to change their mailing habits, either.”
“The majority of new policy initiatives at the Postal Service are either ignored or met with nothing but disdain. But ZIP+4 was a huge success. Within a few years, the Postal Service diffused a new idea, causing a change in billions of address records in thousands of databases. How?”
Doesn’t this daunting challenge sound familiar? An initiative causing a change in billions of records across multiple databases?
Sounds an awful lot like a massive data cleansing project, doesn’t it? If you believe selling the business benefits of data quality, especially on such an epic scale, is easy to do, then stop reading right now—and please publish a blog post about how you did it.
Going Postal on the Business Benefits
Getting back to Godin’s case study, how did the United States Postal Service (USPS) sell the business benefits of ZIP+4?
“First, it was a game-changing innovation,” explains Godin. “ZIP+4 makes it far easier for marketers to target neighborhoods, and much faster and easier to deliver the mail. ZIP+4 offered both dramatically increased speed in delivery and a significantly lower cost for bulk mailers. These benefits made it worth the time it took mailers to pay attention. The cost of ignoring the innovation would be felt immediately on the bottom line.”
Selling the business benefits of data quality (or anything else for that matter) requires defining its return on investment (ROI), which always comes from tangible business impacts, such as mitigated risks, reduced costs, or increased revenue.
Reducing costs was a major selling point for ZIP+4. Additionally, it mitigated some of the risks associated with direct marketing campaigns, such as the ability to target neighborhoods more accurately, as well as reduce delays in postal delivery times.
However, perhaps the most significant selling point was that “the cost of ignoring the innovation would be felt immediately on the bottom line.” In other words, the USPS articulated very well that the cost of doing nothing was very tangible.
The second reason ZIP+4 was a huge success, according to Godin, was that the USPS “wisely singled out a few early adopters. These were individuals in organizations that were technically savvy and were extremely sensitive to both pricing and speed issues. These early adopters were also in a position to sneeze the benefits to other, less astute, mailers.”
Sneezing the benefits is a reference to another Seth Godin book, Unleashing the Ideavirus, where he explains how the most effective business ideas are the ones that spread. Godin uses the term ideavirus to describe an idea that spreads, and the term sneezers to describe the people who spread it.
In my blog post Sneezing Data Quality, I explained that it isn’t easy being sneezy, but true sneezers are the innovators and disruptive agents within an organization. They can be the catalysts for crucial changes in corporate culture.
However, just like with literal sneezing, it can get really annoying if it occurs too frequently.
To sell the business benefits, you need sneezers that will do such an exhilarating job championing the cause of data quality, that they will help cause the very idea of a sustained data quality program to go viral throughout your entire organization, thereby unleashing the Data Quality Ideavirus.
Getting Zippy with it
One of the most common objections to data quality initiatives, and especially data cleansing projects, is that they often produce considerable costs without delivering tangible business impacts and significant ROI.
One of the most common ways to attempt selling the business benefits of data quality is the ROI of removing duplicate records, which, although sometimes significant (when duplicate rates are high) in terms of reduced costs on redundant postal deliveries, doesn’t exactly convince your business stakeholders and financial decision makers of the importance of data quality.
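To make the arithmetic behind this common selling point concrete, here is a minimal sketch of the duplicate-removal savings estimate. Every figure below is invented purely for illustration; the function and its inputs are my assumptions, not numbers from any real project:

```python
# A back-of-the-envelope estimate of the postal cost avoided by removing
# duplicate records: duplicates x mailings per year x cost per piece.
# All figures are made-up illustrations.

def annual_dedup_savings(total_records: int, duplicate_rate: float,
                         mailings_per_year: int, cost_per_piece: float) -> float:
    """Estimate yearly postal cost avoided by removing duplicate records."""
    duplicates = total_records * duplicate_rate
    return duplicates * mailings_per_year * cost_per_piece

# 1,000,000 records with a 5% duplicate rate, mailed 4 times a year
# at $0.50 per piece:
print(annual_dedup_savings(1_000_000, 0.05, 4, 0.50))  # 100000.0
```

Even a six-figure estimate like this one only speaks to reduced mailing costs, which is exactly why, as noted above, it rarely convinces stakeholders of the broader importance of data quality on its own.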
Therefore, it is perhaps somewhat ironic that the USPS story of why ZIP+4 was such a huge success actually provides such a compelling case study for selling the business benefits of data quality.
However, we should all be inspired by “Zippy” (aka “Mr. Zip” – the USPS Zip Code mascot shown at the beginning of this post), and start “getting zippy with it” (not an official USPS slogan) when it comes to selling the business benefits of data quality:
- Define Data Quality ROI using tangible business impacts, such as mitigated risks, reduced costs, or increased revenue
- Articulate the cost of doing nothing (i.e., not investing in data quality) by also using tangible business impacts
- Select a good early adopter and recruit sneezers to Champion the Data Quality Cause by communicating your successes
What other ideas can you think of for getting zippy with it when it comes to selling the business benefits of data quality?
The information technology industry has a great fondness for enterprise-class solutions and TLAs (two or three letter acronyms): ERP (Enterprise Resource Planning), DW (Data Warehousing), BI (Business Intelligence), MDM (Master Data Management), DG (Data Governance), DQ (Data Quality), CDI (Customer Data Integration), CRM (Customer Relationship Management), PIM (Product Information Management), BPM (Business Process Management), etc. — and new TLAs are surely coming soon.
But there is one TLA to rule them all, one TLA to fund them, one TLA to bring them all and to the business bind them—ROI.
All enterprise-class solutions have one thing in common—they require a significant investment and total cost of ownership.
Most enterprise software/system licenses start in the six figures. Due in large part to vendor consolidation, many are embedded within a consolidated enterprise application development platform with seamlessly integrated components offering an end-to-end solution that pushes the license well into seven figures.
On top of the licensing, you have to add the annual maintenance fees, which are usually in the five figures—sometimes more.
Add to the total cost of the solution the professional services needed for training and consulting for installation, configuration, application development, testing, and production implementation, and you have another six figure annual investment.
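Putting the figures above together, a back-of-the-envelope total cost of ownership calculation looks something like the following sketch. The specific dollar amounts are illustrative mid-range assumptions consistent with the ranges described above, not actual vendor quotes:

```python
# A back-of-the-envelope TCO sketch: one-time license fee plus recurring
# annual maintenance and professional services over a number of years.
# All dollar amounts are illustrative assumptions.

def total_cost_of_ownership(license_fee: float, annual_maintenance: float,
                            annual_services: float, years: int) -> float:
    """One-time license plus recurring maintenance and services over `years`."""
    return license_fee + years * (annual_maintenance + annual_services)

# e.g., a $500k license, $50k/year maintenance, $250k/year professional
# services, evaluated over a 3-year horizon:
print(total_cost_of_ownership(500_000, 50_000, 250_000, 3))  # 1400000.0
```

Even with these deliberately conservative assumptions, the three-year commitment lands well into seven figures, which frames the question that follows.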
With such a significant investment and total cost of ownership required, can enterprise-class solutions ever deliver ROI?
Should I refinance my mortgage?
As a quick (but relevant) tangent, let's use a simple analogy from the world of personal finance.
Similar to most homeowners, I get offers to refinance my mortgage all the time. A common example is an offer that states I can reduce my monthly payments by $200 by refinancing. Sounds great, $200 a month is an annual cost reduction of $2400.
However, this great deal includes $3000 in refinancing costs. Although I start paying $200 less a month immediately, I do not really start saving any money for 15 months, when the monthly “savings” break even with the $3000 in refinancing costs.
Of course, saying only 15 months is ignoring possible tax implications as well as lost interest or returns that I could have earned since the $3000 likely came from either a savings or an investment account.
Additionally, refinancing might not be a good idea if I plan to sell the house in less than 15 months. The $3000 could instead be invested in finishing my basement or repairing minor damages, which could help increase its value and therefore its sales price.
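The refinance arithmetic above is simple enough to sketch directly. As the post itself notes, this naive division ignores taxes and the lost interest or returns on the $3000, so treat it as the simplest possible break-even estimate:

```python
# The naive break-even arithmetic from the refinancing example above:
# months until cumulative monthly savings cover the refinancing cost.
# Ignores taxes and lost interest on the up-front $3000, as noted.

def breakeven_months(refinancing_cost: float, monthly_savings: float) -> float:
    """Months until cumulative monthly savings cover the refinancing cost."""
    return refinancing_cost / monthly_savings

months = breakeven_months(3000, 200)
annual_savings = 200 * 12
print(months)          # 15.0 months before any real savings begin
print(annual_savings)  # 2400 dollars per year in reduced payments
```

The point of the sketch is that the break-even point here is trivially computable, which is precisely what makes the contrast with enterprise-class solutions, where no such clean calculation exists, so striking.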
How does this analogy relate to enterprise-class solutions?
The Business Justification Paradox
Focusing solely on the technical features and ignoring the business benefits of an enterprise-class solution isn’t going to convince either the organization's executive management or its shareholders that the solution is required.
Therefore, emphasis has to be placed on making the business justification, where true ROI can only be achieved through tangible business impacts, such as mitigated risks, reduced costs, or increased revenues.
However, a legitimate business justification for any enterprise-class solution is often relatively easy to make.
The business justification paradox is that although an enterprise-class solution definitely has the long-term future potential to reduce costs, mitigate risks, and increase revenues, in the immediate future (and current fiscal year), it will only increase costs, decrease revenues, and therefore potentially increase risks.
In the mortgage analogy, the break-even point on the opportunity cost of refinancing can be precisely calculated. Is it even possible to accurately estimate the break-even point on the opportunity cost of implementing an enterprise-class solution?
Furthermore, true ROI obviously has to be at least estimated to exceed simply breaking even on the investment.
Given the reality that the longer an initiative takes, the more likely its funding will either be reduced or completely cut, many advocate an agile methodology, which targets iterative cycles quickly delivering small, but tangible value. However, the up-front costs of enterprise licenses and incremental costs of the ongoing efforts and maintenance still loom large on the balance sheet.
Even with “creative” accounting practices, the unquestionably real short-term “ROI high” of following an agile approach could still leave you “chasing the dragon” in search of at least breaking even on your enterprise-class solution's total cost of ownership.
A Call for Debate
My point in this blog post was neither to make the argument that organizations should not invest in enterprise-class solutions, nor to berate organizations for evaluating such possible investments using short-term thinking limited to the current fiscal year.
I am simply trying to encourage an open, honest, and healthy debate about the true ROI of enterprise-class solutions.
I am tired of hearing over-simplifications about how all you need to do is make a valid business justification, as well as attempting to decipher the mystical ROI and total cost of ownership calculations provided by vendors and industry analysts.
I am also tired of being told how emerging industry trends like open source, cloud computing, and software as a service (SaaS) are “less expensive” than traditional approaches. Perhaps that is true, but can they deliver enterprise-class solutions and ROI?
This blog post is a call for debate. Please post a comment. All viewpoints are welcome.