Data Confabulation in Business Intelligence

Jarrett Goldfedder recently asked the excellent question: When does Data become Too Much Information (TMI)?

We now live in a 24-hours-a-day, 7-days-a-week, 365-days-a-year worldwide whirlwind of constant information flow, where the very air we breathe seems to teem with digital data streams that continually inundate us with new information.

The challenge is that our time is a zero-sum game, meaning that for every new information source we choose, others are excluded.

There’s no way to acquire all available information.  And even if we somehow could, the limitations of human memory mean we often don’t retain much of the new information we do acquire.  In my blog post Mind the Gap, I wrote about the need to coordinate our acquisition of new information with its timely and practical application.

So I definitely agree with Jarrett that the need to find the right amount of information appropriate for the moment is the needed (and far from easy) solution.  Since this is indeed the age of the data deluge and TMI, I fear that data-driven decision making may simply become intuition-driven decisions validated after the fact by selectively choosing the data that supports the decision already made.  The human mind is already exceptionally good at doing this—the term for it in psychology is confabulation.

Although, according to Wikipedia, the term can describe neurological or psychological dysfunction, Jonathan Haidt explained in his book The Happiness Hypothesis that confabulation is frequently practiced by “normal” people as well.  For example, after buying my new smartphone, I chose to read only the positive online reviews about it, trying to make myself feel more confident that I had made the right decision, and more capable of justifying it beyond saying I bought the phone that looked “cool.”

 

Data Confabulation in Business Intelligence

Data confabulation in business intelligence occurs when intuition-driven business decisions are claimed to be data-driven and justified after the fact using the results of selective post-decision data analysis.  This is even worse than when confirmation bias causes intuition-driven business decisions, which are justified using the results of selective pre-decision data analysis that only confirms preconceptions or favored hypotheses, resulting in potentially bad—albeit data-driven—business decisions.

My fear is that the data deluge will actually increase the use of both of these business decision-making “techniques” because they are much easier than, as Jarrett recommended, trying to make sense of the business world by gathering and sorting through as much data as possible, deriving patterns from the chaos and developing clear-cut, data-driven, data-justifiable business decisions.

But the data deluge generally broadcasts more noise than signal, and sometimes trying to get better data to make better decisions simply means getting more data, which often only delays or confuses the decision-making process, or causes analysis paralysis.

Can we somehow listen for decision-making insights among the cacophony of chaotic and constantly increasing data volumes?

I fear that the information overload of the data deluge is going to trigger an intuition override of data-driven decision making.

 

Related Posts

The Reptilian Anti-Data Brain

Data In, Decision Out

The Data-Decision Symphony

The Real Data Value is Business Insight

Is your data complete and accurate, but useless to your business?

DQ-View: From Data to Decision

TDWI World Conference Orlando 2010

Hell is other people’s data

Mind the Gap

The Fragility of Knowledge

Alternatives to Enterprise Data Quality Tools

The recent analysis by Andy Bitterer of Gartner Research (and ANALYSTerical) about the acquisition of the open source data quality tool DataCleaner by the enterprise data quality vendor Human Inference prompted the following Twitter conversation:

Since enterprise data quality tools can be cost-prohibitive, more prospective customers are exploring free and/or open source alternatives, such as the Talend Open Profiler, licensed under the open source General Public License, or alternatives that are not open source but entirely free, such as the Ataccama DQ Analyzer.  And, as Andy noted in his analysis, both of these tools offer an easy transition to the vendors’ full-fledged commercial data quality tools, which provide more than just data profiling functionality.

As Henrik Liliendahl Sørensen explained in his blog post Data Quality Tools Revealed, data profiling is the technically easiest part of data quality, which explains the tool diversity and the early adoption of free and/or open source alternatives.

And there are also other non-open source alternatives that are more affordable than enterprise data quality tools, such as Datamartist, which combines data profiling and data migration capabilities into an easy-to-use desktop application.
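For readers who have never run one of these tools, here is a minimal sketch (in Python, using only the standard library) of the kind of column-level statistics that data profiling automates: row counts, NULL counts, distinct values, and value pattern frequencies.  The file name and column names are hypothetical, purely for illustration, and real profiling tools obviously go far beyond this.

```python
import csv
import re
from collections import Counter

def value_pattern(value):
    """Generalize a value into a pattern: letters become A, digits become 9."""
    pattern = re.sub(r"[A-Za-z]", "A", value)
    return re.sub(r"[0-9]", "9", pattern)

def profile_column(rows, column):
    """Compute basic profiling statistics for one column."""
    values = [row.get(column, "") or "" for row in rows]
    non_null = [v for v in values if v.strip()]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "top_patterns": Counter(value_pattern(v) for v in non_null).most_common(3),
    }

if __name__ == "__main__":
    # Hypothetical file and column names, for illustration only.
    with open("customers.csv", newline="") as f:
        rows = list(csv.DictReader(f))
    for column in ("customer_id", "postal_code", "phone"):
        print(column, profile_column(rows, column))
```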

My point is neither to discourage the purchase of enterprise data quality tools nor to promote their alternatives.  This blog post is certainly not an endorsement, paid or otherwise, of the alternative data quality tools I have mentioned simply as examples.

My point is that many new technology innovations originate from small entrepreneurial ventures, which tend to be specialists with a narrow focus, and that narrow focus can be a great source of rapid innovation.  This is in contrast to the data management industry trend of innovation via acquisition and consolidation, embedding data quality technology within data management platforms that also provide data integration and master data management (MDM) functionality, allowing the mega-vendors to offer end-to-end solutions and the convenience of one-vendor information technology shopping.

However, most software licenses for these enterprise data management platforms start in the six figures.  On top of the licensing, you have to add the annual maintenance fees, which are usually in the five figures.  Then add the professional services needed for training and for consulting on installation, configuration, application development, testing, and production implementation, and you have another six-figure annual investment.

Debates about free and/or open source software usually focus on the robustness of functionality and the intellectual property of source code.  However, from my perspective, the real reason more prospective customers are exploring these alternatives to enterprise data quality tools is the free aspect, not the open source aspect.

In other words (and once again I am only using it as an example), I might download Talend Open Profiler because I want data profiling functionality at an affordable price, not because I want the opportunity to customize its source code.

I believe the “try it before you buy it” aspect of free and/or open source software is what’s important to prospective customers.

Therefore, enterprise data quality vendors, instead of acquiring an open source tool as Human Inference did with DataCleaner, how about offering a free (with limited functionality) or trial version of your enterprise data quality tool as an alternative option?

 

Related Posts

Do you believe in Magic (Quadrants)?

Can Enterprise-Class Solutions Ever Deliver ROI?

Which came first, the Data Quality Tool or the Business Need?

Selling the Business Benefits of Data Quality

What Data Quality Technology Wants

Has Data Become a Four-Letter Word?

In her excellent blog post 'The Bad Data Ate My Homework' and Other IT Scapegoating, Loraine Lawson explained how “there are a lot of problems that can be blamed on bad data.  I suspect it would be fair to say that there’s a good percentage of problems we don’t even know about that can be blamed on bad data and a lack of data integration, quality and governance.”

Lawson examined whether bad data could have been the cause of the bank foreclosure fiasco, concluding that the more realistic causes were bad business practices and negligence, which, if not addressed, could lead to another global financial crisis.

“Bad data,” Lawson explained, “might be the most ubiquitous excuse since ‘the dog ate my homework.’  But while most of us would laugh at the idea of blaming the dog for missing homework, when someone blames the data, we all nod our heads in sympathy, because we all know how troublesome computers are.  And then the buck gets (unfairly) passed to IT.”

Unfairly blaming IT, or technology in general, when poor data quality negatively impacts business performance ignores the organization’s collective ownership of its problems and its shared responsibility for their solutions.  It also causes, as Lawson explained in Data’s Conundrum: Everybody Wants Control, Nobody Wants Responsibility, an “unresolved conflict on both the business and the IT side over data ownership and its related issues, from stewardship to governance.”

In organizations suffering from this unresolved conflict between IT and the Business—a dysfunctional divide also known as the IT-Business Chasm—bad data becomes the default scapegoat used by both sides.

Perhaps, in a strange way, placing the blame on bad data is progress when compared with the historical notions of data denial, when an organization’s default was to claim that it had no data quality issues whatsoever.

However, admitting not only that bad data exists, but also that it is having a tangible negative impact on business performance, doesn’t seem to have motivated organizations to take action.  Instead, many appear to prefer practicing bad data blamestorming, where the Business blames bad data on IT and its technology, and IT blames bad data on the Business and its business processes.

Or perhaps, by default, everyone just claims that “the bad data ate my homework.”

Are your efforts to convince executive management that data needs to be treated like a five-letter word (“asset”) being undermined by the fact that data has become a four-letter word in your organization?

 

Related Posts

The Business versus IT—Tear down this wall!

Quality and Governance are Beyond the Data

Data In, Decision Out

The Data-Decision Symphony

The Reptilian Anti-Data Brain

Hell is other people’s data

Promoting Poor Data Quality

Who Framed Data Entry?

Data, data everywhere, but where is data quality?

The Circle of Quality

Commendable Comments (Part 9)

Today is February 14 — Valentine’s Day — the annual celebration of enduring romance, where true love is publicly judged according to your willingness to purchase chocolate, roses, and extremely expensive jewelry, and privately judged in ways that nobody (and please, trust me when I say nobody) wants to see you post on Twitter, Facebook, Flickr, YouTube, or your blog.

This is the ninth entry in my ongoing series for expressing my true love to my readers for their truly commendable comments on my blog posts.  Receiving comments is the most rewarding aspect of my blogging experience.  Although I love all of my readers, I love my commenting readers most of all.

 

Commendable Comments

On Data Quality Industry: Problem Solvers or Enablers?, Henrik Liliendahl Sørensen commented:

“I sometimes compare our profession with that of dentists.  Dentists are also believed to advocate for good habits around your teeth, but are making money when these good habits aren’t followed.

So when 4 out 5 dentists recommend a certain toothpaste, it is probably no good :-)

Seriously though, I take the amount of money spent on data quality tools as a sign that organizations believe there are issues best solved with technology.  Of course these tools aren’t magic.

Data quality tools only solve a certain part of your data and information related challenges.  On the other hand, the few problems they do solve may be solved very well and cannot be solved by any other line of products or in any practical way by humans in any quantity or quality.”

On Data Quality Industry: Problem Solvers or Enablers?, Jarrett Goldfedder commented:

“I think that the expectations of clients from their data quality vendors have grown tremendously over the past few years.  This is, of course, in line with most everything in the Web 2.0 cloud world that has become point-and-click, on-demand response.

In the olden days of 2002, I remember clients asking for vendors to adjust data only to the point where dashboard statistics could be presented on a clean Java user interface.  I have noticed that some clients today want the software to not just run customizable reports, but to extract any form of data from any type of database, to perform advanced ETL and calculations with minimal user effort, and to be easy to use.  It’s almost like telling your dentist to fix your crooked teeth with no anesthesia, no braces, no pain, during a single office visit.

Of course, the reality today does not match the expectation, but data quality vendors and architects may need to step up their game to remain cutting edge.”

On Data Quality is not an Act, it is a Habit, Rob Paller commented:

“This immediately reminded me of the practice of Kaizen in the manufacturing industry.  The idea being that continued small improvements yield large improvements in productivity when compounded.

For years now, many of the thought leaders have preached that projects from business intelligence to data quality to MDM to data governance, and so on, start small and that by starting small and focused, they will yield larger benefits when all of the small projects are compounded.

But the one thing that I have not seen it tied back to is the successes that were found in the leaders of the various industries that have adopted the Kaizen philosophy.

Data quality practitioners need to recognize that their success lies in the fundamentals of Kaizen: quality, effort, participation, willingness to change, and communication. The fundamentals put people and process before technology.  In other words, technology may help eliminate the problem, but it is the people and process that allow that elimination to occur.”

On Data Quality is not an Act, it is a Habit, Dylan Jones commented:

“Subtle but immensely important because implementing a coordinated series of small, easily trained habits can add up to a comprehensive data quality program.

In my first data quality role we identified about ten core habits that everyone on the team should adopt and the results were astounding.  No need for big programs, expensive technology, change management and endless communication, just simple, achievable habits that importantly were focused on the workers.

To make habits work they need the WIIFM (What’s In It For Me) factor.”

On Darth Data, Rob Drysdale commented:

“Interesting concept about using data for the wrong purpose.  I think that data, if it is the ‘true’ data can be used for any business decision as long as it is interpreted the right way.

One problem is that data may have a margin of error associated with it and this must be understood in order to properly use it to make decisions.  Another issue is that the underlying definitions may be different.

For example, an organization may use the term ‘customer’ when it means different things.  The marketing department may have a list of ‘customers’ that includes leads and prospects, but the operational department may only call them ‘customers’ when they are generating revenue.

Each department’s data and interpretation of it is correct for their own purpose, but you cannot mix the data or use it in the ‘other’ department to make decisions.

If all the data is correct, the definitions and the rules around capturing it are fully understood, then you should be able to use it to make any business decision.

But when it gets misinterpreted and twisted to suit some business decision that it may not be suited for, then you are crossing over to the Dark Side.”

On Data Governance and the Social Enterprise, Jacqueline Roberts commented:

“My continuous struggle is the chaos of data electronically submitted by many, many sources, different levels of quality and many different formats while maintaining the history of classification, correction, language translation, where used, and a multitude of other ‘data transactions’ to translate this data into usable information for multi-business use and reporting.  This is my definition of Master Data Management.

I chuckled at the description of the ‘rigid business processes’ and I added ‘software products’ to the concept, since the software industry must understand the fluidity of the change of data to address the challenges of Master Data Management, Data Governance, and Data Cleansing.”

On Data Governance and the Social Enterprise, Frank Harland commented: 

“I read: ‘Collaboration is the key to business success. This essential collaboration has to be based on people, and not on rigid business processes . . .’

And I think: Collaboration is the key to any success.  This must have been true since the time man hunted the Mammoth.  When collaborating, it went a lot better to catch the bugger.

And I agree that the collaboration has to be based on people, and not on rigid business processes.  That is as opposed to based on rigid people, and not on flexible business processes. All the truths are in the adjectives.

I don’t mean to bash, Jim, I think there is a lot of truth here and you point to the exact relationship between collaboration as a requirement and Data Governance as a prerequisite.  It’s just me getting a little tired of Gartner saying things of the sort that ‘in order to achieve success, people should work together. . .’

I have a word in mind that starts with ‘du’ and ends with ‘h’ :-)”

On Quality and Governance are Beyond the Data, Milan Kučera commented:

“Quality is a result of people’s work, their responsibility, improvement initiatives, etc.  I think it is more about the company culture and its possible regulation by government.  It is the most complicated to set-up a ‘new’ (information quality) culture, because of its influence on every single employee.  It is about well balanced information value chain and quality processes at every ‘gemba’ where information is created.

Confidence in the information is necessary because we make many decisions based on it.  Sometimes we do better or worse then before.  We should store/use as much accurate information as possible.

All stewardship or governance frameworks should help companies with the change of its culture, define quality measures (the most important is accuracy), cost of poor quality system (allowing them to monitor impacts of poor quality information), and other necessary things.  Only at this moment would we be able to trust corporate information and make decisions.

A small remark on technology only.  Data quality technology is a good tool for helping you to analyze ‘technical’ quality of data – patterns, business rules, frequencies, NULL or Not NULL values, etc.  Many technology companies narrow information quality into an area of massive cleansing (scrap/rework) activities.  They can correct some errors but everything in general leads to a higher validity, but not information accuracy.  If cleansing is implemented as a regular part of the ETL processes then the company institutionalizes massive correction, which is only a cost adding activity and I am sure it is not the right place to change data contents – we increase data inconsistency within information systems.

Every quality management system (for example TQM, TIQM, Six Sigma, Kaizen) focuses on improvement at the place where errors occur – gemba.  All those systems require: leaders, measures, trained people, and simply – adequate culture.

Technology can be a good assistant (helper), but a bad master.”

On Can Data Quality avoid the Dustbin of History?, Vish Agashe commented:

“In a sense, I would say that the current definitions and approaches of/towards data quality might very well not be able to avoid the Dustbin of History.

In the world of phones and PDAs, quality of information about environments, current fashions/trends, locations and current moods of the customer might be more important than a single view of customer or de-duped customers.  The pace at which consumer’s habits are changing, it might be the quality of information about the environment in which the transaction is likely to happen that will be more important than the quality of the post transaction data itself . . . Just a thought.”

On Does your organization have a Calumet Culture?, Garnie Bolling commented:

“So true, so true, so true.

I see this a lot.  Great projects or initiatives start off, collaboration is expected across organizations, and there is initial interest, big meetings / events to jump start the Calumet.  Now what, when the events no longer happen, funding to fly everyone to the same city to bond, share, explore together dries up.

Here is what we have seen work. After the initial kick off, have small events, focus groups, and let the Calumet grow organically. Sometimes after a big powwow, folks assume others are taking care of the communication / collaboration, but with a small venue, it slowly grows.

Success breeds success and folks want to be part of that, so when the focus group achieves, the growth happens.  This cycle is then repeated, hopefully.

While it is important for folks to come together at the kick off to see the big picture, it is the small rolling waves of success that will pick up momentum, and people will want to join the effort to collaborate versus waiting for others to pick up the ball and run.

Thanks for posting, good topic.  Now where is my small focus group? :-)”

You Are Awesome

Thank you very much for sharing your perspectives with our collablogaunity.  This entry in the series highlighted the commendable comments received on OCDQ Blog posts published in October, November, and December of 2010.

Since there have been so many commendable comments, please don’t be offended if one of your comments wasn’t featured.

Please keep on commenting and stay tuned for future entries in the series.

By the way, even if you have never posted a comment on my blog, you are still awesome — feel free to tell everyone I said so.

 

Related Posts

Commendable Comments (Part 8)

Commendable Comments (Part 7)

Commendable Comments (Part 6)

Commendable Comments (Part 5)

Commendable Comments (Part 4)

Commendable Comments (Part 3)

Commendable Comments (Part 2)

Commendable Comments (Part 1)

Spartan Data Quality

My recent Twitter conversation with Dylan Jones, Henrik Liliendahl Sørensen, and Daragh O Brien was sparked by the blog post Case study with Data blogs, from 300 to 1000, which included a list of the top 500 data blogs ranked by influence.

Data Quality Pro was ranked #57, Liliendahl on Data Quality was ranked #87, The DOBlog was a glaring omission, and I was proud OCDQ Blog was ranked #33 – at least until, being the data quality geeks we are, we noticed that it was also ranked #165.

In other words, there was an ironic data quality issue—a data quality blog was listed twice (i.e., a duplicate record in the list)!

Hilarity ensued, including some epic Photoshopping by Daragh, leading, quite inevitably, to the writing of this Data Quality Tale, which is loosely based on the epic movie 300—and perhaps also the epically terrible comedy Meet the Spartans.  Enjoy!

 

Spartan Data Quality

In 1989, an alliance of Data Geeks, led by the Spartans, an unrivaled group of data quality warriors, battled against an invading data deluge in the mountain data center of Thermopylae, caused by the complexities of the Greco-Persian Corporate Merger.

Although they were vastly outnumbered, the Data Geeks overcame epic data quality challenges in one of the most famous enterprise data management initiatives in history—The Data Integration of Thermopylae.

This is their story.

Leonidas, leader of the Spartans, espoused an enterprise data management approach known as Spartan Data Quality, defined by its ethos of collaboration amongst business, data, and technology experts, collectively and affectionately known as Data Geeks.

Therefore, Leonidas was chosen as the Thermopylae Project Lead.  However, Xerxes, the new Greco-Persian CIO, believed that the data integration project was pointless, Spartan Data Quality was a fool’s errand, and the technology-only Persian approach, known as Magic Beans, should be implemented instead.  Xerxes saw the Thermopylae project as an unnecessary sacrifice.

“There will be no glory in your sacrifice,” explained Xerxes.  “I will erase even the memory of Sparta from the database log files!  Every bit and byte of Data Geek tablespace shall be purged.  Every data quality historian and every data blogger shall have their Ethernet cables pulled out, and their network connections cut from the Greco-Persian mainframe.  Why, uttering the very name of Sparta, or Leonidas, will be punishable by employee termination!  The corporate world will never know you existed at all!”

“The corporate world will know,” replied Leonidas, “that Data Geeks stood against a data deluge, that few stood against many, and before this battle was over, a CIO blinded by technology saw what it truly takes to manage data as a corporate asset.”

Addressing his small army of 300 Data Geeks, Leonidas declared: “Gather round!  No retreat, no surrender.  That is Spartan law.  And by Spartan law we will stand and fight.  And together, united by our collaboration, our communication, our transparency, and our trust in each other, we shall overcome this challenge.”

“A new Information Age has begun.  An age of data-driven business decisions, an age of data-empowered consumers, an age of a world connected by a web of linked data.  And all will know, that 300 Data Geeks gave their last breath to defend it!”

“But there will be so many data defects, they will blot out the sun!” exclaimed Xerxes.

“Then we will fight poor data quality in the shade,” Leonidas replied, with a sly smile.

“This is madness!” Xerxes nervously responded as the new servers came on-line in the data center of Thermopylae.

“Madness?  No,” Leonidas calmly said as the first wave of the data deluge descended upon them.  “THIS . . . IS . . . DATA !!!”

 

Related Posts

Pirates of the Computer: The Curse of the Poor Data Quality

Video: Oh, the Data You’ll Show!

The Quest for the Golden Copy (Part 1)

The Quest for the Golden Copy (Part 2)

The Quest for the Golden Copy (Part 3)

The Quest for the Golden Copy (Part 4)

‘Twas Two Weeks Before Christmas

My Own Private Data

The Tell-Tale Data

Data Quality is People!

#FollowFriday Spotlight: @PhilSimon

FollowFriday Spotlight is an OCDQ regular segment highlighting someone you should follow—and not just Fridays on Twitter.


Phil Simon is an independent technology consultant, author, writer, and dynamic public speaker for hire, who focuses on the intersection of business and technology.  Phil is the author of three books (see below for more details) and also writes for a number of technology media outlets and sites, and hosts the podcast Technology Today.

As an independent consultant, Phil helps his clients optimize their use of technology.  Phil has cultivated over forty clients in a wide variety of industries, including health care, manufacturing, retail, education, telecommunications, and the public sector.

When not fiddling with computers, hosting podcasts, putting himself in comics, and writing, Phil enjoys English Bulldogs, tennis, golf, movies that hurt the brain, fantasy football, and progressive rock.  Phil is a particularly zealous fan of Rush, Porcupine Tree, and Dream Theater.  Anyone who reads his blog posts or books will catch many references to these bands.

 

Books by Phil Simon

My review of The New Small:

By leveraging what Phil Simon calls the Five Enablers (Cloud computing, Software-as-a-Service (SaaS), Free and open source software (FOSS), Mobility, Social technologies), small businesses no longer need to have technology as one of their core competencies, nor invest significant time and money in enabling technology, which allows them to focus on their true core competencies and truly compete against companies of all sizes.

The New Small serves as a practical guide to this brave new world of small business.

 

My review of The Next Wave of Technologies:

The constant challenge faced by organizations, large and small, that use technology to support the ongoing management of their decision-critical information is that the world of information technology can never afford to remain static.  Instead, it must dynamically evolve and adapt in order to protect and serve the enterprise’s continuing mission to survive and thrive in today’s highly competitive and rapidly changing marketplace.


The Next Wave of Technologies is required reading if your organization wishes to avoid common mistakes and realize the full potential of new technologies—especially before your competitors do.

 

My review of Why New Systems Fail:

Why New Systems Fail is far from a doom and gloom review of disastrous projects and failed system implementations.  Instead, this book contains numerous examples and compelling case studies, which serve as a very practical guide for how to recognize, and more importantly, overcome the common mistakes that can prevent new systems from being successful.

Phil Simon writes about these complex challenges in a clear and comprehensive style that is easily approachable and applicable to diverse audiences, both academic and professional, as well as readers with either a business or a technical orientation.

 

Blog Posts by Phil Simon

In addition to his great books, Phil is a great blogger.  For example, check out these brilliant blog posts written by Phil Simon:

 

Knights of the Data Roundtable

Phil Simon and I co-host and co-produce the wildly popular podcast Knights of the Data Roundtable, a bi-weekly data management podcast sponsored by the good folks at DataFlux, a SAS Company.

The podcast is a frank and open discussion about data quality, data integration, data governance and all things related to managing data.

 

Related Posts

#FollowFriday Spotlight: @hlsdk

#FollowFriday Spotlight: @DataQualityPro

#FollowFriday and Re-Tweet-Worthiness

#FollowFriday and The Three Tweets

Dilbert, Data Quality, Rabbits, and #FollowFriday

Twitter, Meaningful Conversations, and #FollowFriday

The Fellowship of #FollowFriday

Social Karma (Part 7) – Twitter

DQ-BE: Dear Valued Customer

Data Quality By Example (DQ-BE) is an OCDQ regular segment that provides examples of data quality key concepts.

The term “valued customer” is bandied about quite frequently and is often at the heart of enterprise data management initiatives such as Customer Data Integration (CDI), 360° Customer View, and Customer Master Data Management (MDM).

The role of data quality in these initiatives is an important, but sometimes mistakenly overlooked, consideration.

For example, the Service Contract Renewal Notice (shown above) I recently received exemplifies the impact of poor data quality on Customer Relationship Management (CRM) since one of my service providers wants me—as a valued customer—to purchase a new service contract for one of my laptop computers.

Let’s give them props for generating a 100% accurate residential postal address, since how could I even consider renewing my service contract if I don’t receive the renewal notice in the mail?  Let’s also acknowledge that my Customer ID is 100% accurate, since that is the “unique identifier” under which I have purchased all of my products and services from this company.

However, the biggest data quality mistake is that their “Valued Customer” was addressed as INDEPENDENT CONSULTANT, which is not my name.  (And they get bonus negative points for writing it in ALL CAPS.)

The moral of the story is that if you truly value your customers, then you should truly value your customer data quality.

At the very least—get your customer’s name right.
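As a minimal sketch of the kind of check that could catch this mistake before a mailing goes out, the snippet below flags customer name values that look like placeholders or job titles rather than actual names, as well as values written entirely in capital letters.  The placeholder list and function are my own assumptions, purely for illustration.

```python
# Hypothetical placeholder terms that sometimes end up in customer name fields.
PLACEHOLDER_TERMS = {
    "VALUED CUSTOMER",
    "INDEPENDENT CONSULTANT",
    "UNKNOWN",
    "TEST",
}

def name_quality_issues(name: str) -> list[str]:
    """Return a list of suspected data quality issues for a customer name."""
    issues = []
    cleaned = " ".join(name.split()).strip()
    if not cleaned:
        issues.append("name is blank")
    elif cleaned.upper() in PLACEHOLDER_TERMS:
        issues.append("name looks like a placeholder or job title")
    if cleaned and cleaned.isupper():
        issues.append("name is written in all capital letters")
    return issues

print(name_quality_issues("INDEPENDENT CONSULTANT"))
# ['name looks like a placeholder or job title', 'name is written in all capital letters']
```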

 

Related Posts

Customer Incognita

Identifying Duplicate Customers

Adventures in Data Profiling (Part 7) – Customer Name

The Quest for the Golden Copy (Part 3) – Defining “Customer”

‘Tis the Season for Data Quality

The Seven Year Glitch

DQ-IRL (Data Quality in Real Life)

Data Quality, 50023

Once Upon a Time in the Data

The Semantic Future of MDM

The People Platform

Platforms are popular in enterprise data management.  Most of the time, the term is used to describe a technology platform, an integrated suite of tools that enables the organization to manage its data in support of its business processes.

Other times the term is used to describe a methodology platform, an integrated set of best practices that enables the organization to manage its data as a corporate asset in order to achieve superior business performance.

Data governance is an example of a methodology platform, where one of its central concepts is the definition, implementation, and enforcement of policies, which govern the interactions between business processes, data, technology, and people.

But many rightfully lament the misleading term “data governance” because it appears to put the emphasis on data, arguing that since business needs come first in every organization, data governance should be formalized as a business process, and therefore mature organizations should view data governance as business process management.

However, successful enterprise data management is about much more than data, business processes, or enabling technology.

Business process management, data quality management, and technology management are all people-driven activities because people empowered by high quality data, enabled by technology, optimize business processes for superior business performance.

Data governance policies illustrate the intersection of business, data, and technical knowledge, which is spread throughout the enterprise, transcending any artificial boundaries imposed by an organizational chart, where different departments or different business functions appear as if they were independent of the rest of the organization.

Data governance policies reveal how truly interconnected and interdependent the organization is, and how everything that happens within the organization happens as a result of the interactions occurring among its people.

Michael Fauscette defines people-centricity as “our current social and business progression past the industrial society’s focus on business, technology, and process.  Not that business or technology or process go away, but instead they become supporting structures that facilitate new ways of collaborating and interacting with customers, suppliers, partners, and employees.”

In short, Fauscette believes people are becoming the new enterprise platform—and not just for data management.

I agree, but I would argue that people have always been—and always will be—the only successful enterprise platform.

 

Related Posts

The Collaborative Culture of Data Governance

Data Governance and the Social Enterprise

Connect Four and Data Governance

What Data Quality Technology Wants

Data and Process Transparency

The Business versus IT—Tear down this wall!

Collaboration isn’t Brain Surgery

Trust is not a checklist

Quality and Governance are Beyond the Data

Data Transcendentalism

Podcast: Data Governance is Mission Possible

Video: Declaration of Data Governance

DQ-View: The Poor Data Quality Blizzard

Data Quality (DQ) View is an OCDQ regular segment.  Each DQ-View is a brief video discussion of a data quality key concept.

 

If you are having trouble viewing this video, then you can watch it on Vimeo by clicking on this link: DQ-View on Vimeo

 

Related Posts

DQ-View: New Data Resolutions

DQ-View: From Data to Decision

DQ View: Achieving Data Quality Happiness

Data Quality is not a Magic Trick

DQ-View: The Cassandra Effect

DQ-View: Is Data Quality the Sun?

DQ-View: Designated Asker of Stupid Questions

Video: Oh, the Data You’ll Show!

Data In, Decision Out

This recent blog post by Seth Godin made me think about the data quality adage garbage in, garbage out (aka GIGO).

Since we live in the era of data deluge and information overload, Godin’s question about how much time and effort should be spent on absorbing data and how much time and effort should be invested in producing output is an important one, especially for enterprise data management, where it boils down to how much data should be taken in before a business decision can come out.

In other words, it’s about how much time and effort is invested in the organization’s data in, decision out (i.e., DIDO) process.

And, of course, quality is an important aspect of the DIDO process—both data quality and decision quality.  But, oftentimes, it is an organization’s overwhelming concerns about its GIGO that lead to inefficiencies and ineffectiveness around its DIDO.

How much data is necessary to make an effective business decision?  Having complete (i.e., all available) data seems obviously preferable to incomplete data.  However, with data volumes always burgeoning, the unavoidable fact is that sometimes having more data only adds confusion instead of clarity, thereby becoming a distraction instead of helping you make a better decision.

Although accurate data is obviously preferable to inaccurate data, less-than-perfect data quality cannot be used as an excuse to delay making a business decision.  Even large amounts of high quality data will not guarantee high quality business decisions, just as high quality business decisions will not guarantee high quality business results.

In other words, overcoming GIGO will not guarantee DIDO success.

When it comes to the amount and quality of the data used to make business decisions, you can’t always get the data you want, and while you should always be data-driven, never only intuition-driven, eventually it has to become: Time to start deciding.

 

Related Posts

The Data-Decision Symphony

The Real Data Value is Business Insight

Is your data complete and accurate, but useless to your business?

DQ-View: From Data to Decision

TDWI World Conference Orlando 2010

The Asymptote of Data Quality

In analytic geometry (according to Wikipedia), an asymptote of a curve is a line such that the distance between the curve and the line approaches zero as they tend to infinity.  The inspiration for my hand-drawn illustration was a similar one (not related to data quality) in the excellent book Linchpin: Are You Indispensable? by Seth Godin, which describes an asymptote as:

“A line that gets closer and closer and closer to perfection, but never quite touches.”

“As you get closer to perfection,” Godin explains, “it gets more and more difficult to improve, and the market values the improvements a little bit less.  Increasing your free-throw percentage from 98 to 99 percent may rank you better in the record books, but it won’t win any more games, and the last 1 percent takes almost as long to achieve as the first 98 percent did.”
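Restating the Wikipedia definition above in the notation of analytic geometry: a line y = ax + b is an asymptote of a curve y = f(x) when the distance between them vanishes as x grows without bound, which is a formal way of saying the curve gets ever closer to the line without ever having to touch it:

```latex
\lim_{x \to \infty} \left| f(x) - (ax + b) \right| = 0
```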

The pursuit of data perfection is a common debate in data quality circles, where it is usually known by the motto:

“The data will always be entered right, the first time, every time.”

However, Henrik Liliendahl Sørensen has cautioned that even when this ideal can be achieved, we must still acknowledge the inconvenient truth that things change; Evan Levy has reminded us that data quality isn’t the same as data perfection; and David Loshin has used the Pareto principle to describe the point of diminishing returns in data quality improvements.

Chasing data perfection can be a powerful motivation, but it can also undermine the best of intentions.  Not only is it important to accept that the Asymptote of Data Quality can never be reached, but we must realize that data perfection was never the goal.

The goal is data-driven solutions for business problems—and these dynamic problems rarely have (or require) a perfect solution.

Data quality practitioners must strive for continuous data quality improvement, but always within the business context of data, and without losing themselves in the pursuit of a data-myopic ideal such as data perfection.

 

Related Posts

To Our Data Perfectionists

The Data-Decision Symphony

Is your data complete and accurate, but useless to your business?

Finding Data Quality

MacGyver: Data Governance and Duct Tape

You Can’t Always Get the Data You Want

What going to the dentist taught me about data quality

A Tale of Two Q’s

Data Quality and The Middle Way

Hyperactive Data Quality (Second Edition)

Missed It By That Much

The Data Quality Goldilocks Zone

Data and Process Transparency

Illustration via the SlideShare presentation: The Social Intranet

How do you know if you have poor data quality?

How do you know what your business processes and technology are doing to your data?

Waiting for poor data quality to reveal itself is like waiting until the bread pops up to see if you burnt your toast, at which point it is too late to save the bread—after all, it’s not like you can reactively cleanse the burnt toast.

Extending the analogy, let’s imagine that the business process is toasting, the technology is the toaster, and the data is the toast, which is being prepared for an end user.  (We could also imagine that the data is the bread and information is the toast.)

A more proactive approach to data quality begins with data and process transparency, which can help you monitor the quality of your data in much the same way as a transparent toaster could help you monitor your bread during the toasting process.

Performing data profiling and data quality assessments can provide insight into the quality of your data, but these efforts must include identifying the related business processes, technology, and end users of the data being analyzed.

However, the most important aspect is to openly share this preliminary analysis of the data, business, and technology landscape since it provides detailed insights about potential problems, which helps the organization better evaluate possible solutions.

Data and process transparency must also be maintained as improvement initiatives are implemented.  Regularly repeat the cycle of analysis and publication of its findings, which provides a feedback loop for tracking progress and keeping everyone informed.
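As a minimal sketch of how that repeated cycle of analysis and publication might be automated, assuming a hypothetical shared metrics file and column names, the snippet below recomputes a simple completeness metric on each run, compares it with the previous run, and reports the trend so everyone stays informed:

```python
import json
from datetime import date
from pathlib import Path

METRICS_FILE = Path("dq_metrics_history.json")  # hypothetical shared location

def completeness(rows, column):
    """Percentage of rows where the column is populated."""
    if not rows:
        return 0.0
    populated = sum(1 for row in rows if str(row.get(column, "")).strip())
    return round(100.0 * populated / len(rows), 2)

def publish_metric(column, value):
    """Append today's measurement to a shared history file and report the trend."""
    history = json.loads(METRICS_FILE.read_text()) if METRICS_FILE.exists() else []
    previous = history[-1]["value"] if history else None
    history.append({"date": date.today().isoformat(), "column": column, "value": value})
    METRICS_FILE.write_text(json.dumps(history, indent=2))
    if previous is None:
        return f"{column} completeness: {value}% (first measurement)"
    direction = "improved" if value > previous else "declined" if value < previous else "held steady"
    return f"{column} completeness: {value}% ({direction} from {previous}%)"

# Hypothetical usage with rows loaded from any source system.
rows = [{"email": "a@example.com"}, {"email": ""}, {"email": "b@example.com"}]
print(publish_metric("email", completeness(rows, "email")))
```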

The downside of transparency is that it can reveal how bad things are, but without this awareness, improvement is not possible.

 

Related Posts

Video: Oh, the Data You’ll Show!

Finding Data Quality

Why isn’t our data quality worse?

Days Without A Data Quality Issue

The Diffusion of Data Governance

Adventures in Data Profiling (Part 8)

Schrödinger’s Data Quality

Data Gazers

#FollowFriday Spotlight: @hlsdk

FollowFriday Spotlight is an OCDQ regular segment highlighting someone you should follow—and not just Fridays on Twitter.

Henrik Liliendahl Sørensen is a data quality and master data management (MDM) professional with over 30 years of experience in the information technology (IT) business, working within a wide range of business areas, such as government, insurance, manufacturing, membership, healthcare, and public transportation.

For more details about what Henrik has been, and is, working on, check out his My Been Done List and 2011 To Do List.

Henrik is also a charter member of the IAIDQ, and the creator of the LinkedIn Group for Data Matching for people interested in data quality and thrilled by automated data matching, deduplication, and identity resolution.

Henrik is one of the most prolific and popular data quality bloggers, regularly sharing his excellent insights about data quality, data matching, MDM, data architecture, data governance, diversity in data quality, and many other data management topics.

So check out Liliendahl on Data Quality for great blog posts written by Henrik Liliendahl Sørensen, such as these popular posts:

 

Related Posts

Delivering Data Happiness

#FollowFriday Spotlight: @DataQualityPro

#FollowFriday and Re-Tweet-Worthiness

#FollowFriday and The Three Tweets

Dilbert, Data Quality, Rabbits, and #FollowFriday

Twitter, Meaningful Conversations, and #FollowFriday

The Fellowship of #FollowFriday

Social Karma (Part 7) – Twitter

Connect Four and Data Governance

Connect Four was one of my favorite childhood games (I grew up in the early 1970s before video games and home computers).

The object of the game is to connect four of your checkers in a row, either vertically, horizontally, or diagonally, before your opponent can do the same with their checkers.  Hours of fun for ages 7 and up, as Milton Bradley would say.

Data Governance has its own version of Connect Four.

The central concept of data governance is its definition, implementation, and enforcement of policies, which connect four factors:

  1. People
  2. Business Process
  3. Technology
  4. Data

Data governance policies govern the complex interactions among people, business processes, technology, and data.  Data is a corporate asset because high quality data serves as a solid foundation for an organization’s success, empowering people, enabled by technology, to optimize business processes for superior business performance.

Connecting all four of these factors both vertically (within each business unit) and horizontally (across every business unit) is the only winning strategy for long-term success.

Data governance is not as simple (or as fun) as a board game, but if your data governance board doesn’t play Connect Four, then it could be Game Over for much more than just your data governance program:

Photo via Flickr by: Jeff Golden

 

Related Posts

Data Governance and the Social Enterprise

Podcast: Data Governance is Mission Possible

Quality and Governance are Beyond the Data

Video: Declaration of Data Governance

The Diffusion of Data Governance

Jack Bauer and Enforcing Data Governance Policies

The Prince of Data Governance

MacGyver: Data Governance and Duct Tape