HoardaBytes and the Big Data Lebowski


The recent #GartnerChat on Big Data was an excellent Twitter discussion about what I often refer to as the Seven Letter Tsunami of the data management industry.  As Gartner Research explains, although the term acknowledges the exponential growth, availability, and use of information in today’s data-rich landscape, big data is about more than just data volume.  Data variety (i.e., structured, semi-structured, and unstructured data, as well as other types, such as the sensor data emanating from the Internet of Things) and data velocity (i.e., how fast data is produced and how fast data must be processed to meet demand) are also key characteristics of the big challenges associated with the big buzzword that big data has become over the last year.

Since ours is an industry infatuated with buzzwords, Timo Elliott remarked “new terms arise because of new technology, not new business problems.  Big Data came from a need to name Hadoop [and other technologies now being relentlessly marketed as big data solutions], so anybody using big data to refer to business problems is quickly going to tie themselves in definitional knots.”

To which Mark Troester responded, “the hype of Hadoop is driving pressure on people to keep everything — but they ignore the difficulty in managing it.”  John Haddad then quipped that “big data is a hoarder’s dream,” which prompted Andy Bitterer to coin the term HoardaByte for measuring big data, and then to ask, “Would the real Big Data Lebowski please stand up?”

HoardaBytes

Although it’s probably no surprise that a blogger with obsessive-compulsive in the title of his blog would like Bitterer’s new term, the fact is that whether you choose to measure it in terabytes, petabytes, exabytes, HoardaBytes, or how much reality bitterly bites, our organizations have been compulsively hoarding data for a long time.

And with silos replicating data, and with new data and new types of data being created and stored on a daily basis, managing all of that data is becoming impractical.  Because we are too busy with the activity of trying to manage all of it, we are hoarding countless bytes of data without evaluating data usage, gathering data requirements, or planning for data archival.

The Big Data Lebowski

In The Big Lebowski, Jeff Lebowski (“The Dude”) is mistakenly identified as millionaire Jeffrey Lebowski (“The Big Lebowski”) in a classic data quality blunder caused by matching on person name only, all within the kind of eccentric plot expected from a Coen brothers film.  Since its release in the late 1990s, the film has become a cult classic and inspired a religious following known as Dudeism.
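To make the blunder concrete, here is a minimal sketch in Python, using hypothetical records and a deliberately crude name normalization, of how matching on person name only produces exactly this kind of false positive, while adding even one more attribute keeps the two Lebowskis apart:

# A minimal sketch of name-only matching, using hypothetical records.
records = [
    {"name": "Jeff Lebowski", "address": "Venice, California"},       # "The Dude" (hypothetical address)
    {"name": "Jeffrey Lebowski", "address": "Pasadena, California"},  # "The Big Lebowski" (hypothetical address)
]

def normalize_name(name):
    # Crude normalization that treats "Jeff" and "Jeffrey" as the same given name
    return name.lower().replace("jeffrey", "jeff")

match_on_name_only = normalize_name(records[0]["name"]) == normalize_name(records[1]["name"])
match_on_name_and_address = match_on_name_only and records[0]["address"] == records[1]["address"]

print(match_on_name_only)         # True  -- the classic false positive
print(match_on_name_and_address)  # False -- one additional attribute prevents the mistaken identity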

Historically, a big part of the problem in our industry has been the fact that the word “data” is prevalent in the names we have given industry disciplines and enterprise information initiatives.  For example, data architecture, data quality, data integration, data migration, data warehousing, master data management, and data governance — to name but a few.

However, all this achieved was to perpetuate the mistaken identification of data management as an esoteric technical activity that played little more than a minor, supporting, and often uncredited role within the business activities of our organizations.

But since the late 1990s, there has been a shift in the perception of data.  The real data deluge has not been the rising volume, variety, and velocity of data, but instead the rising awareness of the big impact that data has on nearly every aspect of our professional and personal lives.  In this brave new data world, companies like Google and Facebook have built business empires mostly out of our own personal data, which is why, like it or not, as individuals, we must accept that we are all data geeks now.

All of the hype about Big Data is missing the point.  The reality is that Data is Big — meaning that data has now so thoroughly pervaded mainstream culture that data has gone beyond being just a cult classic for the data management profession, and is now inspiring an almost religious following that we could call Dataism.

The Data must Abide

“The Dude abides.  I don’t know about you, but I take comfort in that,” remarked The Stranger in The Big Lebowski.

The Data must also abide.  And the Data must abide both the Business and the Individual.  The Data abides the Business if data proves useful to our business activities.  The Data abides the Individual if data protects the privacy of our personal activities.

The Data abides.  I don’t know about you, but I would take more comfort in that than in any solutions The Stranger Salesperson wants to sell me that utilize an eccentric sales pitch involving HoardaBytes and the Big Data Lebowski.

 

Related Posts

Dot Collectors and Dot Connectors

The attention blindness inherent in the digital age often leads to a debate about multitasking, which many claim impairs our ability to solve complex problems.  Therefore, we often hear that we need to adopt monotasking, i.e., we need to eliminate all possible distractions and focus our attention on only one task at a time.

However, during the recent Harvard Business Review podcast The Myth of Monotasking, Cathy Davidson, author of the new book Now You See It: How the Brain Science of Attention Will Transform the Way We Live, Work, and Learn, explained how “the moment that you start not paying attention fully to the task at hand, you actually start seeing other things that your attention would have missed.”  Although Davidson acknowledges that attention blindness is a serious problem, she explained that there really is no such thing as monotasking.  Modern neuroscience research has revealed that the human brain is, in fact, always multitasking.  Furthermore, she explained how multitasking can be extremely useful for a new and expansive form of attention.

“We all see selectively, but we don’t select the same things to see,” Davidson explained.  “So if we can learn to work together, we can actually account for, and productively work around, our own individual attention blindness by seeing collaboratively in a way that compensates for that blindness.”

During the podcast, an analogy was made that focusing attention on specific tasks can result in a lot of time spent collecting dots without spending enough time connecting those dots.  This point caused me to ponder the division of organizational labor that has historically existed between the dot collection of data management, which focuses on aspects such as data integrity and data quality, and the dot connection of business intelligence, which focuses on aspects such as data analysis and data visualization.

I think most data management professionals are dot collectors since it often seems like they spend a lot of their time, money, and attention on collecting (and profiling, modeling, cleansing, transforming, matching, and otherwise managing) data dots.

But since data’s value comes from data’s usefulness, merely collecting data dots doesn’t mean anything if you cannot connect those dots into meaningful patterns that enable your organization to take action or otherwise support your business activities.

So I think most business intelligence professionals are dot connectors since it often seems like they spend a lot of their time, money, and attention on connecting (and querying, aggregating, reporting, visualizing, and otherwise analyzing) data dots.

However, the attention blindness of data management and business intelligence professionals means that they see selectively, often intentionally selecting to not see the same things.  But as more of our personal and professional lives become digitized and pixelated, the big picture of the business world is inundated with the multifaceted challenges of big data, where the fast-moving large volumes of varying data are transforming the way we have to view traditional data management and business intelligence.

We need to replace our perspective of data management and business intelligence as separate monotasking activities with an expansive form of organizational multitasking where the dot collectors and dot connectors work together more collaboratively.

 

Related Posts

Channeling My Inner Beagle: The Case for Hyperactivity

Mind the Gap

The Wisdom of the Social Media Crowd

No Datum is an Island of Serendip

DQ-View: Data Is as Data Does

The Real Data Value is Business Insight

Information Overload Revisited

Neither the I Nor the T is Magic

The Big Data Collider

OCDQ Radio - Big Data and Big Analytics

OCDQ Radio - So Long 2011, and Thanks for All the . . .

The Interconnected User Interface

Scary Calendar Effects

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

During this episode, recorded on the first of three occurrences of Friday the 13th in 2012, I discuss scary calendar effects.

In other words, I discuss how schedules, deadlines, and other date-related aspects can negatively affect enterprise initiatives such as data quality, master data management, and data governance.

Please Beware: This episode concludes with the OCDQ Radio Theater production of Data Quality and Friday the 13th.

Popular OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • Demystifying Data Science — Guest Melinda Thielbar, a Ph.D. Statistician, discusses what a data scientist does and provides a straightforward explanation of key concepts such as signal-to-noise ratio, uncertainty, and correlation.
  • Data Quality and Big Data — Guest Tom Redman (aka the “Data Doc”) discusses Data Quality and Big Data, including if data quality matters less in larger data sets, and if statistical outliers represent business insights or data quality issues.
  • Demystifying Master Data Management — Guest John Owens explains the three types of data (Transaction, Domain, Master), the four master data entities (Party, Product, Location, Asset), and the Party-Role Relationship, which is where we find many of the terms commonly used to describe the Party master data entity (e.g., Customer, Supplier, Employee).
  • Data Governance Star Wars — Special Guests Rob Karel and Gwen Thomas joined this extended, and Star Wars themed, discussion about how to balance bureaucracy and business agility during the execution of data governance programs.
  • The Johari Window of Data Quality — Guest Martin Doyle discusses helping people better understand their data and assess its business impacts, not just the negative impacts of bad data quality, but also the positive impacts of good data quality.
  • Studying Data Quality — Guest Gordon Hamilton discusses the key concepts from recommended data quality books, including those which he has implemented in his career as a data quality practitioner.

DQ-View: Data Is as Data Does

Data Quality (DQ) View is an OCDQ regular segment.  Each DQ-View is a brief video discussion of a data quality key concept.


 


 

Previous DQ-View Videos

You can also watch a regularly updated page of my videos by clicking on this link: OCDQ Videos

DQ-View: Baseball and Data Quality

DQ-View: Occam’s Razor Burn

DQ-View: Roman Ruts on the Road to Data Governance

DQ-View: Talking about Data

DQ-View: The Poor Data Quality Blizzard

DQ-View: New Data Resolutions

DQ-View: From Data to Decision

DQ View: Achieving Data Quality Happiness

Data Quality is not a Magic Trick

DQ-View: The Cassandra Effect

DQ-View: Is Data Quality the Sun?

DQ-View: Designated Asker of Stupid Questions

Video: Oh, the Data You’ll Show!

So Long 2011, and Thanks for All the . . .

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

Don’t Panic!  Welcome to the mostly harmless OCDQ Radio 2011 Year in Review episode.  During this approximately 42 minute episode, I recap the data-related highlights of 2011 in a series of sometimes serious, sometimes funny, segments, as well as make wacky and wildly inaccurate data-related predictions about 2012.

Special thanks to my guests Jarrett Goldfedder, who discusses Big Data, Nicola Askham, who discusses Data Governance, and Daragh O Brien, who discusses Data Privacy.  Additional thanks to Rich Murnane and Dylan Jones.  And Deep Thanks to that frood Douglas Adams, who always knew where his towel was, and who wrote The Hitchhiker’s Guide to the Galaxy.


Redefining Data Quality

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

During this episode, I have an occasionally spirited discussion about data quality with Peter Perera, partially precipitated by his provocative post from this past summer, The End of Data Quality...as we know it, which included his proposed redefinition of data quality, as well as his perspective on the relationship of data quality to master data management and data governance.

Peter Perera is a recognized consultant and thought leader with significant experience in Master Data Management, Customer Relationship Management, Data Quality, and Customer Data Integration.  For over 20 years, he has been advising and working with Global 5000 organizations and mid-size enterprises to increase the usability and value of their customer information.


Making EIM Work for Business

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

During this episode, I discuss Enterprise Information Management (EIM) with John Ladley, the author of the excellent book Making EIM Work for Business, exploring what makes information management, not just useful, but valuable to the enterprise.

John Ladley is a business technology thought leader with 30 years of experience in improving organizations through the successful implementation of information systems.  He is a recognized authority in the use and implementation of business intelligence and enterprise information management.  John Ladley frequently writes and speaks on a variety of technology and enterprise information management topics.  His information management experience is balanced between strategic technology planning, project management, and, most important, the practical application of technology to business problems.


Two Weeks Before Christmas

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

Season’s Greetings fellow data management enthusiasts and welcome to a special holiday-themed episode of OCDQ Radio.

With the Christmas, Hanukkah, Kwanzaa, and Festivus seasons now upon us, I revisited my ‘Twas Two Weeks Before Christmas blog post from 2009, which is based on the poem A Visit from St. Nicholas.  During this brief podcast, I perform a recital.

The entire OCDQ Blog family wishes you and yours all the best during this holiday season and the coming new year.


You only get a Return from something you actually Invest in

In my previous post, I took a slightly controversial stance on a popular three-word phrase — Root Cause Analysis.  In this post, I take on another popular three-word phrase — Return on Investment (most commonly abbreviated as ROI).

What is the ROI of purchasing a data quality tool or launching a data governance program?

Zero.  Zip.  Zilch.  Intet.  Ingenting.  Rien.  Nada.  Nothing.  Nichts.  Niets.  Null.  Niente.  Bupkis.

There is No Such Thing as the ROI of purchasing a data quality tool or launching a data governance program.

Before you hire “The Butcher” to eliminate me for being The Man Who Knew Too Little about ROI, please allow me to explain.

Returns only come from Investments

Although you likely purchased a data quality tool because you have business-critical data quality problems, simply purchasing a tool is not an investment (unless you believe in Magic Beans) since the tool itself is not a solution.

You use tools to build, test, implement, and maintain solutions.  For example, I spent several hundred dollars on new power tools last year for a home improvement project.  However, I haven’t received any return on my home improvement investment for a simple reason — I still haven’t even taken most of the tools out of their packaging yet.  In other words, I barely even started my home improvement project.  It is precisely because I haven’t invested any time and effort that I haven’t seen any returns.  And it certainly isn’t going to help me (although it would help Home Depot) if I believed buying even more new tools was the answer.

Although you likely launched a data governance program because you have complex issues involving the intersection of data, business processes, technology, and people, simply launching a data governance program is not an investment since it does not conjure the three most important letters.

Data is only an Asset if Data is a Currency

In his book UnMarketing, Scott Stratten discusses this within the context of the ROI of social media (a commonly misunderstood aspect of social media strategy), but his insight is just as applicable to any discussion of ROI.  “Think of it this way: You wouldn’t open a business bank account and ask to withdraw $5,000 before depositing anything. The banker would think you are a loony.”

Yet, as Stratten explained, people do this all the time in social media by failing to build up what is known as social currency.  “You’ve got to invest in something before withdrawing. Investing your social currency means giving your time, your knowledge, and your efforts to that channel before trying to withdraw monetary currency.”

The same logic applies perfectly to data quality and data governance, where we could say it’s the failure to build up what I will call data currency.  You’ve got to invest in data before you can ever consider data an asset to your organization.  Investing your data currency means giving your time, your knowledge, and your efforts to data quality and data governance before trying to withdraw monetary currency (i.e., before trying to calculate the ROI of a data quality tool or a data governance program).

If you actually want to get a return on your investment, then actually invest in your data.  Invest in doing the hard daily work of continuously improving your data quality and putting into practice your data governance principles, policies, and procedures.

Data is only an asset if data is a currency.  Invest in your data currency, and you will eventually get a return on your investment.

You only get a return from something you actually invest in.

Related Posts

Can Enterprise-Class Solutions Ever Deliver ROI?

Do you believe in Magic (Quadrants)?

Which came first, the Data Quality Tool or the Business Need?

What Data Quality Technology Wants

A Farscape Analogy for Data Quality

The Data Quality Wager

“Some is not a number and soon is not a time”

The Dumb and Dumber Guide to Data Quality

There is No Such Thing as a Root Cause

Root cause analysis.  Most people within the industry, myself included, often discuss the importance of determining the root cause of data governance and data quality issues.  However, the complex cause-and-effect relationships underlying an issue mean that when an issue is encountered, you are often seeing only one of the numerous effects of its root cause (or causes).

In my post The Root! The Root! The Root Cause is on Fire!, I poked fun at those resistant to root cause analysis with the lyrics:

The Root! The Root! The Root Cause is on Fire!
We don’t want to determine why, just let the Root Cause burn.
Burn, Root Cause, Burn!

However, I think that the time is long overdue for even me to admit the truth — There is No Such Thing as a Root Cause.

Before you charge at me with torches and pitchforks for having an Abby Normal brain, please allow me to explain.

 

Defect Prevention, Mouse Traps, and Spam Filters

Some advocates of defect prevention claim that zero defects is not only a useful motivation, but also an attainable goal.  In my post The Asymptote of Data Quality, I quoted Daniel Pink’s book Drive: The Surprising Truth About What Motivates Us:

“Mastery is an asymptote.  You can approach it.  You can home in on it.  You can get really, really, really close to it.  But you can never touch it.  Mastery is impossible to realize fully.

The mastery asymptote is a source of frustration.  Why reach for something you can never fully attain?

But it’s also a source of allure.  Why not reach for it?  The joy is in the pursuit more than the realization.

In the end, mastery attracts precisely because mastery eludes.”

The mastery of defect prevention is sometimes distorted into a belief in data perfection, into a belief that we can not only build a better mousetrap, but also build a mousetrap that catches all the mice, or that by placing a mousetrap in our garage, which prevents mice from entering via the garage, we somehow also prevent mice from finding another way into our house.

Obviously, we can’t catch all the mice.  However, that doesn’t mean we should let the mice be like Pinky and the Brain:

Pinky: “Gee, Brain, what do you want to do tonight?”

The Brain: “The same thing we do every night, Pinky — Try to take over the world!”

My point is that defect prevention is not the same thing as defect elimination.  Defects evolve.  An excellent example of this is spam.  Even conservative estimates indicate that almost 80% of all e-mail sent worldwide is spam.  A similar percentage of blog comments are spam, and spam-generating bots are quite prevalent on Twitter and other micro-blogging and social networking services.  The inconvenient truth is that as we build better and better spam filters, spammers create better and better spam.

Just as mousetraps don’t eliminate mice and spam filters don’t eliminate spam, defect prevention doesn’t eliminate defects.

However, mousetraps, spam filters, and defect prevention are essential proactive best practices.

 

There are No Lines of Causation — Only Loops of Correlation

There are no root causes, only strong correlations.  And correlations are strengthened by continuous monitoring.  Believing there are root causes means believing continuous monitoring, and by extension, continuous improvement, has an end point.  I call this the defect elimination fallacy, which I parodied in song in my post Imagining the Future of Data Quality.

Knowing there are only strong correlations means knowing continuous improvement is an infinite feedback loop.  A practical example of this reality comes from data-driven decision making, where:

  1. Better Business Performance is often correlated with
  2. Better Decisions, which, in turn, are often correlated with
  3. Better Data, which is precisely why Better Decisions with Better Data is foundational to Business Success — however . . .

This does not mean that we can draw straight lines of causation between (3) and (1), (3) and (2), or (2) and (1).

Despite our preference for simplicity over complexity, if bad data were the root cause of bad decisions and/or bad business performance, then no organization would ever be profitable, and if good data were the root cause of good decisions and/or good business performance, then every organization could always be profitable.  Even if good data were a root cause, not just a correlation, and even when data perfection is temporarily achieved, the effects would still be ephemeral because not only do defects evolve, but so does the business world.  This evolution requires an endless revolution of continuous monitoring and improvement.

Many organizations implement data quality thresholds to close the feedback loop evaluating the effectiveness of their data management and data governance, but few implement decision quality thresholds to close the feedback loop evaluating the effectiveness of their data-driven decision making.
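As a minimal sketch of what closing both feedback loops might look like, the following Python example, using hypothetical threshold values and metric names, flags follow-up actions whenever the data quality score or the decision quality score falls below its threshold:

# A minimal sketch, with hypothetical thresholds and metric names, of closing both
# feedback loops: a data quality threshold on the data feeding decisions, and a
# decision quality threshold on the business results those decisions produce.

DATA_QUALITY_THRESHOLD = 0.95      # e.g., fraction of records passing validation rules
DECISION_QUALITY_THRESHOLD = 0.80  # e.g., fraction of decisions meeting their business targets

def review_feedback_loops(data_quality_score, decision_quality_score):
    """Return follow-up actions whenever either feedback loop falls below its threshold."""
    actions = []
    if data_quality_score < DATA_QUALITY_THRESHOLD:
        actions.append("Investigate data management: profile, cleanse, and re-validate the data.")
    if decision_quality_score < DECISION_QUALITY_THRESHOLD:
        actions.append("Investigate decision making: revisit the criteria, assumptions, and results.")
    return actions or ["Keep looping: continue monitoring both data quality and decision quality."]

# Hypothetical monitoring cycle: good data quality, but poor decision quality
print(review_feedback_loops(data_quality_score=0.97, decision_quality_score=0.72))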

The quality of a decision is determined by the business results it produces, not the person who made the decision, the quality of the data used to support the decision, or even the decision-making technique.  Of course, the reality is that business results are often not immediate and may sometimes be contingent upon the complex interplay of multiple decisions.

Even though evaluating decision quality only establishes a correlation, and not a causation, between the decision execution and its business results, it is still essential to continuously monitor data-driven decision making.

Although the business world will never be totally predictable, we cannot turn a blind eye to the need for data-driven decision-making best practices, nor to the reality that no best practice can eliminate the potential for poor data quality and decision quality, nor the potential for poor business results even despite better data quality and decision quality.  Central to continuous improvement is the importance of closing the feedback loops that make data-driven decisions more transparent through better monitoring, allowing the organization to learn from its decision-making mistakes and make adjustments when necessary.

We need to connect the dots of better business performance, better decisions, and better data by drawing loops of correlation.

 

Decision-Data Feedback Loop

Continuous improvement enables better decisions with better data, which drives better business performance — as long as you never stop looping the Decision-Data Feedback Loop, and start accepting that there is no such thing as a root cause.

I discuss this, and other aspects of data-driven decision making, in my DataFlux white paper, which is available for download (registration required) using the following link: Decision-Driven Data Management

 

Related Posts

The Root! The Root! The Root Cause is on Fire!

Bayesian Data-Driven Decision Making

The Role of Data Quality Monitoring in Data Governance

The Circle of Quality

Oughtn’t you audit?

The Dichotomy Paradox, Data Quality and Zero Defects

The Asymptote of Data Quality

To Our Data Perfectionists

Imagining the Future of Data Quality

What going to the Dentist taught me about Data Quality

DQ-Tip: “There is No Such Thing as Data Accuracy...”

The HedgeFoxian Hypothesis

Bayesian Data-Driven Decision Making

In his book Data Driven: Profiting from Your Most Important Business Asset, Thomas Redman recounts the story of economist John Maynard Keynes, who, when asked what he does when new data is presented that does not support his earlier decision, responded: “I change my opinion.  What do you do?”

“This is the way good decision makers behave,” Redman explained.  “They know that a newly made decision is but the first step in its execution.  They regularly and systematically evaluate how well a decision is proving itself in practice by acquiring new data.  They are not afraid to modify their decisions, even admitting they are wrong and reversing course if the facts demand it.”

Since he has a PhD in statistics, it’s not surprising that Redman explained effective data-driven decision making using Bayesian statistics, which is “an important branch of statistics that differs from classic statistics in the way it makes inferences based on data.  One of its advantages is that it provides an explicit means to quantify uncertainty, both a priori, that is, in advance of the data, and a posteriori, in light of the data.”
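As a minimal sketch of that a priori and a posteriori quantification of uncertainty, consider a hypothetical beta-binomial example in Python (the scenario and numbers are mine, not Redman’s), in which a prior belief about a decision’s success rate is updated as new outcome data arrives:

# A minimal sketch of Bayesian updating using a simple beta-binomial model.
# The scenario and numbers are hypothetical, not taken from Redman's book.

prior_alpha, prior_beta = 2, 2            # a priori belief: success rate near 0.5, fairly uncertain
prior_mean = prior_alpha / (prior_alpha + prior_beta)

successes, failures = 14, 6               # new data acquired while evaluating the decision in practice

post_alpha = prior_alpha + successes      # conjugate beta update
post_beta = prior_beta + failures
posterior_mean = post_alpha / (post_alpha + post_beta)

print(f"Prior mean success rate:     {prior_mean:.2f}")      # 0.50
print(f"Posterior mean success rate: {posterior_mean:.2f}")  # 0.67 -- the new data changes the opinion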

Good decision makers, Redman explained, follow at least three Bayesian principles:

  1. They bring as much of their prior experience as possible to bear in formulating their initial decision spaces and determining the sorts of data they will consider in making the decision.
  2. For big, important decisions, they adopt decision criteria that minimize the maximum risk.
  3. They constantly evaluate new data to determine how well a decision is working out, and they do not hesitate to modify the decision as needed.

A key concept of statistical process control and continuous improvement is the importance of closing the feedback loop that allows a process to monitor itself, learn from its mistakes, and adjust when necessary.

The importance of building feedback loops into data-driven decision making is too often ignored.

I discuss this, and other aspects of data-driven decision making, in my DataFlux white paper, which is available for download (registration required) using the following link: Decision-Driven Data Management

 

Related Posts

Decision-Driven Data Management

The Speed of Decision

The Big Data Collider

A Decision Needle in a Data Haystack

The Data-Decision Symphony

Thaler’s Apples and Data Quality Oranges

Satisficing Data Quality

Data Confabulation in Business Intelligence

The Data that Supported the Decision

Data Psychedelicatessen

OCDQ Radio - Big Data and Big Analytics

OCDQ Radio - Good-Enough Data for Fast-Enough Decisions

The Circle of Quality

A Farscape Analogy for Data Quality

OCDQ Radio - Organizing for Data Quality

No Datum is an Island of Serendip

Continuing a series of blog posts inspired by the highly recommended book Where Good Ideas Come From by Steven Johnson, in this blog post I want to discuss the important role that serendipity plays in data — and, by extension, business success.

Let’s start with a brief etymology lesson.  The origin of the word serendipity, which is commonly defined as a “happy accident” or “pleasant surprise,” can be traced to the Persian fairy tale The Three Princes of Serendip, whose heroes were always making discoveries, either by accident or by sagacity (i.e., the ability to link together apparently innocuous facts to come to a valuable conclusion), of things they were not in quest of.  Serendip was an old name for the island nation now known as Sri Lanka.

“Serendipity,” Johnson explained, “is not just about embracing random encounters for the sheer exhilaration of it.  Serendipity is built out of happy accidents, to be sure, but what makes them happy is the fact that the discovery you’ve made is meaningful to you.  It completes a hunch, or opens up a door in the adjacent possible that you had overlooked.  Serendipitous discoveries often involve exchanges across traditional disciplines.  Serendipity needs unlikely collisions and discoveries, but it also needs something to anchor those discoveries.  The challenge, of course, is how to create environments that foster these serendipitous connections.”

 

No Datum is an Island of Serendip

“No man is an island, entire of itself; every man is a piece of the continent, a part of the main.”

These famous words were written by the poet John Donne, the meaning of which is generally regarded to be that human beings do not thrive when isolated from others.  Likewise, data does not thrive in isolation.  However, many organizations persist in data isolation, in data silos created when separate business units see power in the hoarding of data, not in the sharing of data.

But no business unit is an island, entire of itself; every business unit is a piece of the organization, a part of the enterprise.

Likewise, no datum is an Island of Serendip.  Data thrives through the connections, collisions, and combinations that collectively unleash serendipity.  When data is exchanged across organizational boundaries, and shared with the entire enterprise, it enables the interdisciplinary discoveries required for making business success more than just a happy accident or pleasant surprise.

Our organizations need to create collaborative environments that foster serendipitous connections bringing all of our business units and people together around our shared data assets.  We need to transcend our organizational boundaries, reduce our data silos, and gather our enterprise’s heroes together on the Data Island of Serendip — our United Nation of Business Success.

 

Related Posts

Data Governance and the Adjacent Possible

The Three Most Important Letters in Data Governance

The Stakeholder’s Dilemma

The Data Cold War

Turning Data Silos into Glass Houses

The Good Data

DQ-BE: Single Version of the Time

My Own Private Data

Sharing Data

Are you Building Bridges or Digging Moats?

The Collaborative Culture of Data Governance

The Interconnected User Interface