Jim Harris

My name is Jim Harris.  I am the Blogger-in-Chief of OCDQ Blog, and an independent consultant, speaker, and freelance writer for hire.

Tuesday, May 22, 2012

Information Asymmetry versus Empowered Customers

Information asymmetry is a term from economics describing how one party involved in a transaction typically has more or better information than the other party.  Perhaps the easiest example of information asymmetry is retail sales, where historically the retailer has always had more or better information than the customer about a product that is about to be purchased.

Generally speaking, information asymmetry is advantageous for the retailer, allowing them to manipulate the customer into purchasing products that benefit the retailer’s goals (e.g., maximizing profit margins or unloading excess inventory) more than the customer’s goals (e.g., paying a fair price or buying the product that best suits their needs).  I don’t mean to demonize the retail industry, but for a long time, I’m pretty sure its unofficial motto was: “An uninformed customer is the best customer.”

Let’s consider the example of purchasing a high-definition television (HDTV) since it demonstrates how information asymmetry is not always about holding back useful information, but also bombarding customers with useless information.  In this example, it’s about bombarding customers with useless technical jargon, such as refresh rate, resolution, and contrast ratio.

To an uninformed customer, it certainly sounds like it makes sense that the HDTV with a 240Hz refresh rate, 1080p resolution, and 2,000,000:1 contrast ratio is better than the one with a 120Hz refresh rate, 720p resolution, and 1,000,000:1 contrast ratio.

After all, 240 > 120, 1080 > 720, and 2,000,000 > 1,000,000, right?  Yes — but what do any of those numbers actually mean?

The reality is that refresh rate, resolution, and contrast ratio are just three examples of useless HDTV specifications because they essentially provide no meaningful information about the video quality of the television.  This information is advantageous to only one party involved in the transaction — the retailer — since it appears to justify the higher price of an allegedly better product.

But nowadays fewer customers are falling for these tricks.  Performing a quick Internet search, either before going shopping or on their mobile phone while at the store, is balancing out some of the information asymmetry in retail sales and empowering customers to make better purchasing decisions.  With the increasing availability of broadband Internet and mobile connectivity, today’s empowered customer arrives at the retail front lines armed and ready to do battle with information asymmetry.

The empowered customer changes the balance of power in the retail industry.  Is your business ready for Smarter Commerce? Join the conversation on the IBM Mid-Market Smarter Commerce Twitter chat on Wednesday, May 23 at 2:00PM EST.  Panelists will include some of the top Mid-Market influencers in the industry.  IBM Experts, business partners, business owners and managers are all encouraged to join in, ask questions, and share their knowledge in a relaxed atmosphere.  You can follow the chat on Twitter using the hashtag #mmSCchat, or log on and access it on twebevent: twebevent.com/mmSCchat

 

This post was written as part of the IBM for Midsize Business program, which provides midsize businesses with the tools, expertise and solutions they need to become engines of a smarter planet.

 

Thursday, May 17, 2012

The Data Quality Placebo

Inspired by a recent Boing Boing blog post

Are you suffering from persistent and annoying data quality issues?  Or are you suffering from the persistence of data quality tool vendors and consultants annoying you with sales pitches about how you must be suffering from persistent data quality issues?

Either way, the Data Division of Prescott Pharmaceuticals (trusted makers of gastroflux, datamine, selectium, and qualitol) is proud to present the perfect solution to all of your real and/or imaginary data quality issues — The Data Quality Placebo.

Simply take two capsules (made with an easy-to-swallow coating) every morning and you will be guaranteed to experience:

“Zero Defects with Zero Side Effects” TM

(Legal Disclaimer: Zero Defects with Zero Side Effects may be the result of Zero Testing, which itself is probably just a side effect of The Prescott Promise: “We can promise you that we will never test any of our products on animals because . . . we never test any of our products.”)

Tuesday, May 15, 2012

How Data Cleansing Saves Lives

When it comes to data quality best practices, it’s often argued, and sometimes quite vehemently, that proactive defect prevention is far superior to reactive data cleansing.  Advocates of defect prevention sometimes admit that data cleansing is a necessary evil.  However, at least in my experience, most of the time they conveniently, and ironically, cleanse (i.e., drop) the word necessary.

Therefore, I thought I would share a story about how data cleansing saves lives, which I read about in the highly recommended book Space Chronicles: Facing the Ultimate Frontier by Neil deGrasse Tyson.  “Soon after the Hubble Space Telescope was launched in April 1990, NASA engineers realized that the telescope’s primary mirror—which gathers and reflects the light from celestial objects into its cameras and spectrographs—had been ground to an incorrect shape.  In other words, the two-billion dollar telescope was producing fuzzy images.  That was bad.  As if to make lemonade out of lemons, though, computer algorithms came to the rescue.  Investigators at the Space Telescope Science Institute in Baltimore, Maryland, developed a range of clever and innovative image-processing techniques to compensate for some of Hubble’s shortcomings.”

In other words, since it would be three years before Hubble’s faulty optics could be repaired during a 1993 space shuttle mission, data cleansing allowed astrophysicists to make good use of Hubble despite the bad data quality of its early images.

So, data cleansing algorithms saved Hubble’s fuzzy images — but how did this data cleansing actually save lives?

“Turns out,” Tyson explained, “maximizing the amount of information that could be extracted from a blurry astronomical image is technically identical to maximizing the amount of information that can be extracted from a mammogram.  Soon the new techniques came into common use for detecting early signs of breast cancer.”

“But that’s only part of the story.  In 1997, for Hubble’s second servicing mission, shuttle astronauts swapped in a brand-new, high-resolution digital detector—designed to the demanding specifications of astrophysicists whose careers are based on being able to see small, dim things in the cosmos.  That technology is now incorporated in a minimally invasive, low-cost system for doing breast biopsies, the next stage after mammograms in the early diagnosis of cancer.”

Even though defect prevention was eventually implemented to prevent data quality issues in Hubble’s images of outer space, those interim data cleansing algorithms are still being used today to help save countless human lives here on Earth.

So, at least in this particular instance, we have to admit that data cleansing is a necessary good.

 

Related Posts

Hyperactive Data Quality (Second Edition)

A Tale of Two Q’s

What going to the dentist taught me about data quality

Paleolithic Rhythm and Data Quality

Groundhog Data Quality Day

The Dichotomy Paradox, Data Quality and Zero Defects

The Asymptote of Data Quality

To Our Data Perfectionists

Finding Data Quality

Data Quality and The Middle Way

There is No Such Thing as a Root Cause

Data Quality and Miracle Exceptions

Thursday, May 10, 2012

The Diffusion of the Consumerization of IT

This blog post is sponsored by the Enterprise CIO Forum and HP.

On a previous post about the consumerization of IT, Paul Calento commented: “Clearly, it’s time to move IT out of a discrete, defined department and out into the field, even more than already.  Likewise, solutions used to power an organization need to do the same thing.  Problem is, though, that it’s easy to say that embedding IT makes sense (it does), but there’s little experience with managing it (like reporting and measurement).  Services integration is a goal, but cross-department, cross-business-unit integration remains a thorn in the side of many attempts.”

Embedding IT does make sense, and not only is it easier said than done, let alone done well, but part of the problem within many organizations is that IT became partially self-embedded within some business units while the IT department was resisting the consumerization of IT because they treated it like a fad and not an innovation.  And now those business units are resisting the efforts of the redefined IT department because they fear losing the IT capabilities that consumerization has already given them.

This growing IT challenge brings to mind the Diffusion of Innovations theory developed by Everett Rogers for describing the five stages for the rate at which innovations (e.g., new ideas or technology trends) spread within cultures, such as organizations, starting with the Innovators and Early Adopters, progressing through the Early and Late Majority, and trailed by the Laggards.

A related concept called Crossing the Chasm was developed by Geoffrey Moore to describe the critical phenomenon occurring when enough of the Early Adopters have embraced the innovation so that the beginning of the Early Majority becomes an almost certainty even though mainstream adoption of the innovation is still far from guaranteed.

From my perspective, traditional IT departments are just now crossing the chasm of the diffusion of the consumerization of IT, and are conflicting with the business units that crossed the chasm long ago with their direct adoption of cloud computing, SaaS, and mobility solutions not provided by the IT department.  This divergence caused by the IT department and some business units being on different sides of the chasm has damaged, and potentially irreparably, some aspects of the IT-Business partnership.

The longer the duration of this divergence, the more difficult it will be for an IT department, that has finally crossed the chasm, to redefine their role and remain relevant partners with those business units that, perhaps for the first time in the organization’s history, were ahead of the information technology adoption curve.  Additionally, even the communication and collaboration across business units is negatively affected by different business units crossing the IT consumerization chasm at different times, which often, as Paul Calento noted, complicates the organization’s attempts to integrate cross-business-unit IT services.


 

Related Posts

Serving IT with a Side of Hash Browns

The IT Consumerization Conundrum

The IT Prime Directive of Business First Contact

The UX Factor

A Swift Kick in the AAS

Shadow IT and the New Prometheus

The Diderot Effect of New Technology

Are Cloud Providers the Bounty Hunters of IT?

The IT Pendulum and the Federated Future of IT

Suburban Flight, Technology Sprawl, and Garage IT

Tuesday, May 8, 2012

Data Quality and the Q Test

In psychology, there’s something known as the Q Test, which asks you to use one of your fingers to trace an upper case letter Q on your forehead.  Before reading this blog post any further, please stop and perform the Q Test on your forehead right now.

 

Essentially, there are only two ways you can complete the Q Test, which are differentiated by how you trace the tail of the Q.  Most people start by tracing a letter O, and then complete the Q by tracing its tail either toward their right eye or toward their left eye.

If you trace the tail of the Q toward your right eye, you’re imagining what a letter Q would look like from your perspective.  But if you trace the tail of the Q toward your left eye, you’re imagining what it would look like from the perspective of another person.

Basically, the point of the Q Test is to determine whether or not you have a natural tendency to consider the perspective of others.

Although considering the perspective of others is a positive under different circumstances, if you traced the letter Q with its tail toward your left eye, psychologists say that you failed the Q Test since it reveals a negative — you’re a good liar.  The reason why is that you have to be good at considering the perspective of others in order to be good at deceiving them with a believable lie.

So, as I now consider your perspective, dear reader, I bet you’re wondering: What does the Q Test have to do with data quality?

Like truth, beauty, and art, data quality can be said to be in the eyes of the beholder, or when data quality is defined, as it most often is, as fitness for the purpose of use — the eyes of the user.  But since most data has both multiple uses and users, data fit for the purpose of one use or user may not be fit for the purpose of other uses and users.  However, these multiple perspectives are considered irrelevant from the perspective of an individual user, who just needs quality data fit for the purpose of their own use.
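The idea that the same data can be fit for one purpose and unfit for another can be sketched in a few lines of Python.  This is only an illustration: the customer record, the two purposes, and the check functions are all hypothetical and not taken from any real system.

```python
# Hypothetical customer record: it has an email address but no postal address.
customer = {"name": "Pat Smith", "email": "pat@example.com", "postal_address": ""}


def fit_for_email_campaign(record):
    """An email marketer only needs a non-empty email address."""
    return bool(record.get("email"))


def fit_for_direct_mail(record):
    """A direct-mail user only needs a non-empty postal address."""
    return bool(record.get("postal_address"))


# Each purpose (user) brings its own definition of quality to the same record.
purposes = {
    "email campaign": fit_for_email_campaign,
    "direct mail": fit_for_direct_mail,
}

assessment = {purpose: check(customer) for purpose, check in purposes.items()}
print(assessment)
```

The same record passes one check and fails the other, which is why a standard written from a single user's perspective rarely satisfies everyone else.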

The good news is that when it comes to data quality, most of us pass the Q Test, which means we’re not good liars.  The bad news is that since most of us pass the Q Test, we’re often only concerned about our own perspective about data quality, which is why so many organizations struggle to define data quality standards.

At the next discussion about your organization’s data quality standards, try inviting the participants to perform the Q Test.

 

Related Posts

The Point of View Paradox

You Say Potato and I Say Tater Tot

Data Myopia and Business Relativity

Beyond a “Single Version of the Truth”

DQ-BE: Single Version of the Time

Data and the Liar’s Paradox

The Fourth Law of Data Quality

Plato’s Data

Once Upon a Time in the Data

The Idea of Order in Data

Hell is other people’s data

Song of My Data

 

Related OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • Redefining Data Quality — Guest Peter Perera discusses his proposed redefinition of data quality, as well as his perspective on the relationship of data quality to master data management and data governance.
  • Organizing for Data Quality — Guest Tom Redman (aka the “Data Doc”) discusses how your organization should approach data quality, including his call to action for your role in the data revolution.
  • The Johari Window of Data Quality — Guest Martin Doyle discusses helping people better understand their data and assess its business impacts, not just the negative impacts of bad data quality, but also the positive impacts of good data quality.
  • Studying Data Quality — Guest Gordon Hamilton discusses the key concepts from recommended data quality books, including those which he has implemented in his career as a data quality practitioner.

Friday, May 4, 2012

Two Flaws in the “Fail Faster” Philosophy

There are many who advocate that the key to success, especially with innovation, is what’s known as the “fail faster” philosophy, which says that not only should we embrace new ideas and try new things without being overly concerned with failure, but, more importantly, we should effectively fail as efficiently as possible in order to expedite learning valuable lessons from our failure.

However, I have often experienced what I see as two fundamental flaws in the “fail faster” philosophy:

  1. It requires that you define failure
  2. It requires that you admit when you have failed

Most people — myself included — often fail both of these requirements.  Most people do not define failure, but instead assume that they will be successful (even though they conveniently do not define success either).  But even when people define failure, they often refuse to admit when they have failed.  In the face of failure, most people either redefine failure or extend the deadline (perhaps we should call it the fail line?) for when they will have to admit that they have failed.

We are often regaled with stories of persistence in spite of repeated failure, such as Thomas Edison’s famous remark:

“Many of life’s failures are people who did not realize how close they were to success when they gave up.”

Edison also remarked that he didn’t invent one way to make a lightbulb, but instead he invented more than 1,000 ways how not to make a lightbulb.  Each of those failed prototypes for a commercially viable lightbulb was instructive and absolutely essential to his eventual success.  But what if Edison had refused to define and admit failure?  How would he have known when to abandon one prototype and try another?  How would he have been able to learn valuable lessons from his repeated failure?

Josh Linkner recently blogged about failure being the dirty little secret of so-called overnight success, citing several examples, including Rovio (makers of the Angry Birds video game), Dyson vacuum cleaners, and WD-40.

Although these are definitely inspiring success stories, my concern is that often the only failure stories we hear are about people and companies that became famous for eventually succeeding.  In other words, we often hear eventually successful stories, and we almost never hear, or simply choose to ignore, the more common, and perhaps more useful, cautionary tales of abject failure.

It seems we have become so obsessed with telling stories that we have relegated both failure and success to the genre of fiction, which I fear is preventing us from learning any fact-based, and therefore truly valuable, lessons about failure and success.

 

Related Posts

The Winning Curve

Persistence

Mistake Driven Learning

The Fragility of Knowledge

The Wisdom of Failure

Thursday, April 19, 2012

Talking Business about the Weather

Businesses of all sizes are always looking for ways to increase revenue, decrease costs, and operate more efficiently.  When I talk with midsize business owners, I hear the typical questions.  Should we hire a developer to update our website and improve our SEO rankings?  Should we invest less money in traditional advertising and invest more time in social media?  After discussing these and other business topics for a while, we drift into that standard conversational filler — talking about the weather.

But since I am always interested in analyzing data from as many different perspectives as possible, when I talk about the weather, I ask midsize business owners how big a role the weather plays in their business.  Does the weather affect the number of customers that visit your business on a daily basis?  Do customers purchase different items when the weather is good versus bad?

I usually receive quick responses, but when I ask if those responses were based on analyzing sales data alongside weather data, the answer is usually no, which is understandable since businesses are successful when they can focus on their core competencies, and for most businesses, analytics is not a core competency.  The demands of daily operations often prevent midsize businesses from stepping back and looking at things differently, like whether or not there’s a hidden connection between weather and sales.
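As a rough illustration of what analyzing sales data alongside weather data could look like, here is a minimal Python sketch.  Everything in it is an assumption made for the example: the daily figures are invented, the names (`sales_by_date`, `rain_mm_by_date`, `paired_values`, `pearson`) are hypothetical, and the Pearson correlation is just one simple way to look for a connection.

```python
from statistics import mean

# Invented daily figures, keyed by date (not real data).
sales_by_date = {
    "2012-04-02": 1250.0,
    "2012-04-03": 900.0,
    "2012-04-04": 1300.0,
    "2012-04-05": 700.0,
    "2012-04-06": 1100.0,
}
rain_mm_by_date = {
    "2012-04-02": 0.0,
    "2012-04-03": 8.0,
    "2012-04-04": 0.0,
    "2012-04-05": 15.0,
    "2012-04-07": 2.0,
}


def paired_values(sales, weather):
    """Line up the two data sets on the dates they have in common."""
    shared = sorted(sales.keys() & weather.keys())
    return [weather[d] for d in shared], [sales[d] for d in shared]


def pearson(xs, ys):
    """Pearson correlation coefficient; assumes neither list is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rain, sales = paired_values(sales_by_date, rain_mm_by_date)
r = pearson(rain, sales)
print(f"days compared: {len(rain)}, correlation: {r:.2f}")
```

A strongly negative value would suggest that rainy days go with lower sales in this made-up sample, but a handful of days proves nothing, and correlation alone does not show that the weather caused the difference.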

One of my favorite books is Freakonomics: A Rogue Economist Explores the Hidden Side of Everything by Steven Levitt and Stephen Dubner.  The book, as well as its sequel, podcast, and movie, provides good examples of one of the common challenges facing data science, and more specifically predictive analytics, since its predictions often seem counterintuitive to business leaders, whose intuition is rightfully based on their business expertise, which has guided their business success to date.  The reality is that even organizations that pride themselves on being data driven naturally resist any counterintuitive insights found in their data.

Dubner was recently interviewed by Crysta Anderson about how organizations can find insights in their data if they are willing and able to ask good questions.  Of course, it’s not always easy to determine what a good question would be.  But sometimes something as simple as talking about the weather when you’re talking business could lead to a meaningful business insight.

 

This post was written as part of the IBM for Midsize Business program, which provides midsize businesses with the tools, expertise and solutions they need to become engines of a smarter planet.

 

Tuesday, April 17, 2012

Solvency II and Data Quality

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

During this episode, Ken O’Connor and I discuss the Solvency II standards for data quality, and how its European insurance regulatory requirement of “complete, appropriate, and accurate” data represents common sense standards for all businesses.

Ken O’Connor is an independent data consultant with over 30 years of hands-on experience in the field, specializing in helping organizations meet the data quality management challenges presented by data-intensive programs such as data conversions, data migrations, data population, and regulatory compliance such as Solvency II, Basel II / III, Anti-Money Laundering, the Foreign Account Tax Compliance Act (FATCA), and the Dodd–Frank Wall Street Reform and Consumer Protection Act.

Ken O’Connor also provides practical data quality and data governance advice on his popular blog at: kenoconnordata.com

 


Related OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • Data Governance Star Wars — Special Guests Rob Karel and Gwen Thomas joined this extended, and Star Wars themed, discussion about how to balance bureaucracy and business agility during the execution of data governance programs.
  • Organizing for Data Quality — Guest Tom Redman (aka the “Data Doc”) discusses how your organization should approach data quality, including his call to action for your role in the data revolution.
  • The Johari Window of Data Quality — Guest Martin Doyle discusses helping people better understand their data and assess its business impacts, not just the negative impacts of bad data quality, but also the positive impacts of good data quality.
  • Studying Data Quality — Guest Gordon Hamilton discusses the key concepts from recommended data quality books, including those which he has implemented in his career as a data quality practitioner.
  • The Fall Back Recap Show — A look back at the Best of OCDQ Radio, including discussions about Data, Information, Business-IT Collaboration, Change Management, Big Analytics, Data Governance, and the Data Revolution.

Thursday, April 12, 2012

Pitching Perfect Data Quality

In my previous post, I used a baseball metaphor to explain why we should strive for a quality start to our business activities by starting them off with good data quality, thereby giving our organization a better chance to succeed.

Since it’s a beautiful week for baseball metaphors, let’s post two!  (My apologies to Ernie Banks.)

If good data quality gives our organization a better chance to succeed, then it seems logical to assume that perfect data quality would give our organization the best chance to succeed.  However, as Yogi Berra said: “If the world were perfect, it wouldn’t be.”

My previous baseball metaphor was based on a statistic that measured how well a starting pitcher performs during a game.  The best possible performance of a starting pitcher is called a perfect game, when nine innings are perfectly completed by retiring the minimum of 27 opposing batters without allowing any hits, walks, hit batsmen, or batters reaching base due to a fielding error.

Although a lot of buzz is generated when a pitcher gets close to pitching a perfect game (e.g., usually after five perfect innings, it’s all the game’s announcers will talk about), there have been only 20 perfect games in the 143 years of Major League Baseball history, during which approximately 200,000 games have been played, making it one of the rarest statistical events in baseball.
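To put that rarity in numbers, here is a quick back-of-the-envelope calculation in Python.  It uses only the two figures quoted above, and the 200,000 games figure is approximate, so the result is a rough estimate rather than an official statistic.

```python
# Figures quoted in the post; games_played is an approximation.
perfect_games = 20
games_played = 200_000

# Share of all games that were perfect games.
rate = perfect_games / games_played

# Equivalent "one in N games" framing.
one_in = games_played // perfect_games

print(f"{rate:.4%} of games, or roughly 1 in {one_in:,}")
```

In other words, on these figures roughly one game in every ten thousand has been a perfect game.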

When a pitcher loses the chance of pitching a perfect game, does his team forfeit the game?  No, of course not.  Because the pitcher’s goal is not pitching perfectly.  The pitcher’s (and every other player’s) goal is helping the team win the game.

This is why I have never been a fan of anyone who is pitching perfect data quality, i.e., anyone advocating data perfection as the organization’s goal.  The organization’s goal is business success.  Data quality has a role to play, but claiming business success is impossible without having perfect data quality is like claiming winning in baseball is impossible without pitching a perfect game.

 

Related Posts

DQ-View: Baseball and Data Quality

The Dichotomy Paradox, Data Quality and Zero Defects

The Asymptote of Data Quality

To Our Data Perfectionists

Data Quality and The Middle Way

There is No Such Thing as a Root Cause

OCDQ Radio - The Johari Window of Data Quality

Data Quality and Miracle Exceptions

Data Quality: Quo Vadimus?

Tuesday, April 10, 2012

Quality Starts and Data Quality

This past week was the beginning of the 2012 Major League Baseball (MLB) season.  Since its data is mostly transaction data describing the statistical events of games played, baseball has long been a sport obsessed with statistics.  Baseball statisticians slice and dice every aspect of past games attempting to discover trends that could predict what is likely to happen in future games.

There are too many variables involved in determining which team will win a particular game to be able to choose a single variable that predicts game results.  But a few key statistics are cited by baseball analysts as general guidelines of a team’s potential to win.

One such statistic is a quality start, which is defined as a game in which a team’s starting pitcher completes at least six innings and permits no more than three earned runs.  Of course, a so-called quality start is no guarantee that the starting pitcher’s team will win the game.  But the relative reliability of the statistic to predict a game’s result causes some baseball analysts to refer to a loss suffered by a pitcher in a quality start as a tough loss and a win earned by a pitcher in a non-quality start as a cheap win.
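The definitions above are simple enough to write down as code.  The following Python sketch is only an illustration: the `Start` class, its field names, and the `classify` function are hypothetical, and the pitcher's decision is passed in as a plain string rather than derived from a real box score.

```python
from dataclasses import dataclass


@dataclass
class Start:
    innings_completed: int  # full innings completed by the starting pitcher
    earned_runs: int        # earned runs the starting pitcher permitted
    decision: str           # "win", "loss", or "none" for the pitcher


def is_quality_start(start):
    """At least six innings completed and no more than three earned runs."""
    return start.innings_completed >= 6 and start.earned_runs <= 3


def classify(start):
    """Label a start as a tough loss, a cheap win, or its plain decision."""
    quality = is_quality_start(start)
    if start.decision == "loss" and quality:
        return "tough loss"
    if start.decision == "win" and not quality:
        return "cheap win"
    return start.decision


print(classify(Start(innings_completed=7, earned_runs=2, decision="loss")))
print(classify(Start(innings_completed=5, earned_runs=4, decision="win")))
```

The first example is a quality start that still ended in a loss, and the second is a win earned without a quality start, which are exactly the two exceptions the labels describe.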

There are too many variables involved in determining if a particular business activity will succeed to be able to choose a single variable that predicts business results.  But data quality is one of the general guidelines of an organization’s potential to succeed.

As Henrik Liliendahl Sørensen blogged, organizations are capable of achieving success with their business activities despite bad data quality, which we could call the business equivalent of cheap wins.  And organizations are also capable of suffering failure with their business activities despite good data quality, which we could call the business equivalent of tough losses.

So just like a quality start is no guarantee of a win in baseball, good data quality is no guarantee of a success in business.

But perhaps the relative reliability of data quality to predict business results should influence us to at least strive for a quality start to our business activities by starting them off with good data quality, thereby giving our organization a better chance to succeed.

 

Related Posts

DQ-View: Baseball and Data Quality

Poor Quality Data Sucks

Fantasy League Data Quality

There is No Such Thing as a Root Cause

Data Quality: Quo Vadimus?

OCDQ Radio - The Johari Window of Data Quality

OCDQ Radio - Redefining Data Quality

OCDQ Radio - The Blue Box of Information Quality

OCDQ Radio - Studying Data Quality

OCDQ Radio - Organizing for Data Quality

Thursday, April 5, 2012

Will Big Data be Blinded by Data Science?

All of the hype about Big Data is also causing quite the hullabaloo about hiring Data Scientists in order to help your organization derive business value from big data analytics.  But even though we are still in the hype and hullabaloo stages, these unrelenting trends are starting to rightfully draw the attention of businesses of all sizes.  After all, the key word in big data isn’t big, because, in our increasingly data-constructed world, big data is no longer just for big companies and high-tech firms.

And since the key word in data scientist isn’t data, in this post I want to focus on the second word in today’s hottest job title.

When I think of a scientist of any kind, I immediately think of the scientific method, which has been the standard operating procedure of scientific discovery since the 17th century.  First, you define a question, gather some initial data, and form a hypothesis, which is some idea about how to answer your question.  Next, you perform an experiment to test the hypothesis, during which more data is collected.  Then, you analyze the experimental data and evaluate your results.  Whether the experiment confirmed or contradicted your hypothesis, you do the same thing: repeat the experiment.  That is because a hypothesis can only be promoted to a theory after repeated experimentation (including by others) consistently produces the same result.

During experimentation, failure happens just as often as, if not more often than, success.  However, both failure and success have long played an important role in scientific discovery because progress in either direction is still progress.

Therefore, experimentation is an essential component of scientific discovery — and data science is certainly no exception.

“Designed experiments,” Melinda Thielbar recently blogged, “is where we’ll make our next big leap for data science.”  I agree, but with the notable exception of A/B testing in marketing, most business activities generally don’t embrace data experimentation.

“The purpose of science,” Tom Redman recently explained, “is to discover fundamental truths about the universe.  But we don’t run our businesses to discover fundamental truths.  We run our businesses to serve a customer, gain marketplace advantage, or make money.”  In other words, the commercial application of science has more to do with commerce than it does with science.

One example of the challenges inherent in the commercial application of science is the misconception that predictive analytics can predict what is going to happen with certainty.  Instead, what it actually does is predict some of the possible things that could happen with a certain probability.  Although predictive analytics can be a valuable tool for many business activities, especially decision making, as Steve Miller recently blogged, most of us are not good at using probabilities to make decisions.

So, with apologies to Thomas Dolby, I can’t help but wonder, will big data be blinded by data science?  Will the business leaders being told to hire data scientists to derive business value from big data analytics be blind to what data science tries to show them?

 

This post was written as part of the IBM for Midsize Business program, which provides midsize businesses with the tools, expertise and solutions they need to become engines of a smarter planet.

 

Tuesday, April 3, 2012

The Data Governance Imperative

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

During this episode, Steve Sarsfield and I discuss how data governance is about changing the hearts and minds of your company to see the value of data quality, the characteristics of a data champion, and creating effective data quality scorecards.

Steve Sarsfield is a leading author and expert in data quality and data governance.  His book The Data Governance Imperative is a comprehensive exploration of data governance focusing on the business perspectives that are important to data champions, front-office employees, and executives.  He runs the Data Governance and Data Quality Insider, which is an award-winning and world-recognized blog.  Steve Sarsfield is the Product Marketing Manager for Data Governance and Data Quality at Talend.

 


Win a copy of the Book

Steve Sarsfield wants to give one OCDQ Radio listener a free copy of The Data Governance Imperative.

 

Here is how the book contest will work:

 

(1) Book Contest Question — Name at least one of the characteristics of a data champion that Steve Sarsfield described during this OCDQ Radio episode.

 

(2) Book Contest Deadline — On or before April 30, 2012, email Jim Harris with your answer to the book contest question.

 

(3) Book Contest Winner — In May 2012, one winner will be randomly selected from the emails containing the correct answer to the contest question, and Steve Sarsfield (or his publisher) will email the winner requesting a shipping address for the book.

 

Related Posts

Data Governance and Data Quality

MacGyver: Data Governance and Duct Tape

Data Governance Frameworks are like Jigsaw Puzzles

The Three Most Important Letters in Data Governance

Data Governance and the Adjacent Possible

Data Governance Star Wars: Balancing Bureaucracy and Agility

Beware the Data Governance Ides of March

Aristotle, Data Governance, and Lead Rulers

Data Governance and the Buttered Cat Paradox

The Data Governance Oratorio

Video: Declaration of Data Governance

The Collaborative Culture of Data Governance

 

Related OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • Data Governance Star Wars — Special Guests Rob Karel and Gwen Thomas joined this extended, and Star Wars themed, discussion about how to balance bureaucracy and business agility during the execution of data governance programs.
  • The Johari Window of Data Quality — Guest Martin Doyle discusses helping people better understand their data and assess its business impacts, not just the negative impacts of bad data quality, but also the positive impacts of good data quality.

Thursday, March 29, 2012

What is Weighing Down your Data?

On July 21, 1969, Neil Armstrong spoke the instantly famous words “that’s one small step for man, one giant leap for mankind” as he stepped off the ladder of the Apollo Lunar Module and became the first human being to walk on the surface of the Moon.

In addition to its many other, and more significant, scientific milestones, the Moon landing provided an excellent demonstration of three related, and often misunderstood, scientific concepts: mass, weight, and gravity.

Mass is an intrinsic property of matter, based on the atomic composition of a given object, such as your body, which means your mass would remain the same regardless of whether you were walking on the surface of the Moon or the Earth.

Weight is not an intrinsic property of matter, but is instead a gravitational force acting on matter.  Because the gravitational force of the Moon is less than the gravitational force of the Earth, you would weigh less on the Moon than you weigh on the Earth.  So, just like Neil Armstrong, your one small step on the surface of the Moon could quite literally become a giant leap.
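The arithmetic behind this is simply weight = mass × gravitational acceleration.  Here is a small Python sketch, assuming commonly cited approximate values for surface gravity (about 9.81 m/s² on the Earth and about 1.62 m/s² on the Moon) and a hypothetical person with a mass of 80 kg:

```python
# Approximate surface gravitational acceleration in m/s^2 (rounded values).
G_EARTH = 9.81
G_MOON = 1.62


def weight_newtons(mass_kg, g):
    """Weight is the gravitational force acting on a mass: W = m * g."""
    return mass_kg * g


mass = 80.0  # hypothetical person; the mass is the same in both places
on_earth = weight_newtons(mass, G_EARTH)
on_moon = weight_newtons(mass, G_MOON)

print(f"Mass: {mass} kg on both the Earth and the Moon")
print(f"Weight on the Earth: {on_earth:.1f} N")
print(f"Weight on the Moon: {on_moon:.1f} N")
print(f"Moon weight as a share of Earth weight: {on_moon / on_earth:.0%}")
```

The mass never changes between the two calls; only the gravitational term does, which is why the weight on the Moon comes out at roughly one-sixth of the weight on the Earth.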

Using these concepts metaphorically, mass is an intrinsic property of data, and perhaps a way to represent objective data quality, whereas weight is a gravitational force acting on data, and perhaps a way to represent subjective data quality.

Since most data cannot escape the gravity of its application, most of what we refer to as data silos are actually application silos because data and applications become tightly coupled due to the strong gravitational force that an application exerts on its data.

Now, of course, an application can exert a strong gravitational force for a strong business reason (e.g., protecting sensitive data), and not, as we often assume by default, for a weak business reason (e.g., protecting corporate political power).

Although you probably don’t view your applications as something that is weighing down your data, and you probably also resist the feeling of weightlessness that can be caused by openly sharing your data, it’s worth considering that whether or not your data truly enables your organization to take giant leaps, not just small steps, depends on the gravitational forces acting on your data.

What is weighing down your data could also be weighing down your organization.

 

Related Posts

Data Myopia and Business Relativity

Are Applications the La Brea Tar Pits for Data?

Hell is other people’s data

My Own Private Data

No Datum is an Island of Serendip

Turning Data Silos into Glass Houses

Sharing Data

The Data Outhouse

The Good Data

Beyond a “Single Version of the Truth”

Monday, March 26, 2012

Serving IT with a Side of Hash Browns

This blog post is sponsored by the Enterprise CIO Forum and HP.

Since it’s where I started my career, I often ponder what it would be like to work in the IT department today.  This morning, instead of sitting in a cubicle with no window view other than the one Bill Gates gave us, I’m sitting in a booth by a real window, albeit one with a partially obstructed view of the parking lot, at a diner eating a two-egg omelette with a side of hash browns.

But nowadays, it’s possible that I’m still sitting amongst my fellow IT workers.  Perhaps the older gentleman to my left is verifying last night’s database load using his laptop.  Maybe the younger woman to my right is talking into her Bluetooth earpiece with a business analyst working on an ad hoc report.  And the couple in the corner could be struggling to understand the technology requirements of the C-level executive they’re meeting with, who’s now vocalizing his displeasure about sitting in the high chair.

It’s possible that everyone thinks I am updating the status of an IT support ticket on my tablet based on the mobile text alert I just received.  Of course, it’s also possible that all of us are just eating breakfast while I’m also writing this blog post about IT.

However, as Joel Dobbs recently blogged, the IT times are a-changin’ — and faster than ever before since, thanks to the two-egg IT omelette of mobile technologies and cloud providers, IT no longer only happens in the IT department.  IT is everywhere now.

“There is a tendency to compartmentalize various types of IT,” Bruce Guptill recently blogged, “in order to make them more understandable and conform to budgeting practices.  But the core concept/theme/result of mobility really is ubiquity of IT — the same technology, services, and capabilities regardless of user and asset location.”

Regardless of how much you have embraced the consumerization of IT, some of your IT happens outside of your IT department, and some IT tasks are performed by people who not only don’t work in IT, but possibly don’t even work for your organization.

“While systems integration was once the big concern,” Judy Redman recently blogged, “today’s CIOs need to look to services integration.  Companies today need to obtain services from multiple vendors so that they can get best-of-breed solutions, cost efficiencies, and the flexibility needed to meet ever-changing and ever-more-demanding business needs.”

With its increasingly service-oriented and ubiquitous nature, it’s not too far-fetched to imagine that in the near future of IT, the patrons of a Wi-Fi-enabled diner could be your organization’s new IT department, serving your IT with a side of hash browns.

 

Related Posts

The IT Consumerization Conundrum

Shadow IT and the New Prometheus

A Swift Kick in the AAS

The UX Factor

Are Cloud Providers the Bounty Hunters of IT?

The Cloud Security Paradox

Are Applications the La Brea Tar Pits for Data?

Why does the sun never set on legacy applications?

The IT Pendulum and the Federated Future of IT

Suburban Flight, Technology Sprawl, and Garage IT

Thursday, March 22, 2012

Our Increasingly Data-Constructed World

Last week, I joined fellow Information Management bloggers Art Petty, Mark Smith, Bruce Guptill, and co-hosts Eric Kavanagh and Jim Ericson for a DM Radio discussion about the latest trends and innovations in the information management industry.

For my contribution to the discussion, I talked about the long-running macro trend underlying many trends and innovations, namely that our world is becoming, not just more data-driven, but increasingly data-constructed.

Physicist John Archibald Wheeler contemplated how the bit is a fundamental particle, which, although insubstantial, could be considered more fundamental than matter itself.  He summarized this viewpoint in his pithy phrase “It from Bit” explaining how: “every it — every particle, every field of force, even the space-time continuum itself — derives its function, its meaning, its very existence entirely — even if in some contexts indirectly — from the answers to yes-or-no questions, binary choices, bits.”

In other words, we could say that the physical world is conceived of in, and derived from, the non-physical world of data.

Although bringing data into the real world has historically also required constructing other physical things to deliver data to us, more of the things in the physical world are becoming directly digitized.  As just a few examples, consider how we’re progressing:

  • From audio delivered via vinyl records, audio tapes, CDs, and MP3 files (and other file formats) to Web-streaming audio
  • From video delivered via movie reels, video tapes, DVDs, and MP4 files (and other file formats) to Web-streaming video
  • From text delivered via printed newspapers, magazines, and books to websites, blogs, e-books, and other electronic texts

Furthermore, we continue to see more physical tools (e.g., calculators, alarm clocks, calendars, dictionaries) transforming into apps and data on our smartphones, tablets, and other mobile devices.  Essentially, in a world increasingly constructed of an invisible and intangible substance called data (perhaps the datum should be added to the periodic table of elements?), among the few things that we can still see and touch are the screens of our mobile devices, which make the invisible visible and the intangible tangible.

 

Bitrate, Lossy Audio, and Quantity over Quality

If our world is becoming increasingly data-constructed, does that mean people are becoming more concerned about data quality?

In a bit, 0.  In a word, no.  And that’s because, much to the dismay of those working in the data quality profession, most people do not care about the quality of their data unless it becomes bad enough for them to pay attention to — and complain about.

An excellent example is bitrate, which refers to the number of bits — or the amount of data — that are processed over a certain amount of time.  In his article Does Bitrate Really Make a Difference In My Music?, Whitson Gordon examined the common debate about lossless and lossy audio formats.

Using the example of ripping a track from a CD to a hard drive, a lossless format means that the track is not compressed to the point where any of its data is lost, retaining, for all intents and purposes, the same audio data quality as the original CD track.

By contrast, a lossy format compresses the track so that it takes up less space by intentionally removing some of its data, thereby reducing audio data quality.  Audiophiles often claim anything other than vinyl records sounds lousy because it is so lossy.
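To put rough numbers on the trade-off, file size is just bitrate multiplied by duration.  The Python sketch below compares uncompressed CD-quality audio (44,100 samples per second, 16 bits per sample, two channels) with a lossy file at an assumed 128 kbps.  The four-minute track length and the 128 kbps figure are illustrative assumptions, not values from the article mentioned above.

```python
# CD audio parameters: 44.1 kHz sample rate, 16 bits per sample, stereo.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def cd_bitrate_bps():
    """Bits per second for uncompressed CD-quality audio."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS


def size_megabytes(bitrate_bps, seconds):
    """Convert bitrate and duration to size: bits -> bytes -> megabytes (10^6 bytes)."""
    return bitrate_bps * seconds / 8 / 1_000_000


track_seconds = 4 * 60      # hypothetical four-minute track
mp3_bitrate_bps = 128_000   # assumed lossy bitrate (128 kbps)

cd_mb = size_megabytes(cd_bitrate_bps(), track_seconds)
mp3_mb = size_megabytes(mp3_bitrate_bps, track_seconds)

print(f"CD-quality bitrate: {cd_bitrate_bps():,} bits per second")
print(f"Uncompressed track: {cd_mb:.1f} MB")
print(f"128 kbps lossy track: {mp3_mb:.1f} MB")
print(f"Size ratio: about {cd_mb / mp3_mb:.0f} to 1")
```

Under these assumptions the lossy file is roughly one-eleventh the size of the uncompressed track, which is the quantity side of the quantity-versus-quality trade-off.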

However, like truth, beauty, and art, data quality can be said to be in the eyes — or the ears — of the beholder.  So, if your favorite music sounds good enough to you in MP3 file format, then not only do you not need those physical vinyl records, audio tapes, and CDs anymore, but since you consider MP3 files good enough, you will not pay any further attention to audio data quality.

Another, and less recent, example is the videotape format war waged during the 1970s and 1980s between Betamax and VHS, when Betamax was widely believed to provide superior video data quality.

But a blank Betamax tape allowed users to record up to two hours of high-quality video, whereas a VHS tape allowed users to record up to four hours of slightly lower quality video.  Consumers consistently chose quantity over quality, especially since lower quality also meant a lower price.  Betamax tapes and machines remained more expensive based on the assumption that consumers would pay a premium for higher quality video.

The VHS victory demonstrated how people often choose quantity over quality, so it doesn’t always pay to have better data quality.

 

Redefining Structure in a Data-Constructed World

Another side effect of our increasingly data-constructed world is that it is challenging the traditional data management notion that data has to be structured before it can be used — especially within many traditional notions of business intelligence.

Physicist Niels Bohr suggested that understanding the structure of the atom requires changing our definition of understanding.

Since a lot of the recent Big Data craze consists of unstructured or semi-structured data, perhaps understanding how much structure data truly requires for business applications (e.g., sentiment analysis of social networking data) requires changing our definition of structuring.  At the very least, we have to accept the fact that the relational data model is no longer our only option.
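As a rough illustration of using data without first forcing it into a fixed relational schema, the Python sketch below reads semi-structured records whose fields vary from one record to the next.  The records, the field names, and the tiny word-list “sentiment” score are all hypothetical, and a real sentiment analysis would be far more sophisticated.

```python
import json

# Hypothetical social networking posts: each record has a different set of fields.
raw_posts = """
[
  {"user": "ann", "text": "I love this product", "likes": 12},
  {"user": "bob", "text": "terrible support, bad experience", "location": "Boston"},
  {"text": "good value and great service", "tags": ["review"]}
]
"""

# Toy word lists, assumed purely for illustration.
POSITIVE = {"love", "good", "great"}
NEGATIVE = {"terrible", "bad"}


def naive_sentiment(text):
    """Count positive words minus negative words (toy scoring only)."""
    words = text.lower().replace(",", " ").split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


posts = json.loads(raw_posts)
for post in posts:
    # Missing fields are tolerated instead of being rejected by a schema.
    user = post.get("user", "(unknown)")
    score = naive_sentiment(post.get("text", ""))
    print(f"{user}: sentiment {score:+d}, fields present: {sorted(post)}")
```

No table definition is needed up front: each record carries whatever fields it happens to have, and the code only asks for the ones it needs.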

Although I often blog about how data and the real world are not the same thing, as more physical things, as well as more aspects of our everyday lives, become directly digitized, it is becoming more difficult to differentiate physical reality from digital reality.

 

Related Posts

HoardaBytes and the Big Data Lebowski

Magic Elephants, Data Psychics, and Invisible Gorillas

Big Data el Memorioso

The Big Data Collider

Information Overload Revisited

Dot Collectors and Dot Connectors

WYSIWYG and WYSIATI

Plato’s Data

The Data Cold War

A Farscape Analogy for Data Quality

 

Related OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • A Brave New Data World — A discussion about how data, data quality, data-driven decision making, and metadata quality no longer reside exclusively within the esoteric realm of data management — basically, everyone is a data geek now.
  • Data Quality and Big Data — Guest Tom Redman (aka the “Data Doc”) discusses Data Quality and Big Data, including if data quality matters less in larger data sets, and if statistical outliers represent business insights or data quality issues.