Capitalizing on Big Data Analytics

If you’re having trouble viewing this video, watch it on Vimeo via this link: Capitalizing on Big Data Analytics

The following links are to content featured in this video, as well as links to other related resources:


DQ-BE: Old Beer bought by Old Man

Data Quality By Example (DQ-BE) is an OCDQ regular segment that provides examples of data quality key concepts.

Over the weekend, in preparation for watching the Boston Red Sox, I bought some beer and pizza.  Later that night, after a thrilling victory that sent the Red Sox to the 2013 World Series, I was cleaning up the kitchen and was about to throw out the receipt when I couldn’t help but notice two data quality issues.

First, although I had purchased Samuel Adams Octoberfest, the receipt indicated I had bought Spring Ale, which, although still available in some places and still good beer, is three seasonal beers (Summer Ale, Winter Lager, Octoberfest) old.  This data quality issue impacts the store’s inventory and procurement systems (e.g., maybe the store orders more Spring Ale next year because people were apparently still buying it in October this year).

The second, and far more personal, data quality issue was that the age verification portion of my receipt indicated I was born on or before November 22, 1922, making me at least 91 years old!  While I am of the age (42) typical of a midlife crisis, I wasn’t driving a new red sports car, just wearing my old Red Sox sports jersey and hat.  As for the store, this data quality issue could be viewed as a regulatory compliance failure since it seems like their systems are set up by default to allow the sale of alcohol without proper age verification.  Additionally, this data quality issue might make it seem like their only alcohol-purchasing customers are very senior citizens.
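One way a store might catch this kind of default-value problem is a simple plausibility check on the age verification field.  The following is a minimal Python sketch, not the store’s actual system: the record layout (`born_on_or_before`, `sale_date`), the function names, and both thresholds (an implied age of 90 or more, and a single date accounting for more than half of all sales) are assumptions made purely for illustration.

```python
from collections import Counter
from datetime import date


def implied_age(born_on_or_before: date, sale_date: date) -> int:
    """Minimum age in whole years implied by a 'born on or before' date."""
    years = sale_date.year - born_on_or_before.year
    # Subtract one year if the birthday has not yet occurred in the sale year.
    if (sale_date.month, sale_date.day) < (born_on_or_before.month, born_on_or_before.day):
        years -= 1
    return years


def flag_suspect_age_checks(sales, max_plausible_age=90, max_share=0.5):
    """Return the indexes of sales whose age verification looks like a default.

    Each sale is a dict with hypothetical keys 'born_on_or_before' and
    'sale_date' (both datetime.date values).
    """
    counts = Counter(s["born_on_or_before"] for s in sales)
    suspects = []
    for i, s in enumerate(sales):
        # Rule 1: the implied customer age is implausibly high.
        too_old = implied_age(s["born_on_or_before"], s["sale_date"]) >= max_plausible_age
        # Rule 2: the same date appears on a suspiciously large share of sales.
        too_common = len(sales) > 1 and counts[s["born_on_or_before"]] / len(sales) > max_share
        if too_old or too_common:
            suspects.append(i)
    return suspects
```

Neither rule proves a cashier skipped the age check, but either one is a cheap signal that the field is being populated by a system default rather than by a real verification.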

 

What examples (good or poor) of data quality have you encountered?  Please share them by posting a comment below.

 

Related Posts

DQ-BE: The Time Traveling Gift Card

DQ-BE: Invitation to Duplication

DQ-BE: Dear Valued Customer

DQ-BE: Single Version of the Time

DQ-BE: Data Quality Airlines

Retroactive Data Quality

Sometimes Worse Data Quality is Better

Data Quality and the OK Plateau

When Poor Data Quality Kills

The Costs and Profits of Poor Data Quality

Secure the Engine to Your Business Future

People use mobile devices, as James Hailey Jr. blogged, “for almost everything they do in their day to day activities like listening to music, work, social applications, and calendar functions.  They allow people to immediately get information and access different resources.  In today’s world, there are more mobile devices than there have ever been in recent years and companies are just realizing the potential opportunities that exist.”

As Daniel Newman blogged, “cloud, mobile devices, Big Data, and social media have become a permanent fixture of today’s business.  From solopreneurs to global enterprises, companies are more connected than ever before to their customers, employees, shareholders, and stakeholders.  Enabled by connectivity and powered by the cloud, this is more than just Marketechture, this is the engine of our business future.”

“By embracing social tools in the cloud,” Rebecca Buisan blogged, “organizations can now attract new customers while at the same time better serve their existing clients, employees, and business partners.”

While cloud and mobile are enabling social business, it is not all blue skies and rainbows.  The age of the mobile device is still young, so as you embrace, with youthful exuberance, the convenience of the mobile-app-portal-to-the-cloud computing model, convenience should not trump security.

As Marissa Tejada blogged, despite your employees’ hands being full of business-enabling mobile devices, too few organizations are making sure mobility and security go hand in hand, especially when BYOD puts personal devices into business hands.

One example Allan Pratt blogged about is iOS 7’s AirDrop feature, which uses a combination of Bluetooth and Wi-Fi ad-hoc networks.  “The bottom line,” Pratt explained, “is that while AirDrop may sound like a good idea in theory, it needs more security embedded into it for data transfers to be considered.  For SMBs, this means you should be wary of new technology until it has been proven safe and effective for the enterprise.  You don’t want your data walking out the door without your knowledge.”

With big data providing the 1.21 gigawatts (often with a lot more than 1.21 gigabytes) of power, social, cloud, and mobile technology is the flux capacitor driving companies of all sizes forward to the future of business.  Just as lightning never strikes twice, you don’t want to end up looking back in time, second-guessing why you didn’t secure the engine to your business future.


Council Data Governance

Inspired by the great Eagles song Hotel California, this DQ-Song “sings” about the common mistake of convening a council too early when starting a new data governance program.  Now, of course, data governance is a very important and serious subject, which is why some people might question whether or not music is the best way to discuss data governance.

Although I understand that skepticism, I can’t help but recall the words of Frank Zappa:

“Information is not knowledge;

Knowledge is not wisdom;

Wisdom is not truth;

Truth is not beauty;

Beauty is not love;

Love is not music;

Music is the best.”

Council Data Governance

Down a dark deserted hallway, I walked with despair
As the warm smell of bagels rose up through the air
Up ahead in the distance, I saw a shimmering light
My head grew heavy and my sight grew dim
I had to attend another data governance council meeting
As I stood in the doorway
I heard the clang of the meeting bell

And I was thinking to myself
This couldn’t be heaven, but this could be hell
As stakeholders argued about the data governance way
There were voices down the corridor
I thought I heard them say . . .

Welcome to the Council Data Governance
Such a dreadful place (such a dreadful place)
Time crawls along at such a dreadful pace
Plenty of arguing at the Council Data Governance
Any time of year (any time of year)
You can hear stakeholders arguing there

Their agendas are totally twisted, with means to their own end
They use lots of pretty, pretty words, which I don’t comprehend
How they dance around the complex issues with sweet sounding threats
Some speak softly with remorse, some speak loudly without regrets

So I cried out to the stakeholders
Can we please reach consensus on the need for collaboration?
They said, we haven’t had that spirit here since nineteen ninety nine
And still those voices they’re calling from far away
Wake you up in the middle of this endless meeting
Just to hear them say . . .

Welcome to the Council Data Governance
Such a dreadful place (such a dreadful place)
Time crawls along at such a dreadful pace
They argue about everything at the Council Data Governance
And it’s no surprise (it’s no surprise)
To hear defending the status quo alibis

Bars on all of the windows
Rambling arguments, anything but concise
We are all just prisoners here
Of our own device
In the data governance council chambers
The bickering will never cease
They stab it with their steely knives
But they just can’t kill the beast

Last thing I remember, I was
Running for the door
I had to find the passage back
To the place I was before
Relax, said the stakeholders
We have been programmed by bureaucracy to believe
You can leave the council meeting any time you like
But success with data governance, you will never achieve!

 

More Data Quality Songs

Data Love Song Mashup

I’m Gonna Data Profile (500 Records)

A Record Named Duplicate

New Time Human Business

You Can’t Always Get the Data You Want

I’m Bringing DQ Sexy Back

Imagining the Future of Data Quality

The Very Model of a Modern DQ General

More Data Governance Posts

Beware the Data Governance Ides of March

Data Governance Star Wars: Bureaucracy versus Agility

Aristotle, Data Governance, and Lead Rulers

Data Governance needs Searchers, not Planners

Data Governance Frameworks are like Jigsaw Puzzles

Is DG a D-O-G?

The Hawthorne Effect, Helter Skelter, and Data Governance

Data Governance and the Buttered Cat Paradox

Total Information Risk Management

OCDQ Radio is an audio podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

During this episode, I am joined by special guest Dr. Alexander Borek, the inventor of Total Information Risk Management (TIRM) and the leading expert on how to apply risk management principles to data management.  Dr. Borek is a frequent speaker at international information management conferences and author of many research articles covering a range of topics, including EIM, data quality, crowdsourcing, and IT business value.  In his current role at IBM, Dr. Borek applies data analytics to drive IBM’s worldwide corporate strategy.  Previously, he led a team at the University of Cambridge to develop the TIRM process and test it in a number of different industries.  He holds a PhD in engineering from the University of Cambridge.

This podcast discusses his book Total Information Risk Management: Maximizing the Value of Data and Information Assets, which is now available worldwide and is a must-read for all data and information managers who want to understand and measure the implications of low-quality data and information assets.  The book provides step-by-step instructions, along with illustrative examples from studies in many different industries, on how to implement total information risk management, which will help your organization:

  • Learn how to manage data and information for business value.

  • Create powerful and convincing business cases for all your data and information management, data governance, big data, data warehousing, business intelligence, and business analytics initiatives, projects, and programs.

  • Protect your organization from risks that arise through poor data and information assets.

  • Quantify the impact of having poor data and information.

Popular OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • Demystifying Data Science — Guest Melinda Thielbar, a Ph.D. Statistician, discusses what a data scientist does and provides a straightforward explanation of key concepts such as signal-to-noise ratio, uncertainty, and correlation.
  • Data Quality and Big Data — Guest Tom Redman (aka the “Data Doc”) discusses Data Quality and Big Data, including whether data quality matters less in larger data sets, and whether statistical outliers represent business insights or data quality issues.
  • Demystifying Master Data Management — Guest John Owens explains the three types of data (Transaction, Domain, Master), the four master data entities (Party, Product, Location, Asset), and the Party-Role Relationship, which is where we find many of the terms commonly used to describe the Party master data entity (e.g., Customer, Supplier, Employee).
  • Data Governance Star Wars — Special Guests Rob Karel and Gwen Thomas joined this extended, and Star Wars themed, discussion about how to balance bureaucracy and business agility during the execution of data governance programs.
  • The Johari Window of Data Quality — Guest Martin Doyle discusses helping people better understand their data and assess its business impacts, not just the negative impacts of bad data quality, but also the positive impacts of good data quality.
  • Data Profiling Early and Often — Guest James Standen discusses data profiling concepts and practices, and how bad data is often misunderstood and can be coaxed away from the dark side if you know how to approach it.
  • Studying Data Quality — Guest Gordon Hamilton discusses the key concepts from recommended data quality books, including those which he has implemented in his career as a data quality practitioner.

Data Storage for Midsize Businesses

If you’re having trouble viewing this video, watch it on Vimeo via this link: Data Storage for Midsize Businesses

The following links are to the infographic featured in this video, as well as links to other related resources:


Is DG a D-O-G?


Convincing your organization to invest in a sustained data quality program implemented within a data governance framework can be a very difficult task requiring an advocate with a championship pedigree.  But sometimes it seems like no matter how persuasive your sales pitch is, even when your presentation is judged best in show, it appears to fall on deaf ears.

Perhaps, data governance (DG) is a D-O-G.  In other words, maybe the DG message is similar to a sound only dogs can hear.

Galton’s Whistle

In the late 19th century, Francis Galton developed a whistle (now more commonly called a dog whistle), which he used to test the range of frequencies that could be heard by various animals.  Galton was conducting experiments on human faculties, including the range of human hearing.  Although that was not its intended purpose, Galton’s whistle is used today by dog trainers.  By varying its frequency, trainers make the whistle emit a sound (inaudible to humans) that is used either simply to get a dog’s attention or to inflict pain for the purpose of correcting undesirable behavior.

Bad Data, Bad, Bad Data!

Many organizations do not become aware of the importance of data governance until poor data quality repeatedly “bites” critical business decisions.  Typically following a very nasty bite, executives scream “bad data, bad, bad data!” without stopping to realize the enterprise’s poor data management practices unleashed the perpetually bad data now running amuck within their systems.

For these organizations, advocacy of proactive defect prevention was an inaudible sound, and now the executives blow harshly into their data whistle and demand a one-time data cleansing project to correct the current data quality problems.

However, even after the project is over, it’s often still a doggone crazy data world.

The Data Whisperer

Executing disconnected one-off projects to deal with data issues when they become too big to ignore doesn’t work because it doesn’t identify and correct the root causes of data’s bad behavior.  By advocating root cause analysis and business process improvement, data governance can essentially be understood as The Data Whisperer.

Data governance defines policies and procedures for aligning data usage with business metrics, establishes data stewardship, prioritizes data quality issues, and facilitates collaboration among all of the business and technical stakeholders.

Data governance enables enterprise-wide data quality by combining data cleansing (which will still occasionally be necessary) and defect prevention into a hybrid discipline, which will result in you hearing everyday tales about data so well behaved that even your executives’ tails will be wagging.

Data’s Best Friend

Without question, data governance is very disruptive to an organization’s status quo.  It requires patience, understanding, and dedication because it involves a strategic enterprise-wide transformation that doesn’t happen overnight.

However, data governance is also data’s best friend. 

And in order for your organization to be successful, you have to realize that data is also your best friend.  Data governance will help you take good care of your data, which in turn will take good care of your business.

Basically, the success of your organization comes down to a very simple question — Are you a DG person?

Data is a Game Changer


Nowadays we hear a lot of chatter, rather reminiscent of the boisterous bluster of sports talk radio debates, about the potential of big data and its related technologies to enable predictive and real-time analytics and, by leveraging an infrastructure provided by the symbiotic relationship of cloud and mobile, serve up better business performance and an enhanced customer experience.

Sports have always provided great fodder for the data-obsessed with their treasure troves of statistical data dissecting yesterday’s games down to the most minute detail, which are called upon by experts and amateurs alike to try to predict tomorrow’s games as well as analyze in real time the play-by-play of today’s games.  Arguably, it was the bestselling book Moneyball by Michael Lewis, which was also adapted into a popular movie starring Brad Pitt, that brought data obsession to the masses, further fueling the hype and overuse of sports metaphors such as how data can be a game changer for businesses in any industry and of any size.

The Future is Now Playing on Center Court

Which is why it is so refreshing to see a tangible real-world case study for big data analytics being delivered with the force of an Andy Murray two-handed backhand as over the next two weeks the United States Tennis Association (USTA) welcomes hundreds of thousands of spectators to New York City’s Flushing Meadows for the 2013 U.S. Open tennis tournament.  Both the fans in the stands and the millions more around the world will visit USOpen.org, via the web or mobile apps, in order to follow the action, watch live-streamed tennis matches, and get scores, stats, and the latest highlights and news thanks to IBM technologies.

Before, during, and after each match, predictive and real-time analytics drive IBM’s SlamTracker tool.  Before matches, IBM analyzes 41 million data points collected from eight years of Grand Slam play, including head-to-head matches, similar player types, and playing surfaces.  SlamTracker uses this data to create engaging and compelling tools for digital audiences, which identify key actions players must take to enhance their chances of winning, and give fans player information, match statistics, social sentiment, and more.

The infrastructure that supports the U.S. Open’s digital presence is hosted on an IBM SmartCloud.  This flexible, scalable environment, managed by IBM Analytics, lets the USTA ensure continuous availability of their digital platforms throughout the tournament and year-round.  The USTA and IBM give fans the ability to experience the matches from anywhere, with any device via a mobile-friendly site and engaging apps for multiple mobile platforms.  Together these innovations make the U.S. Open experience immediate and intimate for fans sitting in the stands or on another continent.

Better Service, More Winners, and Fewer Unforced Errors

In tennis, a service (also known as a serve) is a shot to start a point.  In business, a service is a shot to start a point of positive customer interaction, whether that’s a point of sale or an opportunity to serve a customer’s need (e.g., resolving a complaint).

In tennis, a winner is a shot not reached by your opponent, which wins you a point.  In business, a winner is a differentiator not reached by your competitor, which wins your business a sale when it makes a customer choose your product or service.

In tennis, an unforced error is a failure to complete a service or return a shot, which cannot be attributed to any factor other than poor judgement or execution by the player.  In business, an unforced error is a failure to service a customer or get a return on an investment, which cannot be attributed to any factor other than poor decision making or execution by the organization.

Properly supported by enabling technologies, businesses of all sizes, and across all industries, can capture and analyze data to uncover hidden patterns and trends that can help them achieve better service, more winners, and fewer unforced errors.

How can Data change Your Game?

Whether it’s on the court, in the stands, on the customer-facing front lines, in the dashboards used by executive management, or behind the scenes of a growing midsize business, data is a game changer.  How can data change your game?
