Demystifying Master Data Management

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

During this episode, special guest John Owens and I attempt to demystify master data management (MDM) by explaining the three types of data (Transaction, Domain, Master) and the four master data entities (Party, Product, Location, Asset), as well as perhaps the most important concept of all, the Party-Role Relationship, which is the source of many of the terms commonly used to describe the Party master data entity (e.g., Customer, Supplier, Employee).
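
To make the Party-Role Relationship a little more concrete, here is a minimal, hypothetical sketch (not John Owens's IMM notation; the entity and attribute names are invented for illustration) of how a party-role model might look:

```python
from dataclasses import dataclass

@dataclass
class Party:
    """A master data entity: a person or organization that exists independently of any role."""
    party_id: int
    name: str

@dataclass
class PartyRole:
    """The Party-Role Relationship: Customer, Supplier, and Employee are roles a Party plays,
    not separate kinds of master data."""
    party: Party
    role_type: str  # e.g., "Customer", "Supplier", "Employee"

# One real-world party can play several roles at the same time.
acme = Party(party_id=1, name="Acme Holdings Ltd")
acme_roles = [
    PartyRole(party=acme, role_type="Customer"),
    PartyRole(party=acme, role_type="Supplier"),
]

print([role.role_type for role in acme_roles])  # ['Customer', 'Supplier']
```

The point of the sketch is simply that "Customer" describes a relationship between a Party and the enterprise, not a different kind of master data entity.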

John Owens is a thought leader, consultant, mentor, and writer in the worlds of business and data modelling, data quality, and master data management (MDM).  He has built an international reputation as a highly innovative specialist in these areas and has worked in and led multi-million dollar projects in a wide range of industries around the world.

John Owens has a gift for identifying the underlying simplicity in any enterprise, even when shrouded in complexity, and bringing it to the surface.  He is the creator of the Integrated Modelling Method (IMM), which is used by business and data analysts around the world.  Later this year, John Owens will be formally launching the IMM Academy, which will provide high quality resources, training, and mentoring for business and data analysts at all levels.

You can also follow John Owens on Twitter and connect with John Owens on LinkedIn.  And if you’re looking for an MDM course, consider the online course from John Owens, which you can find by clicking on this link: MDM Online Course (Affiliate Link)

Popular OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • Demystifying Data Science — Guest Melinda Thielbar, a Ph.D. Statistician, discusses what a data scientist does and provides a straightforward explanation of key concepts such as signal-to-noise ratio, uncertainty, and correlation.
  • Data Quality and Big Data — Guest Tom Redman (aka the “Data Doc”) discusses Data Quality and Big Data, including if data quality matters less in larger data sets, and if statistical outliers represent business insights or data quality issues.
  • Data Governance Star Wars — Special Guests Rob Karel and Gwen Thomas joined this extended, and Star Wars themed, discussion about how to balance bureaucracy and business agility during the execution of data governance programs.
  • The Johari Window of Data Quality — Guest Martin Doyle discusses helping people better understand their data and assess its business impacts, not just the negative impacts of bad data quality, but also the positive impacts of good data quality.
  • Studying Data Quality — Guest Gordon Hamilton discusses the key concepts from recommended data quality books, including those which he has implemented in his career as a data quality practitioner.

Commendable Comments (Part 13)

Welcome to the 400th Obsessive-Compulsive Data Quality (OCDQ) blog post!  I am commemorating this milestone with the 13th entry in my ongoing series for expressing gratitude to my readers for their truly commendable comments on my blog posts.

 

Commendable Comments

On Will Big Data be Blinded by Data Science?, Meta Brown commented:

“Your concern is well-founded. Knowing how few businesses make really good use of the small data they’ve had around all along, it’s easy to imagine that they won’t do any better with bigger data sets.

I wrote some hints for those wallowing into the big data mire in my post, Better than Brute Force: Big Data Analytics Tips. But the truth is that many organizations won’t take advantage of the ideas that you are presenting, or my tips, especially as the datasets grow larger. That’s partly because they have no history in scientific methods, and partly because the data science movement is driving employers to search for individuals with heroically large skill sets.

Since few, if any, people truly meet these expectations, those hired will have real human limitations, and most often they will be people who know much more about data storage and manipulation than data analysis and applications.”

On Will Big Data be Blinded by Data Science?, Mike Urbonas commented:

“The comparison between scientific inquiry and business decision making is a very interesting and important one. Successfully serving a customer and boosting competitiveness and revenue does require some (hopefully unique) insights into customer needs. Where do those insights come from?

Additionally, scientists also never stop questioning and improving upon fundamental truths, which I also interpret as not accepting conventional wisdom — obviously an important trait of business managers.

I recently read commentary that gave high praise to the manager utilizing the scientific method in his or her decision-making process. The author was not a technologist, but rather none other than Peter Drucker, in writings from decades ago.

I blogged about Drucker’s commentary, data science, the scientific method vs. business decision making, and I’d value your and others’ input: Business Managers Can Learn a Lot from Data Scientists.”

On Word of Mouth has become Word of Data, Vish Agashe commented:

“I would argue that listening to not only customers but also business partners is very important (and not only in retail but in any business). I always say that, even if as an organization you are not active in the social world, assume that your customers, suppliers, employees, competitors are active in the social world and they will talk about you (as a company), your people, products, etc.

So it is extremely important to tune in to those conversations and evaluate their impact on your business. A dear friend of mine ventured into the restaurant business a few years back. He experienced a little bit of a slowdown in his business after a great start. He started surveying his customers, brought in food critics to evaluate if the food was a problem, but he could not figure out what was going on. I accidentally stumbled upon Yelp.com and noticed that his restaurant’s rating had dropped and there were some complaints recently about services and cleanliness (nothing major though).

This happened because he had turnover in his front desk staff. He was able to address those issues and was able to reach out to customers who had bad experience (some of them were frequent visitors). They were able to go back and comment and give newer ratings to his business. This helped him with turning the corner and helped with the situation.

This was a big learning moment for me about the power of social media and the need for monitoring it.”

On Data Quality and the Bystander Effect, Jill Wanless commented:

“Our organization is starting to develop data governance processes and one of the processes we have deliberately designed is to get to the root cause of data quality issues.

We’ve designed it so that the errors that are reported also include the userid and the system where the data was generated. Errors are then filtered by function and the business steward responsible for that function is the one who is responsible for determining and addressing the root cause (which of course may require escalation to solve).

The business steward for the functional area has the most at stake in the data and is typically the most knowledgeable as to the process or system that may be triggering the error. We have yet to test this as we are currently in the process of deploying a pilot stewardship program.

However, we are very confident that it will help us uncover many of the causes of the data quality problems and with lots of PLAN, DO, CHECK, and ACT, our goal is to continuously improve so that our need for stewardship eventually (many years away no doubt) is reduced.”

On The Return of the Dumb Terminal, Prashanta Chandramohan commented:

“I can’t even imagine what it’s like to use this iPad I own now if I am out of network for an hour. Supposedly the coolest thing to own and a breakthrough innovation of this decade as some put it, it’s nothing but a dumb terminal if I do not have 3G or Wi-Fi connectivity.

Putting most of my documents, notes, to-do’s, and bookmarked blogs for reading later (e.g., Instapaper) in the cloud, I am sure to avoid duplicating data and eliminate installing redundant applications.

(Oops! I mean the apps! :) )

With cloud-based MDM and Data Quality tools starting to linger, I can’t wait to explore and utilize the advantages these return of dumb terminals bring to our enterprise information management field.”

On Big Data Lessons from Orbitz, Dylan Jones commented:

“The fact is that companies have always done predictive marketing, they’re just getting smarter at it.

I remember living as a student in a fairly downtrodden area that because of post code analytics meant I was bombarded with letterbox mail advertising crisis loans to consolidate debts and so on. When I got my first job and moved to a new area all of a sudden I was getting loans to buy a bigger car. The companies were clearly analyzing my wealth based on post code lifestyle data.

Fast forward and companies can do way more as you say.

Teresa Cottam (Global Telecoms Analyst) has cited the big telcos as a major driver in all this, they now consider themselves data companies so will start to offer more services to vendors to track our engagement across the entire communications infrastructure (Read more here: http://bit.ly/xKkuX6).

I’ve just picked up a shiny new Mac this weekend after retiring my long suffering relationship with Windows so it will be interesting to see what ads I get served!”

And please check out all of the commendable comments received on the blog post: Data Quality and Chicken Little Syndrome.

 

Thank You for Your Comments and Your Readership

You are Awesome — which is why receiving your comments has been the most rewarding aspect of my blogging experience over the last 400 posts.  Even if you have never posted a comment, you are still awesome — feel free to tell everyone I said so.

This entry in the series highlighted commendable comments on blog posts published between April 2012 and June 2012.

Since there have been so many commendable comments, please don’t be offended if one of your comments wasn’t featured.

Please continue commenting and stay tuned for future entries in the series.

Thank you for reading the Obsessive-Compulsive Data Quality blog.  Your readership is deeply appreciated.

 

Related Posts

Commendable Comments (Part 12) – The Third Blogiversary of OCDQ Blog

Commendable Comments (Part 11)

Commendable Comments (Part 10) – The 300th OCDQ Blog Post

730 Days and 264 Blog Posts Later – The Second Blogiversary of OCDQ Blog

OCDQ Blog Bicentennial – The 200th OCDQ Blog Post

Commendable Comments (Part 9)

Commendable Comments (Part 8)

Commendable Comments (Part 7)

Commendable Comments (Part 6)

Commendable Comments (Part 5) – The 100th OCDQ Blog Post

Commendable Comments (Part 4)

Commendable Comments (Part 3)

Commendable Comments (Part 2)

Commendable Comments (Part 1)

The Cloud is shifting our Center of Gravity

This blog post is sponsored by the Enterprise CIO Forum and HP.

With more organizations embracing cloud computing and cloud-based services, and some analysts even predicting that personal clouds will soon replace personal computers, the cloudy future of our data has been weighing on my mind.

I recently discovered the website DataGravity.org, which contains many interesting illustrations and formulas about data gravity, a concept which Dave McCrory blogged about in his December 2010 post Data Gravity in the Clouds.

“Consider data as if it were a planet or other object with sufficient mass,” McCrory wrote.  “As data accumulates (builds mass) there is a greater likelihood that additional services and applications will be attracted to this data.  This is the same effect gravity has on objects around a planet.  As the mass or density increases, so does the strength of gravitational pull.  As things get closer to the mass, they accelerate toward the mass at an increasingly faster velocity.”
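
For readers who want the physics behind the metaphor, the relationship McCrory is borrowing is Newton's law of universal gravitation (his own data gravity formulas are the ones collected on DataGravity.org):

```latex
F = G \, \frac{m_1 m_2}{r^2}
% F        : the attractive force (the pull that services and applications feel toward the data)
% m_1, m_2 : the two masses (for the metaphor, the accumulated data and a service or application)
% r        : the distance between them; as the masses grow or the distance shrinks, F increases
% G        : the gravitational constant
```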

In my blog post What is Weighing Down your Data?, I explained the often misunderstood difference between mass, which is an intrinsic property of matter based on atomic composition, and weight, which is a gravitational force acting on matter.  By using these concepts metaphorically, we could say that mass is an intrinsic property of data, representing objective data quality, and weight is a gravitational force acting on data, representing subjective data quality.

I used a related analogy in my blog post Quality is the Higgs Field of Data.  By using data, we give data its quality, i.e., its mass.  We give data mass so that it can become the basic building blocks of what matters to us.

Historically, most of what we referred to as data silos were actually application silos because data and applications became tightly coupled due to the strong gravitational force that legacy applications exerted, preventing most data from achieving the escape velocity needed to free itself from an application.  But the laudable goal of storing your data in one easily accessible place, and then building services and applications around your data, is one of the fundamental value propositions of cloud computing.

With data accumulating in the cloud, as McCrory explained, although “services and applications have their own gravity, data is the most massive and dense, therefore it has the most gravity.  Data, if large enough, can be virtually impossible to move.”

The cloud is shifting our center of gravity because of the data gravitational field emitted by the massive amount of data being stored in the cloud.  The information technology universe, business world, and our personal (often egocentric) solar systems are just beginning to feel the effects of this massive gravitational shift.

This blog post is sponsored by the Enterprise CIO Forum and HP.

 

Related Posts

Quality is the Higgs Field of Data

What is Weighing Down your Data?

A Swift Kick in the AAS

Lightning Strikes the Cloud

The Partly Cloudy CIO

Are Cloud Providers the Bounty Hunters of IT?

The Cloud Security Paradox

Are Applications the La Brea Tar Pits for Data?

Why does the sun never set on legacy applications?

The Good, the Bad, and the Secure

The Return of the Dumb Terminal

The UX Factor

Sometimes all you Need is a Hammer

Shadow IT and the New Prometheus

The Diffusion of the Consumerization of IT

DQ-View: The Five Stages of Data Quality

Data Quality (DQ) View is a regular OCDQ segment.  Each DQ-View is a brief video discussion of a key data quality concept.

In my experience, all organizations cycle through five stages while coming to terms with the daunting challenges of data quality, which are somewhat similar to The Five Stages of Grief.  So, in this short video, I explain The Five Stages of Data Quality:

  1. Denial — Our organization is well-managed and highly profitable.  We consistently meet, or exceed, our business goals.  We obviously understand the importance of high-quality data.  Data quality issues can’t possibly be happening to us.
  2. Anger — We’re now in the midst of a financial reporting scandal, and facing considerable fines in the wake of a regulatory compliance failure.  How can this be happening to us?  Why do we have data quality issues?  Who is to blame for this?
  3. Bargaining — Okay, we may have just overreacted a little bit.  We’ll purchase a data quality tool, approve a data cleansing project, implement defect prevention, and initiate data governance.  That will fix all of our data quality issues — right?
  4. Depression — Why, oh why, do we keep having data quality issues?  Why does this keep happening to us?  Maybe we should just give up, accept our doomed fate, and not bother doing anything at all about data quality and data governance.
  5. Acceptance — We can’t fight the truth anymore.  We accept that we have to do the hard daily work of continuously improving our data quality and continuously implementing our data governance principles, policies, and procedures.

Quality is the Higgs Field of Data

Recently on Twitter, Daragh O Brien replied to my David Weinberger quote “The atoms of data hook together only because they share metadata,” by asking “So, is Quality Data the Higgs Boson of Information Management?”

I responded that Quality is the Higgs Boson of Data and Information since Quality gives Data and Information their Mass (i.e., their Usefulness).

“Now that is profound,” Daragh replied.

“That’s cute and all,” Brian Panulla interjected, “but you can’t measure Quality.  Mass is objective.  It’s more like Weight — a mass in context.”

I agreed with Brian’s great point since in a previous post I explained the often misunderstood difference between mass, an intrinsic property of matter based on atomic composition, and weight, a gravitational force acting on matter.

Using these concepts metaphorically, mass is an intrinsic property of data, representing objective data quality, whereas weight is a gravitational force acting on data, thereby representing subjective data quality.

But my previous post didn’t explain where matter theoretically gets its mass, and since this scientific mystery was radiating in the cosmic background of my Twitter banter with Daragh and Brian, I decided to use this post to attempt a brief explanation along the way to yet another data quality analogy.

As you have probably heard by now, big scientific news was recently reported about the discovery of the Higgs Boson, which, since the 1960s, the Standard Model of particle physics has theorized to be the fundamental particle associated with a ubiquitous quantum field (referred to as the Higgs Field) that gives all matter its mass by interacting with the particles that make up atoms and weighing them down.  This is foundational to our understanding of the universe because without something to give mass to the basic building blocks of matter, everything would behave the same way as the intrinsically mass-less photons of light behave, floating freely and not combining with other particles.  Therefore, without mass, ordinary matter, as we know it, would not exist.

 

Ping-Pong Balls and Maple Syrup

I like the Higgs Field explanation provided by Brian Cox and Jeff Forshaw.  “Imagine you are blindfolded, holding a ping-pong ball by a thread.  Jerk the string and you will conclude that something with not much mass is on the end of it.  Now suppose that instead of bobbing freely, the ping-pong ball is immersed in thick maple syrup.  This time if you jerk the thread you will encounter more resistance, and you might reasonably presume that the thing on the end of the thread is much heavier than a ping-pong ball.  It is as if the ball is heavier because it gets dragged back by the syrup.”

“Now imagine a sort of cosmic maple syrup that pervades the whole of space.  Every nook and cranny is filled with it, and it is so pervasive that we do not even know it is there.  In a sense, it provides the backdrop to everything that happens.”

Mass is therefore generated as a result of an interaction between the ping-pong balls (i.e., atomic particles) and the maple syrup (i.e., the Higgs Field).  However, although the Higgs Field is pervasive, it is also variable and selective, since some particles are affected by the Higgs Field more than others, and photons pass through it unimpeded, thereby remaining mass-less particles.

 

Quality — Data Gets Higgy with It

Now that I have vastly oversimplified the Higgs Field, let me Get Higgy with It by attempting an analogy for data quality based on the Higgs Field.  As I do, please remember the wise words of Karen Lopez: “All analogies are perfectly imperfect.”

Quality provides the backdrop to everything that happens when we use data.  Data in the wild, independent from use, is as carefree as the mass-less photon whizzing around at the speed of light, like a ping-pong ball bouncing along without a trace of maple syrup on it.  But once we interact with data using our sticky-maple-syrup-covered fingers, data begins to slow down, begins to feel the effects of our use.  We give data mass so that it can become the basic building blocks of what matters to us.

Some data is affected more by our use than others.  The more subjective our use, the more we weigh data down.  The more objective our use, the less we weigh data down.  Sometimes, we drag data down deep into the maple syrup, covering data up with an application layer, or bottling data into silos.  Other times, we keep data in the shallow end of the molasses swimming pool.

Quality is the Higgs Field of Data.  As users of data, we are the Higgs Bosons — we are the fundamental particles associated with a ubiquitous data quality field.  By using data, we give data its quality.  The quality of data cannot be separated from its use any more than the particles of the universe can be separated from the Higgs Field.

The closest data equivalent of a photon, a ping-pong ball particle that doesn’t get stuck in the maple syrup of the Higgs Field, is Open Data, which doesn’t get stuck within silos, but is instead data freely shared without the sticky quality residue of our use.

 

Related Posts

Our Increasingly Data-Constructed World

What is Weighing Down your Data?

Data Myopia and Business Relativity

Redefining Data Quality

Are Applications the La Brea Tar Pits for Data?

Swimming in Big Data

Sometimes it’s Okay to be Shallow

Data Quality and Big Data

Data Quality and the Q Test

My Own Private Data

No Datum is an Island of Serendip

Sharing Data

Shining a Social Light on Data Quality

Last week, when I published my blog post Lightning Strikes the Cloud, I unintentionally demonstrated three important things about data quality.

The first thing I demonstrated was that even an obsessive-compulsive data quality geek is capable of data defects, since I initially published the post with the title Lightening Strikes the Cloud, which is an excellent example, courtesy of the Cupertino Effect, of the difference between validity and accuracy, since although lightening is valid (i.e., a correctly spelled word), it isn’t contextually accurate.
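
As a rough illustration of that validity-versus-accuracy distinction, here is a minimal, hypothetical sketch; the word list and the context rule are invented purely for the example:

```python
# A validity check only asks whether a value is well-formed (e.g., a correctly spelled word).
DICTIONARY = {"lightning", "lightening", "strikes", "the", "cloud"}

def is_valid(word: str) -> bool:
    return word.lower() in DICTIONARY

# An accuracy check asks whether the value is the right one in this context.
# The context rule here is invented purely for the example.
def is_accurate(word: str, context: str) -> bool:
    if "strikes the cloud" in context.lower():
        return word.lower() == "lightning"
    return True

title_word = "Lightening"
context = "Strikes the Cloud"

print(is_valid(title_word))              # True  -- a correctly spelled word, so it passes validity
print(is_accurate(title_word, context))  # False -- but it is the wrong word in this context
```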

The second thing I demonstrated was the value of shining a social light on data quality — the value of using collaborative tools like social media to crowd-source data quality improvements.  Thankfully, Julian Schwarzenbach quickly noticed my error on Twitter.  “Did you mean lightning?  The concept of lightening clouds could be worth exploring further,” Julian humorously tweeted.  “Might be interesting to consider what happens if the cloud gets so light that it floats away.”  To which I replied that if the cloud gets so light that it floats away, it could become Interstellar Computing or, as Julian suggested, the start of the Intergalactic Net, which I suppose is where we will eventually have to store all of that big data we keep hearing so much about these days.

The third thing I demonstrated was the potential dark side of data cleansing, since the only remaining trace of my data defect is a broken URL.  This is an example of not providing a well-documented audit trail, which is necessary within an organization to communicate data quality issues and resolutions.

Communication and collaboration are essential to finding our way with data quality.  And social media can help us by providing more immediate and expanded access to our collective knowledge, experience, and wisdom, and by shining a social light that illuminates the shadows cast upon data quality issues when a perception filter or bystander effect gets the better of our individual attention or undermines our collective best intentions — which, as I recently demonstrated, occasionally happens to all of us.

 

Related Posts

Data Quality and the Cupertino Effect

Are you turning Ugly Data into Cute Information?

The Importance of Envelopes

The Algebra of Collaboration

Finding Data Quality

The Wisdom of the Social Media Crowd

Perception Filters and Data Quality

Data Quality and the Bystander Effect

The Family Circus and Data Quality

Data Quality and the Q Test

Metadata, Data Quality, and the Stroop Test

The Three Most Important Letters in Data Governance

Lightning Strikes the Cloud

This blog post is sponsored by the Enterprise CIO Forum and HP.

Recent bad storms in the United States caused power outages as well as outages of a different sort for some of the companies relying on cloud computing and cloud-based services.  As the poster child for cloud providers, Amazon Web Services always makes headlines when it suffers a major outage, as it did last Friday when its Virginia cloud computing facility was struck by lightning, an incident which John Dodge examined in his recent blog post: Has Amazon's cloud grown too big, too fast?

Another thing that commonly coincides with a cloud outage is pondering the nebulous definition of “the cloud.”

In his recent article for The Washington Post, How a storm revealed the myth of the ‘cloud’, Dominic Basulto pondered “of all the metaphors and analogies used to describe the Internet, perhaps none is less understood than the cloud.  A term that started nearly a decade ago to describe pay-as-you-go computing power and IT infrastructure-for-rent has crossed over to the consumer realm.  It’s now to the point where many of the Internet’s most prolific companies make it a key selling point to describe their embrace of the cloud.  The only problem, as we found out this weekend, is that there really isn’t a ‘cloud’ – there’s a bunch of rooms with servers hooked up with wires and tubes.”

One of the biggest benefits of cloud computing, especially for many small businesses and start-up companies, is that it provides an organization with the ability to focus on its core competencies, allowing non-IT companies to be more business-focused.

As Basulto explained, “instead of having to devote resources and time to figuring out the computing back-end, young Internet companies like Instagram and Pinterest could concentrate on hiring the right people and developing business models worth billions.  Hooking up to the Internet became as easy as plugging into the local electricity provider, even as users uploaded millions of photos or streamed millions of videos at a time.”

But these benefits are not just for Internet companies.  In his book The Big Switch: Rewiring the World, from Edison to Google, Nicholas Carr used the history of electric grid power utilities as a backdrop and analogy for examining the potential benefits that all organizations can gain from adopting Internet-based utility (i.e., cloud) computing.

The benefits of a utility, however, whether it’s electricity or cloud computing, can only be realized if the utility operates reliably.

“A temporary glitch while watching a Netflix movie is annoying,” Basulto noted, but “imagine what happens when there’s a cloud outage that affects airports, hospitals, or yes, the real-world utility grid.”  And so, whenever any utility suffers an outage, it draws attention to something we’ve become dependent on — but, in fairness, it’s also something we take for granted when it’s working.

“Maybe the late Alaska Senator Ted Stevens was right,” Basulto concluded, “maybe the Internet really is a series of tubes rather than a cloud.  If so, the company with the best plumbing wins.”  A few years ago, I published a satirical post about the cloud, which facetiously recommended that instead of beaming your data up into the cloud, bury your data down underground.

However, if plumbing, not electricity, is the better metaphor for cloud computing infrastructure, then perhaps cloud providers should start striking ground on subterranean data centers built deep enough to prevent lightning from striking the cloud again.

This blog post is sponsored by the Enterprise CIO Forum and HP.

 

Related Posts

The Partly Cloudy CIO

Are Cloud Providers the Bounty Hunters of IT?

The Cloud Security Paradox

The Good, the Bad, and the Secure

The Return of the Dumb Terminal

A Swift Kick in the AAS

The UX Factor

Sometimes all you Need is a Hammer

Shadow IT and the New Prometheus

The Diffusion of the Consumerization of IT

Saving Private Data

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

This episode is an edited rebroadcast of a segment from the OCDQ Radio 2011 Year in Review, during which Daragh O Brien and I discuss the data privacy and data protection implications of social media, cloud computing, and big data.

Daragh O Brien is one of Ireland’s leading Information Quality and Governance practitioners.  After being born at a young age, Daragh has amassed a wealth of experience in quality information driven business change, from CRM Single View of Customer to Regulatory Compliance, to Governance and the taming of information assets to benefit the bottom line, manage risk, and ensure customer satisfaction.  Daragh O Brien is the Managing Director of Castlebridge Associates, one of Ireland’s leading consulting and training companies in the information quality and information governance space.

Daragh O Brien is a founding member and former Director of Publicity for the IAIDQ, which he is still actively involved with.  He was a member of the team that helped develop the Information Quality Certified Professional (IQCP) certification and he recently became the first person in Ireland to achieve this prestigious certification.

In 2008, Daragh O Brien was awarded a Fellowship of the Irish Computer Society for his work in developing and promoting standards of professionalism in Information Management and Governance.

Daragh O Brien is a regular conference presenter, trainer, blogger, and author with two industry reports published by Ark Group, the most recent of which is The Data Strategy and Governance Toolkit.

You can also follow Daragh O Brien on Twitter and connect with Daragh O Brien on LinkedIn.

Related OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • Data Quality and Big Data — Guest Tom Redman (aka the “Data Doc”) discusses Data Quality and Big Data, including if data quality matters less in larger data sets, and if statistical outliers represent business insights or data quality issues.

  • Data Governance Star Wars — Special Guests Rob Karel and Gwen Thomas joined this extended, and Star Wars themed, discussion about how to balance bureaucracy and business agility during the execution of data governance programs.

  • Social Media Strategy — Guest Crysta Anderson of IBM Initiate explains social media strategy and content marketing, including three recommended practices: (1) Listen intently, (2) Communicate succinctly, and (3) Have fun.

  • The Fall Back Recap Show — A look back at the Best of OCDQ Radio, including discussions about Data, Information, Business-IT Collaboration, Change Management, Big Analytics, Data Governance, and the Data Revolution.

Big Data Lessons from Orbitz

One of the week’s interesting technology stories was On Orbitz, Mac Users Steered to Pricier Hotels, an article by Dana Mattioli in The Wall Street Journal, about how online travel company Orbitz used data mining to discover significant spending differences between their Mac and PC customers (who were identified by the operating system of the computer used to book reservations).

Orbitz discovered that Mac users are 40% more likely to book a four- or five-star hotel, and tend to stay in more expensive rooms, spending on average $20 to $30 more a night on hotels.  Based on this discovery, Orbitz has been experimenting with showing different hotel offers to Mac and PC visitors, ranking the more expensive hotels on the first page of search results for Mac users.

This Orbitz story is interesting because I think it provides two important lessons about big data for businesses of all sizes.

The first lesson is, as Mattioli reported, “the sort of targeting undertaken by Orbitz is likely to become more commonplace as online retailers scramble to identify new ways in which people’s browsing data can be used to boost online sales.  Orbitz lost $37 million in 2011 and its stock has fallen by more than 74% since its 2007 IPO.  The effort underscores how retailers are becoming bigger users of so-called predictive analytics, crunching reams of data to guess the future shopping habits of customers.  The goal is to tailor offerings to people believed to have the highest lifetime value to the retailer.”

The second lesson is a good example of how word of mouth has become word of data.  Shortly after the article was published, Orbitz became a trending topic on Twitter — but not in a way that the company would have hoped.  A lot of negative sentiment was expressed by Mac users claiming that they would no longer use Orbitz because the company charged Mac users more than PC users.

However, this commonly expressed misunderstanding was clarified by an Orbitz spokesperson in the article, who explained that Orbitz is not charging Mac users more money for the same hotels, but is simply setting the default search rank to show Mac users the more expensive hotels first.  Mac users can always re-sort the results ascending by price in order to see the same less expensive hotels that would be displayed in the default search rank used for PC users.  Orbitz is attempting to offer a customized (albeit generalized, not personalized) user experience, but some users see it as gaming the system against them.
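
To illustrate the difference between charging more and ranking differently, here is a minimal, hypothetical sketch of segment-based default sorting; the hotels, prices, and segmentation rule are invented and are not Orbitz's actual logic:

```python
hotels = [
    {"name": "Budget Inn", "stars": 2, "price": 79},
    {"name": "Harbor View", "stars": 4, "price": 189},
    {"name": "Grand Palace", "stars": 5, "price": 259},
]

def default_results(hotels, segment):
    """Same hotels, same prices -- only the default ordering differs by segment."""
    show_pricier_first = (segment == "mac")
    return sorted(hotels, key=lambda h: h["price"], reverse=show_pricier_first)

def resort_by_price(hotels):
    """The user can always override the default and re-sort ascending by price."""
    return sorted(hotels, key=lambda h: h["price"])

print([h["name"] for h in default_results(hotels, "mac")])  # pricier hotels shown first
print([h["name"] for h in default_results(hotels, "pc")])   # cheaper hotels shown first
```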

This Orbitz story provides two lessons about the brave new business world brought to us by big data and data science, where more companies are using predictive analytics to discover business insights, and more customers are empowering themselves with data.

Business has always resembled a battlefield.  But nowadays, data is the weapon of choice for companies and customers alike, since, in our increasingly data-constructed world, big data is no longer just for big companies, and everyone is a data geek now.

 

This post was written as part of the IBM for Midsize Business program, which provides midsize businesses with the tools, expertise and solutions they need to become engines of a smarter planet.

 

The Return of the Dumb Terminal

This blog post is sponsored by the Enterprise CIO Forum and HP.

In his book What Technology Wants, Kevin Kelly observed “computers are becoming ever more general-purpose machines as they swallow more and more functions.  Entire occupations and their workers’ tools have been subsumed by the contraptions of computation and networks.  You can no longer tell what a person does by looking at their workplace, because 90 percent of employees are using the same tool — a personal computer.  Is that the desk of the CEO, the accountant, the designer, or the receptionist?  This is amplified by cloud computing, where the actual work is done on the net as a whole and the tool at hand merely becomes a portal to the work.  All portals have become the simplest possible window — a flat screen of some size.”

Although I am an advocate for cloud computing and cloud-based services, sometimes I can’t help but wonder if cloud computing is turning our personal computers back into that simplest of all possible windows that we called the dumb terminal.

Twenty years ago, at the beginning of my IT career, when I was a mainframe production support specialist, my employer gave me a dumb terminal to take home for connecting to the mainframe via my dial-up modem.  Since I used it late at night when dealing with nightly production issues, the aptly nicknamed green machine (its entirely text-based display used bright green characters) would make my small apartment eerily glow green, which convinced my roommate and my neighbors that I was some kind of mad scientist performing unsanctioned midnight experiments with radioactive materials.

The dumb terminal was so called because, when not connected to the mainframe, it was essentially a giant paperweight, since it provided no offline functionality.  Nowadays, our terminals (smartphones, tablets, and laptops) are smarter and provide varying degrees of offline functionality, but with more functionality moving to the cloud, they get dumbed back down whenever they’re not connected to the web or a mobile network, because most of what we really need is online.

It can even be argued that smartphones and tablets were actually designed to be dumb terminals because they intentionally offer limited offline data storage and computing power, and are mostly based on a mobile-app-portal-to-the-cloud computing model, which is well-supported by the widespread availability of high-speed network connectivity options (broadband, mobile, Wi-Fi).

Laptops (and the dwindling number of desktops) are the last bastions of offline data storage and computing power.  Moving more of those applications and data to the cloud would help eliminate redundant applications and duplicated data, and make it easier to use the right technology for a specific business problem.  And if most of our personal computers were dumb terminals, then our smart people could concentrate more on the user experience aspects of business-enabling information technology.

Perhaps the return of the dumb terminal is a smart idea after all.

This blog post is sponsored by the Enterprise CIO Forum and HP.

 

Related Posts

A Swift Kick in the AAS

The UX Factor

The Partly Cloudy CIO

Are Cloud Providers the Bounty Hunters of IT?

The Cloud Security Paradox

Sometimes all you Need is a Hammer

Why does the sun never set on legacy applications?

Are Applications the La Brea Tar Pits for Data?

The Diffusion of the Consumerization of IT

More Tethered by the Untethered Enterprise?

The Family Circus and Data Quality


Like many young intellectuals, the only part of the Sunday newspaper I read growing up was the color comics section, and one of my favorite comic strips was The Family Circus created by cartoonist Bil Keane.  One of the recurring themes of the comic strip was a set of invisible gremlins that the children used to shift blame for any misdeeds, including Ida Know, Not Me, and Nobody.

Although I no longer read any section of the newspaper on any day of the week, this Sunday morning I have been contemplating how this same set of invisible gremlins is used by many people throughout most organizations to shift blame whenever poor data quality negatively impacts business activities, especially since, when investigating the root cause, you often find that Ida Know owns the data, Not Me is accountable for data governance, and Nobody takes responsibility for data quality.

The Graystone Effects of Big Data

As a big data geek and a big fan of science fiction, I was intrigued by Zoe Graystone, the central character of the science fiction television show Caprica, which was a spin-off prequel of the re-imagined Battlestar Galactica television show.

Zoe Graystone was a teenage computer programming genius who created a virtual reality avatar of herself based on all of the available data about her own life, leveraging roughly 100 terabytes of personal data from numerous databases.  This allowed her avatar to access data from her medical files, DNA profiles, genetic typing, CAT scans, synaptic records, psychological evaluations, school records, emails, text messages, phone calls, audio and video recordings, security camera footage, talent shows, sports, restaurant bills, shopping receipts, online search history, music lists, movie tickets, and television shows.  The avatar transformed that big data into personality and memory, and believably mimicked the real Zoe Graystone within a virtual reality environment.

The best science fiction reveals just how thin the line is that separates imagination from reality.  Over thirty years ago, around the time of the original Battlestar Galactica television show, virtual reality avatars based on massive amounts of personal data would likely have been dismissed as pure fantasy.  But nowadays, during the era of big data and data science, the idea of Zoe Graystone creating a virtual reality avatar of herself doesn’t sound so far-fetched, nor is it pure data science fiction.

“On Facebook,” Ellis Hamburger recently blogged, “you’re the sum of all your interactions and photos with others.  Foursquare began its life as a way to see what your friends are up to, but it has quickly evolved into a life-logging tool / artificial intelligence that knows you like an old friend does.”

Facebook and Foursquare are just two social media examples of our increasingly data-constructed world, which is creating a virtual reality environment where our data has become our avatar and our digital mouths are speaking volumes about us.

Big data and real data science are enabling people and businesses of all sizes to put this virtual reality environment to good use, such as customers empowering themselves with data and companies using predictive analytics to discover business insights.

I refer to the positive aspects of Big Data as the Zoe Graystone Effect.

But there are also negative aspects to the virtual reality created by our big data avatars.  For example, in his recent blog post Rethinking Privacy in an Era of Big Data, Quentin Hardy explained “by triangulating different sets of data (you are suddenly asking lots of people on LinkedIn for endorsements on you as a worker, and on Foursquare you seem to be checking in at midday near a competitor’s location), people can now conclude things about you (you’re probably interviewing for a job there).”

On the Caprica television show, Daniel Graystone (her father) used Zoe’s avatar as the basis for an operating system for a race of sentient machines known as Cylons, which ultimately led to the Cylon Wars and the destruction of most of humanity.  A far less dramatic example from the real world, which I explained in my blog post The Data Cold War, is how companies like Google use the virtual reality created by our big data avatars against us by selling our personal data (albeit indirectly) to advertisers.

I refer to the negative aspects of Big Data as the Daniel Graystone Effect.

How have your personal life and your business activities been affected by the Graystone Effects of Big Data?

 

This post was written as part of the IBM for Midsize Business program, which provides midsize businesses with the tools, expertise and solutions they need to become engines of a smarter planet.

 

Sometimes all you Need is a Hammer

This blog post is sponsored by the Enterprise CIO Forum and HP.

“If all you have is a hammer, everything looks like a nail” is a popular phrase, also known as the law of the instrument, which describes an over-reliance on a familiar tool, as opposed to using “the right tool for the job.”  In information technology (IT), the law of the instrument is often invoked to justify the need to purchase the right technology to solve a specific business problem.

However, within the IT industry, it has become increasingly difficult over the years to buy the right tool for the job since many leading vendors make it nearly impossible to buy just an individual tool.  Instead, vendors want you to buy their entire tool box, filled with many tools for which you have no immediate need, and some tools which you have no idea why you would ever need.

It’d be like going to a hardware store to buy just a hammer, but the hardware store refusing to sell you a hammer without also selling you a 10-piece set of screwdrivers, a 4-piece set of pliers, an 18-piece set of wrenches, and an industrial-strength nail gun.

My point is that many new IT innovations originate from small, entrepreneurial vendors, which tend to be specialists with a very narrow focus and can therefore be a great source of rapid innovation.  This is in sharp contrast to the large, enterprise-class vendors, which tend to innovate via acquisition and consolidation, embedding tools and other technology components within generalized IT platforms, allowing these mega-vendors to offer end-to-end solutions and the convenience of one-vendor IT shopping.

But the consumerization of IT, driven by the unrelenting trends of cloud computing, SaaS, and mobility, is fostering a return to specialization, a return to being able to buy only the information technology that you currently need — the right tool for the job, and often at the right price, precisely because it’s almost always more cost-effective to buy only what you need right now.

I am not trying to criticize traditional IT vendors that remain off-premises-resistant by exclusively selling on-premises solutions, which the vendors positively call enterprise-class solutions, but which their customers often come to negatively call legacy applications.

I understand the economics of the IT industry.  Vendors can make more money with fewer customers by selling on-premises IT platforms with six-or-seven-figure licenses plus five-figure annual maintenance fees, as opposed to selling cloud-based services with three-or-four-figure pay-as-you-go-cancel-anytime monthly subscriptions.  The former is the big-ticket business model of the vendorization of IT.  The latter is the big-volume business model of the consumerization of IT.  Essentially, this is a paradigm shift that makes IT more of a consumer-driven marketplace, and less of the vendor-driven marketplace it has historically been.
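
As a rough, entirely hypothetical back-of-the-envelope illustration of that big-ticket versus big-volume contrast (the figures below are invented, chosen only to echo the orders of magnitude mentioned above):

```python
# Hypothetical on-premises deal: large license plus annual maintenance, sold to few customers.
license_fee = 500_000          # one-time, six-figure license
annual_maintenance = 75_000    # five-figure annual maintenance
on_prem_5yr = license_fee + 5 * annual_maintenance

# Hypothetical cloud subscription: modest monthly fee, cancel anytime, sold to many customers.
monthly_subscription = 2_000   # four-figure pay-as-you-go
cloud_5yr = 5 * 12 * monthly_subscription

print(f"5-year on-premises cost per customer: ${on_prem_5yr:,}")  # $875,000
print(f"5-year cloud cost per customer:       ${cloud_5yr:,}")    # $120,000
# Roughly how many cloud customers the vendor needs to match one on-premises deal:
print(f"Cloud customers needed to match one on-prem deal: {on_prem_5yr / cloud_5yr:.1f}")
```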

Although it remains true that if all you have is a hammer, everything looks like a nail, sometimes all you need is a hammer.  And when all you need is a hammer, you shouldn’t get nailed by vendors selling you more information technology than you need.

This blog post is sponsored by the Enterprise CIO Forum and HP.

 

Related Posts

Can Enterprise-Class Solutions Ever Deliver ROI?

Why does the sun never set on legacy applications?

The Diffusion of the Consumerization of IT

The IT Consumerization Conundrum

The UX Factor

A Swift Kick in the AAS

Shadow IT and the New Prometheus

The Cloud Security Paradox

Are Cloud Providers the Bounty Hunters of IT?

The Partly Cloudy CIO

Data Quality and the Bystander Effect

In his recent Harvard Business Review blog post Break the Bad Data Habit, Tom Redman cautioned against correcting data quality issues without providing feedback to where the data originated.  “At a minimum,” Redman explained, “others using the erred data may not spot the error.  There is no telling where it might turn up or who might be victimized.”  And correcting bad data without providing feedback to its source also denies the organization an opportunity to get to the bottom of the problem.

“And failure to provide feedback,” Redman continued, “is but the proximate cause.  The deeper root issue is misplaced accountability — or failure to recognize that accountability for data is needed at all.  People and departments must continue to seek out and correct errors.  They must also provide feedback and communicate requirements to their data sources.”

In his blog post The Secret to an Effective Data Quality Feedback Loop, Dylan Jones responded to Redman’s blog post with some excellent insights regarding data quality feedback loops and how they can help improve your data quality initiatives.
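
As a simple illustration of the kind of feedback loop Redman and Jones describe, here is a minimal, hypothetical sketch that records a correction together with enough context to notify the source system; the field names and routing are invented for the example:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class DataQualityFeedback:
    """A correction is never silent: it carries enough context to notify the data's source."""
    record_id: str
    field: str
    bad_value: str
    corrected_value: str
    source_system: str
    reported_by: str
    reported_at: str

def correct_and_report(record_id, field, bad_value, corrected_value, source_system, reporter):
    feedback = DataQualityFeedback(
        record_id=record_id,
        field=field,
        bad_value=bad_value,
        corrected_value=corrected_value,
        source_system=source_system,
        reported_by=reporter,
        reported_at=datetime.now(timezone.utc).isoformat(),
    )
    # In practice this would be routed to the source system's issue queue or data steward;
    # here we just print it so the loop is visible.
    print(f"Feedback to {feedback.source_system}: "
          f"{feedback.field} '{feedback.bad_value}' corrected to '{feedback.corrected_value}' "
          f"(record {feedback.record_id}, reported by {feedback.reported_by})")
    return feedback

correct_and_report("CUST-0042", "postal_code", "9021O", "90210", "CRM", "jharris")
```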

I definitely agree with Redman and Jones about the need for feedback loops, but I have found, more often than not, that no feedback at all is provided on data quality issues because of the assumption that data quality is someone else’s responsibility.

This general lack of accountability for data quality issues is similar to what is known in psychology as the Bystander Effect, which refers to people often not offering assistance to the victim in an emergency situation when other people are present.  Apparently, the mere presence of other bystanders greatly decreases intervention, and the greater the number of bystanders, the less likely it is that any one of them will help.  Psychologists believe that the reason this happens is that as the number of bystanders increases, any given bystander is less likely to interpret the incident as a problem, and less likely to assume responsibility for taking action.

In my experience, the most common reason that data quality issues are often neither reported nor corrected is that most people throughout the enterprise act like data quality bystanders, making them less likely to interpret bad data as a problem or, at the very least, not their responsibility.  But the enterprise’s data quality is perhaps most negatively affected by this bystander effect, which may make it the worst bad data habit that the enterprise needs to break.