Enterprise Security is on Red Alert

This blog post is sponsored by the Enterprise CIO Forum and HP.

Enterprise security is becoming an even more important, and more complex, topic of discussion, especially when an organization focuses mostly on preventing external security threats, which is somewhat like telling employees to keep the gate closed while ignoring the cloud floating over the gate and the mobile devices walking around it.

But that doesn’t mean we need to build bigger and better gates.  The more open business environment enabled by cloud and mobile technologies is here to stay, and it requires a modern data security model that can protect us from the bad without being overprotective to the point of inhibiting the good.

“Security controls cost money and have an impact on the bottom line,” Gideon Rasmussen recently blogged.  Therefore, “business management may question the need for controls beyond minimum compliance requirements.  However, adherence to compliance requirements, control frameworks, and best practices may not adequately protect sensitive or valuable information because they are not customized to the unique aspects of your organization.”

This lack of a customized security solution can also arise when leveraging cloud providers.  “Transparency is the capability to look inside the operational day-to-day activity of your cloud provider,” Rafal Los recently blogged.  “As a consumer, transparency means that I have audit-ability of the controls, systems, and capabilities that directly impact my consumed service.”

A further complication for enterprise security is that many cloud-based services are initiated as Shadow IT projects.  “There are actually good reasons why you may want to take a hard look at Shadow IT, as it may fundamentally put you at risk of breaching compliance,” Christian Verstraete recently blogged.  “Talking to business users, I’m often flabbergasted by how little they know of the potential risks encountered by putting information in the public cloud.”

In the science fiction universe of Star Trek, the security officers aboard the starship Enterprise, who wore red shirts, often quickly died on away missions.  Protecting your data, especially when it goes on away missions in the cloud or on mobile devices, requires your enterprise security to be on red alert — otherwise everyone in your organization might as well be wearing a red shirt.

This blog post is sponsored by the Enterprise CIO Forum and HP.

 

Related Posts

Securing your Digital Fortress

The Good, the Bad, and the Secure

The Data Encryption Keeper

The Cloud Security Paradox

The Cloud is shifting our Center of Gravity

Are Cloud Providers the Bounty Hunters of IT?

The Return of the Dumb Terminal

The UX Factor

A Swift Kick in the AAS

Sometimes all you Need is a Hammer

Shadow IT and the New Prometheus

The Diffusion of the Consumerization of IT

Open MIKE Podcast — Episode 01

Method for an Integrated Knowledge Environment (MIKE2.0) is an open source delivery framework for Enterprise Information Management, which provides a comprehensive methodology that can be applied across a number of different projects within the Information Management space.  For more information, click on this link: openmethodology.org/wiki/What_is_MIKE2.0

The Open MIKE Podcast is a video podcast show, hosted by Jim Harris, which discusses aspects of the MIKE2.0 framework, and features content contributed to MIKE2.0 Wiki Articles, Blog Posts, and Discussion Forums.

 

Episode 01: Information Management Principles

If you’re having trouble viewing this video, you can watch it on Vimeo by clicking on this link: Open MIKE Podcast on Vimeo

 

MIKE2.0 Content Featured in or Related to this Podcast

Information Management Principles: openmethodology.org/wiki/Economic_Value_of_Information

Information Economics: openmethodology.org/wiki/Information_Economics

You can also find the videos and blog post summaries for every episode of the Open MIKE Podcast at: ocdqblog.com/MIKE

The Age of the Mobile Device

Bob Sutor recently blogged about mobile devices, noting that “the power of these gadgets isn’t in their touchscreens or their elegant design.  It’s in the variety of apps and communication services we can use on them to stay connected.  By thinking beyond the device, companies can prepare themselves and figure out how to make the most of this age of the mobile device.”

The disruptiveness of mobile devices to existing business models — even Internet-based ones — is difficult to overstate.  In fact, I believe the age of the mobile device will be even more disruptive than the age of the Internet, which, during the 1990s and early 2000s, disrupted entire industries and professions — the three most obvious examples being music, journalism, and publishing.

However, during those disruptions, mobile devices were in their nascent phase.  Laptops were still the dominant mobile devices and most mobile phones only made phone calls, though text messaging and e-mail soon followed.  It’s only been about five years — with the notable arrivals of the iPhone and the Kindle in 2007, the Android operating system in 2008, and the iPad in 2010 — since mobile devices started to hit their stride.  The widespread availability of connectivity options (Wi-Fi and 3G/4G broadband), the shift to more cloud-based services, and, as Sutor noted, the fact that in 2011, for the first time ever, shipments of smartphones exceeded total PC shipments, all appear to forecast that the age of the mobile device will be an age of massive — and rapid — disruption.

The IBM Midmarket white paper A Smarter Approach to Customer Relationship Management (CRM) notes that “mobile is becoming the customers’ preferred communications means for multiple channels.  As customers go mobile and sales teams strive to meet customers’ needs, midsize companies are enabling mobile CRM.  They are optimizing Web sites for wireless devices and deploying mobile apps directly linked into the contact centers.  They are purchasing apps for particular devices and are buying solutions that store CRM data on them when offline, and update the information when Internet access is restored.  This enables sales teams to quickly acquire customer histories and respond with offerings tailored to their desires.”
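
To make the offline-capable mobile CRM pattern described in that white paper concrete, here is a minimal Python sketch of a local update queue that replays changes once Internet access is restored.  The class name, endpoint, and record layout are hypothetical illustrations, not the API of any actual CRM product.

import json
import time
import urllib.request

class OfflineCrmCache:
    # A sketch of the offline-first pattern: CRM updates are queued
    # locally while offline, then replayed when connectivity returns.
    def __init__(self, endpoint):
        self.endpoint = endpoint  # hypothetical CRM sync endpoint
        self.pending = []         # updates captured while offline

    def record_update(self, customer_id, fields):
        # Always capture the update locally first, so the sales team
        # can keep working without connectivity.
        self.pending.append({"customer_id": customer_id,
                             "fields": fields,
                             "captured_at": time.time()})

    def sync(self):
        # Replay queued updates once Internet access is restored;
        # anything that fails stays queued for the next attempt.
        still_pending = []
        for update in self.pending:
            try:
                request = urllib.request.Request(
                    self.endpoint,
                    data=json.dumps(update).encode("utf-8"),
                    headers={"Content-Type": "application/json"})
                urllib.request.urlopen(request, timeout=5)
            except OSError:
                still_pending.append(update)
        self.pending = still_pending

The design choice worth noting is that writes always land in the local queue first, which is what lets the app keep working when the network disappears, rather than degrading into a dumb terminal.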

As Sutor concluded, “mobile devices are a springboard into the future, where the apps can significantly improve the quality of our personal or business lives by allowing us to do things we have never done before.”  I agree that mobile devices are a springboard into a future that allows us, as well as our businesses and our customers, to do things we have never done before.

The age of the mobile device is the future — and the future is now.  Is your midsize business ready?

 

This post was written as part of the IBM for Midsize Business program, which provides midsize businesses with the tools, expertise and solutions they need to become engines of a smarter planet.

 

Balancing the IT Budget

This blog post is sponsored by the Enterprise CIO Forum and HP.

While checking out the new Knowledge Vaults on the Enterprise CIO Forum, I came across the Genefa Murphy blog post How IT Debt is Crippling the Enterprise, which included three recommendations for alleviating some of that crippling IT debt.

The first recommendation was application retirement.  As I have previously blogged, applications become retirement-resistant because applications and data have historically been so tightly coupled, making most of what are referred to as data silos actually application silos.  Therefore, in order to help de-cripple IT debt, organizations need to de-couple applications and data, not only by allowing more data to float up into the cloud, but also, as Murphy noted, by instituting better procedures for data archival, which make it easier to identify applications for retirement because they have become merely containers for unused data.
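
As a concrete illustration of that archival-driven triage, here is a minimal Python sketch that flags applications whose data hasn’t been touched within a threshold period, making them candidates for archiving their data and retiring the application.  The inventory structure and application names are hypothetical illustrations, not a real asset-management API.

from datetime import datetime, timedelta

def retirement_candidates(applications, as_of, threshold_days=365):
    # Flag applications whose data hasn't been read or written within
    # the threshold, i.e., likely mere containers for unused data.
    cutoff = as_of - timedelta(days=threshold_days)
    return [app["name"] for app in applications
            if app["last_data_access"] < cutoff]

inventory = [
    {"name": "Legacy Order Entry", "last_data_access": datetime(2009, 3, 1)},
    {"name": "Customer Portal",    "last_data_access": datetime(2012, 6, 1)},
]

print(retirement_candidates(inventory, as_of=datetime(2012, 7, 1)))
# ['Legacy Order Entry']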

The second recommendation was cutting the IT backlog.  “One of the main reasons for IT debt,” Murphy explained, “is the fact that the enterprise is always trying to keep up with the latest and greatest trends, technologies and changes.”  I have previously blogged about this as The Diderot Effect of New Technology.  By better identifying how up-to-date the IT backlog is, and how well — if at all — it still reflects current business needs, an organization can skip needless upgrades and enhancement requests, and not only eliminate some of the IT debt, but also better prioritize efforts so that IT functions as a business enabler.

The third recommendation was performing more architectural reviews, which, Murphy explained, “is less about getting rid of old debt and more about making sure new debt does not accumulate.  Since IT teams don’t often have the time to do this (as they are concerned with getting a working solution to the customer ASAP), it is a good idea to have this as a parallel effort led by a technology or architectural review group outside of the project teams but still closely linked.”

Although it’s impossible to completely balance the IT budget, and IT debt alone doesn’t cause an overall budget deficit, reducing the costs associated with business-enabling technology does increase the potential for a surplus of financial success for the enterprise.

This blog post is sponsored by the Enterprise CIO Forum and HP.

 

Related Posts

Why does the sun never set on legacy applications?

Are Applications the La Brea Tar Pits for Data?

The Diffusion of the Consumerization of IT

Sometimes all you Need is a Hammer

Shadow IT and the New Prometheus

The UX Factor

The Return of the Dumb Terminal

A Swift Kick in the AAS

The Cloud is shifting our Center of Gravity

Lightning Strikes the Cloud

The Partly Cloudy CIO

Are Cloud Providers the Bounty Hunters of IT?

The Cloud Security Paradox

The Good, the Bad, and the Secure

The Diderot Effect of New Technology

Demystifying Social Media

In this eight-minute video, I attempt to demystify social media, which is often over-identified with the technology that enables it.  In fact, we have always been social, and we have always used media, because social media is about human communication: humans communicating in the same ways they have always communicated, by sharing images, memories, stories, and words.  More often nowadays, we communicate by sharing photographs, videos, and messages via social media status updates.

This video briefly discusses the three social media services used by my local Toastmasters club — Pinterest, Vimeo, and Twitter — and concludes with an analogy inspired by The Emerald City and The Yellow Brick Road from The Wizard of Oz:

If you are having trouble viewing this video, then you can watch it on Vimeo by clicking on this link: Demystifying Social Media

You can also watch a regularly updated page of my videos by clicking on this link: OCDQ Videos

 

Social Karma Blog Series

 

Related Social Media Posts

Brevity is the Soul of Social Media

The Wisdom of the Social Media Crowd

The Challenging Gift of Social Media

Can Social Media become a Universal Translator?

The Two U’s and the Three C’s

Quality is more important than Quantity

Listening and Broadcasting

Please don’t become a Zombie

Exercise Better Data Management

Recently on Twitter, Daragh O Brien and I discussed his proposed concept of MOData.  “After Big Data,” Daragh tweeted, “we will inevitably begin to see the rise of MOData as organizations seek to grab larger chunks of data and digest it.  What is MOData?  It’s MO’Data, as in MOre Data. Or Morbidly Obese Data.  Only good data quality and data governance will determine which.”

Daragh asked if MO’Data will be the Big Data Killer.  I said only if MO’Data doesn’t include MO’BusinessInsight, MO’DataQuality, and MO’DataPrivacy (i.e., more business insight, more data quality, and more data privacy).

“But MO’Data is about more than just More Data,” Daragh replied.  “It’s about avoiding Morbidly Obese Data that clogs data insight and data quality, etc.”

I responded that More Data becomes Morbidly Obese Data only if we don’t exercise better data management practices.

Agreeing with that point, Daragh replied, “Bring on MOData and the Pilates of Data Quality and Data Governance.”

To slightly paraphrase lines from one of my favorite movies — Airplane! — the Cloud is getting thicker and the Data is getting laaaaarrrrrger.  Surely I know well that growing data volumes are a serious issue — but don’t call me Shirley.

Whether you choose to measure it in terabytes, petabytes, exabytes, HoardaBytes, or how much reality bites, the truth is we were consuming way more than our recommended daily allowance of data long before the data management industry took a tip from McDonald’s and put the word “big” in front of its signature sandwich.  (Oh great . . . now I’m actually hungry for a Big Mac.)

But nowadays, with silos replicating data, as well as new data, and new types of data, being created and stored on a daily basis, our data is resembling the size of Bob Parr in retirement, making it seem like not even Mr. Incredible in his prime possessed the super strength needed to manage all of our data.  Those were references to the movie The Incredibles, where Mr. Incredible was a superhero who, after retiring into civilian life under the alias of Bob Parr, elicited the observation from his superhero costume tailor: “My God, you’ve gotten fat.”  Yes, I admit not even Helen Parr (aka Elastigirl) could stretch that far for a big data joke.

A Healthier Approach to Big Data

Although Daragh’s concerns about morbidly obese data are valid, no superpowers (or other miracle exceptions) are needed to manage all of our data.  In fact, it’s precisely when we are so busy trying to manage all of our data that we hoard countless bytes of data without evaluating data usage, gathering data requirements, or planning for data archival.  It’s like we are trying to lose weight by eating more and exercising less, i.e., consuming more data and exercising less data quality and data governance.  As Daragh said, only good data quality and data governance will determine whether we get more data or morbidly obese data.

Losing weight requires a healthy approach to both diet and exercise.  A healthy approach to diet includes carefully choosing the food you consume and carefully controlling your portion size.  A healthy approach to exercise includes a commitment to exercise on a regular basis at a sufficient intensity level without going overboard by spending several hours a day, every day, at the gym.

Swimming is a great form of exercise, but swimming in big data without having a clear business objective before you jump into the pool is like telling your boss that you didn’t get any work done because you decided to spend all day working out at the gym.

Carefully choosing the data you consume and carefully controlling your data portion size are becoming increasingly important since big data is forcing us to revisit information overload.  However, the main reason that traditional data management practices often become overwhelmed by big data is that they are not always the right approach.

We need to acknowledge that some big data use cases differ considerably from traditional ones.  Data modeling is still important and data quality still matters, but how much data modeling and data quality is needed before big data can be effectively used for business purposes will vary.  In order to move the big data discussion forward, we have to stop fiercely defending our traditional perspectives about structure and quality.  We also have to stop fiercely defending our traditional perspectives about analytics, since there will be some big data use cases where depth and detailed analysis may not be necessary to provide business insight.

Better than Big or More

Jim Ericson explained that your data is big enough.  Rich Murnane explained that bigger isn’t better, better is better.  Although big data may indeed be followed by more data, that doesn’t necessarily mean we require more data management in order to prevent more data from becoming morbidly obese data.  I think that we just need to exercise better data management.

 

Related Posts

Demystifying Master Data Management

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

During this episode, special guest John Owens and I attempt to demystify master data management (MDM) by explaining the three types of data (Transaction, Domain, Master) and the four master data entities (Party, Product, Location, Asset), as well as perhaps the most important concept of all: the Party-Role Relationship, which is where we find many of the terms commonly used to describe the Party master data entity (e.g., Customer, Supplier, Employee).
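
One way to picture the Party-Role Relationship is a minimal data-model sketch in Python (the class and field names are illustrative, not taken from the IMM or any MDM product):

from dataclasses import dataclass

@dataclass
class Party:
    # A master data entity: a person or an organization.
    party_id: int
    name: str

@dataclass
class PartyRole:
    # The Party-Role Relationship: Customer, Supplier, and Employee
    # are roles a Party plays, not separate master data entities.
    party: Party
    role: str

acme = Party(party_id=1, name="ACME Corporation")
roles = [PartyRole(acme, "Customer"), PartyRole(acme, "Supplier")]
# The same Party is mastered once, no matter how many roles it plays.

The payoff of modeling it this way is a single mastered Party record even when the same organization appears as both a customer and a supplier.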

John Owens is a thought leader, consultant, mentor, and writer in the worlds of business and data modelling, data quality, and master data management (MDM).  He has built an international reputation as a highly innovative specialist in these areas and has worked in and led multi-million dollar projects in a wide range of industries around the world.

John Owens has a gift for identifying the underlying simplicity in any enterprise, even when shrouded in complexity, and bringing it to the surface.  He is the creator of the Integrated Modelling Method (IMM), which is used by business and data analysts around the world.  Later this year, John Owens will be formally launching the IMM Academy, which will provide high quality resources, training, and mentoring for business and data analysts at all levels.

You can also follow John Owens on Twitter and connect with John Owens on LinkedIn.  And if you’re looking for an MDM course, consider the online course from John Owens, which you can find by clicking on this link: MDM Online Course (Affiliate Link)

Popular OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • Demystifying Data Science — Guest Melinda Thielbar, a Ph.D. Statistician, discusses what a data scientist does and provides a straightforward explanation of key concepts such as signal-to-noise ratio, uncertainty, and correlation.
  • Data Quality and Big Data — Guest Tom Redman (aka the “Data Doc”) discusses Data Quality and Big Data, including if data quality matters less in larger data sets, and if statistical outliers represent business insights or data quality issues.
  • Data Governance Star Wars — Special Guests Rob Karel and Gwen Thomas joined this extended, and Star Wars themed, discussion about how to balance bureaucracy and business agility during the execution of data governance programs.
  • The Johari Window of Data Quality — Guest Martin Doyle discusses helping people better understand their data and assess its business impacts, not just the negative impacts of bad data quality, but also the positive impacts of good data quality.
  • Studying Data Quality — Guest Gordon Hamilton discusses the key concepts from recommended data quality books, including those which he has implemented in his career as a data quality practitioner.

Commendable Comments (Part 13)

Welcome to the 400th Obsessive-Compulsive Data Quality (OCDQ) blog post!  I am commemorating this milestone with the 13th entry in my ongoing series for expressing gratitude to my readers for their truly commendable comments on my blog posts.

 

Commendable Comments

On Will Big Data be Blinded by Data Science?, Meta Brown commented:

“Your concern is well-founded. Knowing how few businesses make really good use of the small data they’ve had around all along, it’s easy to imagine that they won’t do any better with bigger data sets.

I wrote some hints for those wallowing into the big data mire in my post, Better than Brute Force: Big Data Analytics Tips. But the truth is that many organizations won’t take advantage of the ideas that you are presenting, or my tips, especially as the datasets grow larger. That’s partly because they have no history in scientific methods, and partly because the data science movement is driving employers to search for individuals with heroically large skill sets.

Since few, if any, people truly meet these expectations, those hired will have real human limitations, and most often they will be people who know much more about data storage and manipulation than data analysis and applications.”

On Will Big Data be Blinded by Data Science?, Mike Urbonas commented:

“The comparison between scientific inquiry and business decision making is a very interesting and important one. Successfully serving a customer and boosting competitiveness and revenue does require some (hopefully unique) insights into customer needs. Where do those insights come from?

Additionally, scientists also never stop questioning and improving upon fundamental truths, which I also interpret as not accepting conventional wisdom — obviously an important trait of business managers.

I recently read commentary that gave high praise to the manager utilizing the scientific method in his or her decision-making process. The author was not a technologist, but rather none other than Peter Drucker, in writings from decades ago.

I blogged about Drucker’s commentary, data science, the scientific method vs. business decision making, and I’d value your and others’ input: Business Managers Can Learn a Lot from Data Scientists.”

On Word of Mouth has become Word of Data, Vish Agashe commented:

“I would argue that listening to not only customers but also business partners is very important (and not only in retail but in any business). I always say that, even if as an organization you are not active in the social world, assume that your customers, suppliers, employees, competitors are active in the social world and they will talk about you (as a company), your people, products, etc.

So it is extremely important to tune in to those conversations and evaluate its impact on your business. A dear friend of mine ventured into the restaurant business a few years back. He experienced a little bit of a slowdown in his business after a great start. He started surveying his customers, brought in food critiques to evaluate if the food was a problem, but he could not figure out what was going on. I accidentally stumbled upon Yelp.com and noticed that his restaurant’s rating had dropped and there were some complaints recently about services and cleanliness (nothing major though).

This happened because he had turnover in his front desk staff. He was able to address those issues and was able to reach out to customers who had bad experience (some of them were frequent visitors). They were able to go back and comment and give newer ratings to his business. This helped him with turning the corner and helped with the situation.

This was a big learning moment for me about the power of social media and the need for monitoring it.”

On Data Quality and the Bystander Effect, Jill Wanless commented:

“Our organization is starting to develop data governance processes and one of the processes we have deliberately designed is to get to the root cause of data quality issues.

We’ve designed it so that the errors that are reported also include the userid and the system where the data was generated. Errors are then filtered by function and the business steward responsible for that function is the one who is responsible for determining and addressing the root cause (which of course may require escalation to solve).

The business steward for the functional area has the most at stake in the data and is typically the most knowledgeable as to the process or system that may be triggering the error. We have yet to test this as we are currently in the process of deploying a pilot stewardship program.

However, we are very confident that it will help us uncover many of the causes of the data quality problems and with lots of PLAN, DO, CHECK, and ACT, our goal is to continuously improve so that our need for stewardship eventually (many years away no doubt) is reduced.”

On The Return of the Dumb Terminal, Prashanta Chandramohan commented:

“I can’t even imagine what it’s like to use this iPad I own now if I am out of network for an hour. Supposedly the coolest thing to own and a breakthrough innovation of this decade as some put it, it’s nothing but a dumb terminal if I do not have 3G or Wi-Fi connectivity.

Putting most of my documents, notes, to-do’s, and bookmarked blogs for reading later (e.g., Instapaper) in the cloud, I am sure to avoid duplicating data and eliminate installing redundant applications.

(Oops! I mean the apps! :) )

With cloud-based MDM and Data Quality tools starting to linger, I can’t wait to explore and utilize the advantages these return of dumb terminals bring to our enterprise information management field.”

On Big Data Lessons from Orbitz, Dylan Jones commented:

“The fact is that companies have always done predictive marketing, they’re just getting smarter at it.

I remember living as a student in a fairly downtrodden area that because of post code analytics meant I was bombarded with letterbox mail advertising crisis loans to consolidate debts and so on. When I got my first job and moved to a new area all of a sudden I was getting loans to buy a bigger car. The companies were clearly analyzing my wealth based on post code lifestyle data.

Fast forward and companies can do way more as you say.

Teresa Cottam (Global Telecoms Analyst) has cited the big telcos as a major driver in all this, they now consider themselves data companies so will start to offer more services to vendors to track our engagement across the entire communications infrastructure (Read more here: http://bit.ly/xKkuX6).

I’ve just picked up a shiny new Mac this weekend after retiring my long suffering relationship with Windows so it will be interesting to see what ads I get served!”

And please check out all of the commendable comments received on the blog post: Data Quality and Chicken Little Syndrome.

 

Thank You for Your Comments and Your Readership

You are Awesome — which is why receiving your comments has been the most rewarding aspect of my blogging experience over the last 400 posts.  Even if you have never posted a comment, you are still awesome — feel free to tell everyone I said so.

This entry in the series highlighted commendable comments on blog posts published between April 2012 and June 2012.

Since there have been so many commendable comments, please don’t be offended if one of your comments wasn’t featured.

Please continue commenting and stay tuned for future entries in the series.

Thank you for reading the Obsessive-Compulsive Data Quality blog.  Your readership is deeply appreciated.

 

Related Posts

Commendable Comments (Part 12) – The Third Blogiversary of OCDQ Blog

Commendable Comments (Part 11)

Commendable Comments (Part 10) – The 300th OCDQ Blog Post

730 Days and 264 Blog Posts Later – The Second Blogiversary of OCDQ Blog

OCDQ Blog Bicentennial – The 200th OCDQ Blog Post

Commendable Comments (Part 9)

Commendable Comments (Part 8)

Commendable Comments (Part 7)

Commendable Comments (Part 6)

Commendable Comments (Part 5) – The 100th OCDQ Blog Post

Commendable Comments (Part 4)

Commendable Comments (Part 3)

Commendable Comments (Part 2)

Commendable Comments (Part 1)

The Cloud is shifting our Center of Gravity

This blog post is sponsored by the Enterprise CIO Forum and HP.

Since more organizations are embracing cloud computing and cloud-based services, and some analysts are even predicting that personal clouds will soon replace personal computers, the cloudy future of our data has been weighing on my mind.

I recently discovered the website DataGravity.org, which contains many interesting illustrations and formulas about data gravity, a concept which Dave McCrory blogged about in his December 2010 post Data Gravity in the Clouds.

“Consider data as if it were a planet or other object with sufficient mass,” McCrory wrote.  “As data accumulates (builds mass) there is a greater likelihood that additional services and applications will be attracted to this data.  This is the same effect gravity has on objects around a planet.  As the mass or density increases, so does the strength of gravitational pull.  As things get closer to the mass, they accelerate toward the mass at an increasingly faster velocity.”
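
McCrory’s analogy can even be sketched quantitatively.  As a rough Newtonian rendering of the idea (my own simplification for illustration, not a formula taken from DataGravity.org), the pull between a body of data and a service or application might be written as:

F \propto \frac{m_{data} \cdot m_{app}}{d^{2}}

where m_data is the accumulated mass of the data, m_app is the mass of the service or application, and d is the distance between them.  The more data accumulates, and the closer services sit to it, the stronger the attraction, which is why services migrate toward the data rather than the other way around.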

In my blog post What is Weighing Down your Data?, I explained the often misunderstood difference between mass, which is an intrinsic property of matter based on atomic composition, and weight, which is a gravitational force acting on matter.  By using these concepts metaphorically, we could say that mass is an intrinsic property of data, representing objective data quality, and weight is a gravitational force acting on data, representing subjective data quality.

I used a related analogy in my blog post Quality is the Higgs Field of Data.  By using data, we give data its quality, i.e., its mass.  We give data mass so that it can become the basic building blocks of what matters to us.

Historically, most of what we referred to as data silos were actually application silos because data and applications became tightly coupled due to the strong gravitational force that legacy applications exerted, preventing most data from achieving the escape velocity needed to free itself from an application.  But the laudable goal of storing your data in one easily accessible place, and then building services and applications around your data, is one of the fundamental value propositions of cloud computing.

With data accumulating in the cloud, as McCrory explained, although “services and applications have their own gravity, data is the most massive and dense, therefore it has the most gravity.  Data, if large enough, can be virtually impossible to move.”

The cloud is shifting our center of gravity because of the data gravitational field emitted by the massive amount of data being stored in the cloud.  The information technology universe, business world, and our personal (often egocentric) solar systems are just beginning to feel the effects of this massive gravitational shift.

This blog post is sponsored by the Enterprise CIO Forum and HP.

 

Related Posts

Quality is the Higgs Field of Data

What is Weighing Down your Data?

A Swift Kick in the AAS

Lightning Strikes the Cloud

The Partly Cloudy CIO

Are Cloud Providers the Bounty Hunters of IT?

The Cloud Security Paradox

Are Applications the La Brea Tar Pits for Data?

Why does the sun never set on legacy applications?

The Good, the Bad, and the Secure

The Return of the Dumb Terminal

The UX Factor

Sometimes all you Need is a Hammer

Shadow IT and the New Prometheus

The Diffusion of the Consumerization of IT

DQ-View: The Five Stages of Data Quality

Data Quality (DQ) View is a regular OCDQ segment. Each DQ-View is a brief video discussion of a key data quality concept.

In my experience, all organizations cycle through five stages while coming to terms with the daunting challenges of data quality, which are somewhat similar to The Five Stages of Grief.  So, in this short video, I explain The Five Stages of Data Quality:

  1. Denial — Our organization is well-managed and highly profitable.  We consistently meet, or exceed, our business goals.  We obviously understand the importance of high-quality data.  Data quality issues can’t possibly be happening to us.
  2. Anger — We’re now in the midst of a financial reporting scandal, and facing considerable fines in the wake of a regulatory compliance failure.  How can this be happening to us?  Why do we have data quality issues?  Who is to blame for this?
  3. Bargaining — Okay, we may have just overreacted a little bit.  We’ll purchase a data quality tool, approve a data cleansing project, implement defect prevention, and initiate data governance.  That will fix all of our data quality issues — right?
  4. Depression — Why, oh why, do we keep having data quality issues?  Why does this keep happening to us?  Maybe we should just give up, accept our doomed fate, and not bother doing anything at all about data quality and data governance.
  5. Acceptance — We can’t fight the truth anymore.  We accept that we have to do the hard daily work of continuously improving our data quality and continuously implementing our data governance principles, policies, and procedures.

Quality is the Higgs Field of Data

Recently on Twitter, Daragh O Brien replied to my David Weinberger quote “The atoms of data hook together only because they share metadata,” by asking “So, is Quality Data the Higgs Boson of Information Management?”

I responded that Quality is the Higgs Boson of Data and Information since Quality gives Data and Information their Mass (i.e., their Usefulness).

“Now that is profound,” Daragh replied.

“That’s cute and all,” Brian Panulla interjected, “but you can’t measure Quality.  Mass is objective.  It’s more like Weight — a mass in context.”

I agreed with Brian’s great point since in a previous post I explained the often misunderstood difference between mass, an intrinsic property of matter based on atomic composition, and weight, a gravitational force acting on matter.

Using these concepts metaphorically, mass is an intrinsic property of data, representing objective data quality, whereas weight is a gravitational force acting on data, thereby representing subjective data quality.
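
In physics notation, the distinction we kept returning to is the familiar one between mass and weight:

W = m g

where the mass m is intrinsic to the object, while its weight W depends on the local gravitational field g.  Mapped onto the metaphor, m is the objective quality inherent in the data, and g is the gravitational force that a particular subjective use exerts upon it.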

But my previous post didn’t explain where matter theoretically gets its mass, and since this scientific mystery was radiating in the cosmic background of my Twitter banter with Daragh and Brian, I decided to use this post to attempt a brief explanation along the way to yet another data quality analogy.

As you have probably heard by now, big scientific news was recently reported about the discovery of the Higgs Boson, which, since the 1960s, the Standard Model of particle physics has theorized to be the fundamental particle associated with a ubiquitous quantum field (referred to as the Higgs Field) that gives all matter its mass by interacting with the particles that make up atoms and weighing them down.  This is foundational to our understanding of the universe because without something to give mass to the basic building blocks of matter, everything would behave the same way as the intrinsically mass-less photons of light behave, floating freely and not combining with other particles.  Therefore, without mass, ordinary matter, as we know it, would not exist.

 

Ping-Pong Balls and Maple Syrup

I like the Higgs Field explanation provided by Brian Cox and Jeff Forshaw.  “Imagine you are blindfolded, holding a ping-pong ball by a thread.  Jerk the string and you will conclude that something with not much mass is on the end of it.  Now suppose that instead of bobbing freely, the ping-pong ball is immersed in thick maple syrup.  This time if you jerk the thread you will encounter more resistance, and you might reasonably presume that the thing on the end of the thread is much heavier than a ping-pong ball.  It is as if the ball is heavier because it gets dragged back by the syrup.”

“Now imagine a sort of cosmic maple syrup that pervades the whole of space.  Every nook and cranny is filled with it, and it is so pervasive that we do not even know it is there.  In a sense, it provides the backdrop to everything that happens.”

Mass is therefore generated as a result of an interaction between the ping-pong balls (i.e., atomic particles) and the maple syrup (i.e., the Higgs Field).  However, although the Higgs Field is pervasive, it is also variable and selective, since some particles are affected by the Higgs Field more than others, and photons pass through it unimpeded, thereby remaining mass-less particles.

 

Quality — Data Gets Higgy with It

Now that I have vastly oversimplified the Higgs Field, let me Get Higgy with It by attempting an analogy for data quality based on the Higgs Field.  As I do, please remember the wise words of Karen Lopez: “All analogies are perfectly imperfect.”

Quality provides the backdrop to everything that happens when we use data.  Data in the wild, independent from use, is as carefree as the mass-less photon whizzing around at the speed of light, like a ping-pong ball bouncing along without a trace of maple syrup on it.  But once we interact with data using our sticky-maple-syrup-covered fingers, data begins to slow down, begins to feel the effects of our use.  We give data mass so that it can become the basic building blocks of what matters to us.

Some data is affected more by our use than others.  The more subjective our use, the more we weigh data down.  The more objective our use, the less we weigh data down.  Sometimes, we drag data down deep into the maple syrup, covering data up with an application layer, or bottling data into silos.  Other times, we keep data in the shallow end of the molasses swimming pool.

Quality is the Higgs Field of Data.  As users of data, we are the Higgs Bosons — we are the fundamental particles associated with a ubiquitous data quality field.  By using data, we give data its quality.  The quality of data cannot be separated from its use any more than the particles of the universe can be separated from the Higgs Field.

The closest data equivalent of a photon, a ping-pong ball particle that doesn’t get stuck in the maple syrup of the Higgs Field, is Open Data, which doesn’t get stuck within silos, but is instead data freely shared without the sticky quality residue of our use.

 

Related Posts

Our Increasingly Data-Constructed World

What is Weighing Down your Data?

Data Myopia and Business Relativity

Redefining Data Quality

Are Applications the La Brea Tar Pits for Data?

Swimming in Big Data

Sometimes it’s Okay to be Shallow

Data Quality and Big Data

Data Quality and the Q Test

My Own Private Data

No Datum is an Island of Serendip

Sharing Data

Shining a Social Light on Data Quality

Last week, when I published my blog post Lightning Strikes the Cloud, I unintentionally demonstrated three important things about data quality.

The first thing I demonstrated was even an obsessive-compulsive data quality geek is capable of data defects, since I initially published the post with the title Lightening Strikes the Cloud, which is an excellent example of the difference between validity and accuracy caused by the Cupertino Effect, since although lightening is valid (i.e., a correctly spelled word), it isn’t contextually accurate.
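
A minimal Python sketch makes the distinction concrete (the word list and the context rule are toy illustrations):

DICTIONARY = {"lightning", "lightening", "strikes", "the", "cloud"}

def is_valid(word):
    # Validity: the word is correctly spelled, i.e., found in the dictionary.
    return word.lower() in DICTIONARY

def is_accurate(word, context):
    # Accuracy: the word also makes sense in context.  A real check would
    # require semantic analysis; this toy rule merely shows why a spell
    # checker alone cannot catch the Cupertino Effect.
    if context == "weather" and word.lower() == "lightening":
        return False
    return True

print(is_valid("lightening"))                # True, it passes the spell check
print(is_accurate("lightening", "weather"))  # False, it is contextually wrong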

The second thing I demonstrated was the value of shining a social light on data quality — the value of using collaborative tools like social media to crowd-source data quality improvements.  Thankfully, Julian Schwarzenbach quickly noticed my error on Twitter.  “Did you mean lightning?  The concept of lightening clouds could be worth exploring further,” Julian humorously tweeted.  “Might be interesting to consider what happens if the cloud gets so light that it floats away.”  To which I replied that if the cloud gets so light that it floats away, it could become Interstellar Computing or, as Julian suggested, the start of the Intergalactic Net, which I suppose is where we will eventually have to store all of that big data we keep hearing so much about these days.

The third thing I demonstrated was the potential dark side of data cleansing, since the only remaining trace of my data defect is a broken URL.  This is an example of not providing a well-documented audit trail, which is necessary within an organization to communicate data quality issues and resolutions.
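
Even a simple correction record, sketched here in Python with illustrative field values, would have preserved the audit trail that my silent title fix erased:

correction = {
    "field":        "post_title",
    "before":       "Lightening Strikes the Cloud",
    "after":        "Lightning Strikes the Cloud",
    "corrected_by": "Jim Harris",
    "corrected_at": "2012-07-09T09:00:00Z",  # illustrative timestamp
    "reason":       "Cupertino Effect: valid word, contextually inaccurate",
}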

Communication and collaboration are essential to finding our way with data quality.  And social media can help us by providing more immediate and expanded access to our collective knowledge, experience, and wisdom, and by shining a social light that illuminates the shadows cast upon data quality issues when a perception filter or bystander effect gets the better of our individual attention or undermines our collective best intentions — which, as I recently demonstrated, occasionally happens to all of us.

 

Related Posts

Data Quality and the Cupertino Effect

Are you turning Ugly Data into Cute Information?

The Importance of Envelopes

The Algebra of Collaboration

Finding Data Quality

The Wisdom of the Social Media Crowd

Perception Filters and Data Quality

Data Quality and the Bystander Effect

The Family Circus and Data Quality

Data Quality and the Q Test

Metadata, Data Quality, and the Stroop Test

The Three Most Important Letters in Data Governance

Lightning Strikes the Cloud

This blog post is sponsored by the Enterprise CIO Forum and HP.

Recent bad storms in the United States caused power outages as well as outages of a different sort for some of the companies relying on cloud computing and cloud-based services.  As the poster child for cloud providers, Amazon Web Services always makes headlines when it suffers a major outage, as it did last Friday when its Virginia cloud computing facility was struck by lightning, an incident which John Dodge examined in his recent blog post: Has Amazon's cloud grown too big, too fast?

Another thing that commonly coincides with a cloud outage is renewed pondering about the nebulous definition of “the cloud.”

In his recent article for The Washington Post, How a storm revealed the myth of the ‘cloud’, Dominic Basulto pondered “of all the metaphors and analogies used to describe the Internet, perhaps none is less understood than the cloud.  A term that started nearly a decade ago to describe pay-as-you-go computing power and IT infrastructure-for-rent has crossed over to the consumer realm.  It’s now to the point where many of the Internet’s most prolific companies make it a key selling point to describe their embrace of the cloud.  The only problem, as we found out this weekend, is that there really isn’t a ‘cloud’ – there’s a bunch of rooms with servers hooked up with wires and tubes.”

One of the biggest benefits of cloud computing, especially for many small businesses and start-up companies, is that it provides an organization with the ability to focus on its core competencies, allowing non-IT companies to be more business-focused.

As Basulto explained, “instead of having to devote resources and time to figuring out the computing back-end, young Internet companies like Instagram and Pinterest could concentrate on hiring the right people and developing business models worth billions.  Hooking up to the Internet became as easy as plugging into the local electricity provider, even as users uploaded millions of photos or streamed millions of videos at a time.”

But these benefits are not just for Internet companies.  In his book The Big Switch: Rewiring the World, from Edison to Google, Nicholas Carr used the history of electric grid power utilities as a backdrop and analogy for examining the potential benefits that all organizations can gain from adopting Internet-based utility (i.e., cloud) computing.

The benefits of a utility, however, whether it’s electricity or cloud computing, can only be realized if the utility operates reliably.

“A temporary glitch while watching a Netflix movie is annoying,” Basulto noted, but “imagine what happens when there’s a cloud outage that affects airports, hospitals, or yes, the real-world utility grid.”  And so, whenever any utility suffers an outage, it draws attention to something we’ve become dependent on — but, in fairness, it’s also something we take for granted when it’s working.

“Maybe the late Alaska Senator Ted Stevens was right,” Basulto concluded, “maybe the Internet really is a series of tubes rather than a cloud.  If so, the company with the best plumbing wins.”  A few years ago, I published a satirical post about the cloud, which facetiously recommended that instead of beaming your data up into the cloud, bury your data down underground.

However, if plumbing, not electricity, is the better metaphor for cloud computing infrastructure, then perhaps cloud providers should start striking ground on subterranean data centers built deep enough to prevent lightning from striking the cloud again.

This blog post is sponsored by the Enterprise CIO Forum and HP.

 

Related Posts

The Partly Cloudy CIO

Are Cloud Providers the Bounty Hunters of IT?

The Cloud Security Paradox

The Good, the Bad, and the Secure

The Return of the Dumb Terminal

A Swift Kick in the AAS

The UX Factor

Sometimes all you Need is a Hammer

Shadow IT and the New Prometheus

The Diffusion of the Consumerization of IT

Saving Private Data

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

This episode is an edited rebroadcast of a segment from the OCDQ Radio 2011 Year in Review, during which Daragh O Brien and I discuss the data privacy and data protection implications of social media, cloud computing, and big data.

Daragh O Brien is one of Ireland’s leading Information Quality and Governance practitioners.  After being born at a young age, Daragh has amassed a wealth of experience in quality information driven business change, from CRM Single View of Customer to Regulatory Compliance, to Governance and the taming of information assets to benefit the bottom line, manage risk, and ensure customer satisfaction.  Daragh O Brien is the Managing Director of Castlebridge Associates, one of Ireland’s leading consulting and training companies in the information quality and information governance space.

Daragh O Brien is a founding member and former Director of Publicity for the IAIDQ, which he is still actively involved with.  He was a member of the team that helped develop the Information Quality Certified Professional (IQCP) certification and he recently became the first person in Ireland to achieve this prestigious certification.

In 2008, Daragh O Brien was awarded a Fellowship of the Irish Computer Society for his work in developing and promoting standards of professionalism in Information Management and Governance.

Daragh O Brien is a regular conference presenter, trainer, blogger, and author with two industry reports published by Ark Group, the most recent of which is The Data Strategy and Governance Toolkit.

You can also follow Daragh O Brien on Twitter and connect with Daragh O Brien on LinkedIn.

Related OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • Data Quality and Big Data — Guest Tom Redman (aka the “Data Doc”) discusses Data Quality and Big Data, including if data quality matters less in larger data sets, and if statistical outliers represent business insights or data quality issues.

  • Data Governance Star Wars — Special Guests Rob Karel and Gwen Thomas joined this extended, and Star Wars themed, discussion about how to balance bureaucracy and business agility during the execution of data governance programs.

  • Social Media Strategy — Guest Crysta Anderson of IBM Initiate explains social media strategy and content marketing, including three recommended practices: (1) Listen intently, (2) Communicate succinctly, and (3) Have fun.

  • The Fall Back Recap Show — A look back at the Best of OCDQ Radio, including discussions about Data, Information, Business-IT Collaboration, Change Management, Big Analytics, Data Governance, and the Data Revolution.