Saving Private Data

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

This episode is an edited rebroadcast of a segment from the OCDQ Radio 2011 Year in Review, during which Daragh O Brien and I discuss the data privacy and data protection implications of social media, cloud computing, and big data.

Daragh O Brien is one of Ireland’s leading Information Quality and Governance practitioners.  After being born at a young age, Daragh has amassed a wealth of experience in quality-information-driven business change, from CRM Single View of Customer to Regulatory Compliance to Governance, and in taming information assets to benefit the bottom line, manage risk, and ensure customer satisfaction.  He is the Managing Director of Castlebridge Associates, one of Ireland’s leading consulting and training companies in the information quality and information governance space.

Daragh O Brien is a founding member and former Director of Publicity for the IAIDQ, with which he is still actively involved.  He was a member of the team that helped develop the Information Quality Certified Professional (IQCP) certification, and he recently became the first person in Ireland to achieve this prestigious certification.

In 2008, Daragh O Brien was awarded a Fellowship of the Irish Computer Society for his work in developing and promoting standards of professionalism in Information Management and Governance.

Daragh O Brien is a regular conference presenter, trainer, blogger, and author, with two industry reports published by Ark Group, the most recent of which is The Data Strategy and Governance Toolkit.

You can also follow Daragh O Brien on Twitter and connect with him on LinkedIn.



Related OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • Data Quality and Big Data — Guest Tom Redman (aka the “Data Doc”) discusses Data Quality and Big Data, including whether data quality matters less in larger data sets, and whether statistical outliers represent business insights or data quality issues.
  • Data Governance Star Wars — Special Guests Rob Karel and Gwen Thomas joined this extended, Star Wars-themed discussion about how to balance bureaucracy and business agility during the execution of data governance programs.
  • Social Media Strategy — Guest Crysta Anderson of IBM Initiate explains social media strategy and content marketing, including three recommended practices: (1) Listen intently, (2) Communicate succinctly, and (3) Have fun.
  • The Fall Back Recap Show — A look back at the Best of OCDQ Radio, including discussions about Data, Information, Business-IT Collaboration, Change Management, Big Analytics, Data Governance, and the Data Revolution.

Turning Data Silos into Glass Houses

Data silos are often denounced as inherently bad because they complicate the coordination of enterprise-wide business activities, yet they are also often used to support some of those same activities, so whether data silos are good or bad is a matter of perspective.  For example, data silos are bad when different business units redundantly store and maintain their own private copies of the same data, but they are good when they are used to protect sensitive data that should not be shared.

Providing the organization with a single system of record, a single version of the truth, a single view, a golden copy, or a consolidated repository of trusted data has long been the anti-data-silo siren song of enterprise data warehousing (EDW), and more recently, of master data management (MDM).  Although these initiatives can provide significant business value, somewhat ironically, many data silos start with EDW or MDM data that was replicated and customized in order to satisfy the particular needs of an operational project or tactical initiative.  This customized data either becomes obsolete after the conclusion of its project or initiative — or it continues to be used because it is satisfying a business need that EDW and MDM are not.

One of the early goals of a new data governance program should be to provide the organization with a substantially improved view of how it is using its data — including data silos — to support its operational, tactical, and strategic business activities.

Data governance can help the organization catalog existing data sources, build a matrix of data usage and its related business processes and technology, identify potential external reference sources to use for data enrichment, and define metrics that meaningfully measure data quality in business-relevant terminology.
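To make the last of those activities concrete, here is a minimal sketch of what business-relevant data quality metrics might look like in code.  The customer records, field names, and email pattern below are illustrative assumptions, not part of any specific governance framework.

```python
# Illustrative sketch: measuring data quality with simple, business-relevant
# metrics (completeness and validity).  All records and rules are hypothetical.

import re

customer_records = [
    {"name": "Ann Murphy", "email": "ann@example.com", "country": "IE"},
    {"name": "Liam Byrne", "email": "", "country": "IE"},
    {"name": "", "email": "sean[at]example.com", "country": ""},
]

# A deliberately basic pattern -- real email validation is more involved.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def completeness(records, field):
    """Share of records where the field is populated."""
    filled = sum(1 for r in records if r.get(field, "").strip())
    return filled / len(records)

def email_validity(records):
    """Share of populated email values that match the basic pattern."""
    emails = [r["email"] for r in records if r.get("email", "").strip()]
    if not emails:
        return 0.0
    valid = sum(1 for e in emails if EMAIL_PATTERN.match(e))
    return valid / len(emails)

print(f"name completeness:  {completeness(customer_records, 'name'):.0%}")
print(f"email completeness: {completeness(customer_records, 'email'):.0%}")
print(f"email validity:     {email_validity(customer_records):.0%}")
```

The point of metrics like these is less the arithmetic than the vocabulary: “67% of customer records have a usable email address” is a statement the business can act on.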

The transparency provided by this combined analysis of the existing data, business, and technology landscape gives a more comprehensive overview of enterprise data management problems, helping the organization better evaluate existing data and technology re-use and redundancies, as well as whether investing in new technology will be necessary.

Data governance can help topple data silos by first turning them into glass houses through transparency, empowering the organization to start throwing stones at those glass houses that must be eliminated.  And when data silos are allowed to persist, they should remain glass houses, clearly illustrating whether they have business-justified reasons for continued use.


Related Posts

Data and Process Transparency

The Good Data

The Data Outhouse

Time Silos

Sharing Data

Single Version of the Truth

Beyond a “Single Version of the Truth”

The Quest for the Golden Copy

The Idea of Order in Data

Hell is other people’s data

Data and Process Transparency

Illustration via the SlideShare presentation: The Social Intranet

How do you know if you have poor data quality?

How do you know what your business processes and technology are doing to your data?

Waiting for poor data quality to reveal itself is like waiting until the bread pops up to see if you burnt your toast, at which point it is too late to save the bread—after all, it’s not like you can reactively cleanse the burnt toast.

Extending the analogy, let’s imagine that the business process is toasting, the technology is the toaster, and the data is the toast, which is being prepared for an end user.  (We could also imagine that the data is the bread and information is the toast.)

A more proactive approach to data quality begins with data and process transparency, which can help you monitor the quality of your data in much the same way as a transparent toaster could help you monitor your bread during the toasting process.

Performing data profiling and data quality assessments can provide insight into the quality of your data, but these efforts must include identifying the related business processes, technology, and end users of the data being analyzed.
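As a minimal illustration of the kind of column-level profiling described above, the sketch below summarizes each field’s fill rate, distinct values, and value frequencies.  The product rows and field names are hypothetical assumptions for the example.

```python
# Illustrative sketch of column-level data profiling.  Even a tiny profile can
# surface problems: duplicate identifiers, inconsistent casing, missing values.

from collections import Counter

rows = [
    {"product_id": "A100", "price": "19.99", "status": "active"},
    {"product_id": "A101", "price": "",      "status": "Active"},
    {"product_id": "A100", "price": "24.50", "status": "retired"},
]

def profile_column(rows, field):
    """Summarize one column: fill rate, distinct count, value frequencies."""
    values = [r.get(field, "") for r in rows]
    populated = [v for v in values if v.strip()]
    return {
        "fill_rate": len(populated) / len(values),
        "distinct": len(set(populated)),
        "frequencies": Counter(populated),
    }

for field in ("product_id", "price", "status"):
    p = profile_column(rows, field)
    print(field, p["fill_rate"], p["distinct"], p["frequencies"].most_common(1))
```

Notice how even this toy profile hints at issues a human would want to investigate: “A100” appears twice in a supposedly unique identifier column, and “active” versus “Active” suggests inconsistent data entry in the status field — which is exactly why the profile must be shared with the people who know the related business processes.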

However, the most important aspect is to openly share this preliminary analysis of the data, business, and technology landscape since it provides detailed insights about potential problems, which helps the organization better evaluate possible solutions.

Data and process transparency must also be maintained as improvement initiatives are implemented.  Regularly repeating the cycle of analysis and publishing its findings provides a feedback loop for tracking progress and keeping everyone informed.
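One simple way to make that feedback loop tangible is to keep a running history of each assessment cycle’s metrics, so progress (or regression) is visible to everyone.  The dates and metric values below are hypothetical assumptions.

```python
# Illustrative sketch of the repeat-and-publish feedback loop: re-run the same
# quality checks on each cycle and keep the history so trends can be tracked.

from datetime import date

history = []  # one entry per assessment cycle

def record_assessment(run_date, metrics):
    """Append one cycle's findings so trends can be published and reviewed."""
    history.append({"date": run_date, **metrics})

# Hypothetical results from two assessment cycles.
record_assessment(date(2012, 1, 15), {"email_completeness": 0.67})
record_assessment(date(2012, 4, 15), {"email_completeness": 0.81})

def trend(metric):
    """Change in a metric between the first and latest cycles."""
    return history[-1][metric] - history[0][metric]

print(f"email_completeness changed by {trend('email_completeness'):+.0%}")
```

Publishing the trend, not just the latest snapshot, is what keeps the glass house transparent over time.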

The downside of transparency is that it can reveal how bad things are, but without this awareness, improvement is not possible.


Related Posts

Video: Oh, the Data You’ll Show!

Finding Data Quality

Why isn’t our data quality worse?

Days Without A Data Quality Issue

The Diffusion of Data Governance

Adventures in Data Profiling (Part 8)

Schrödinger’s Data Quality

Data Gazers