Saving Private Data

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

This episode is an edited rebroadcast of a segment from the OCDQ Radio 2011 Year in Review, during which Daragh O Brien and I discuss the data privacy and data protection implications of social media, cloud computing, and big data.

Daragh O Brien is one of Ireland’s leading Information Quality and Governance practitioners.  After being born at a young age, Daragh has amassed a wealth of experience in quality information driven business change, from CRM Single View of Customer to Regulatory Compliance, to Governance and the taming of information assets to benefit the bottom line, manage risk, and ensure customer satisfaction.  Daragh O Brien is the Managing Director of Castlebridge Associates, one of Ireland’s leading consulting and training companies in the information quality and information governance space.

Daragh O Brien is a founding member and former Director of Publicity for the IAIDQ, which he is still actively involved with.  He was a member of the team that helped develop the Information Quality Certified Professional (IQCP) certification and he recently became the first person in Ireland to achieve this prestigious certification.

In 2008, Daragh O Brien was awarded a Fellowship of the Irish Computer Society for his work in developing and promoting standards of professionalism in Information Management and Governance.

Daragh O Brien is a regular conference presenter, trainer, blogger, and author with two industry reports published by Ark Group, the most recent of which is The Data Strategy and Governance Toolkit.

You can also follow Daragh O Brien on Twitter and connect with him on LinkedIn.




Related OCDQ Radio Episodes

Clicking on the link will take you to the episode’s blog post:

  • Data Quality and Big Data — Guest Tom Redman (aka the “Data Doc”) discusses Data Quality and Big Data, including if data quality matters less in larger data sets, and if statistical outliers represent business insights or data quality issues.
  • Data Governance Star Wars — Special Guests Rob Karel and Gwen Thomas joined this extended, and Star Wars themed, discussion about how to balance bureaucracy and business agility during the execution of data governance programs.
  • Social Media Strategy — Guest Crysta Anderson of IBM Initiate explains social media strategy and content marketing, including three recommended practices: (1) Listen intently, (2) Communicate succinctly, and (3) Have fun.
  • The Fall Back Recap Show — A look back at the Best of OCDQ Radio, including discussions about Data, Information, Business-IT Collaboration, Change Management, Big Analytics, Data Governance, and the Data Revolution.

So Long 2011, and Thanks for All the . . .


Don’t Panic!  Welcome to the mostly harmless OCDQ Radio 2011 Year in Review episode.  During this approximately 42-minute episode, I recap the data-related highlights of 2011 in a series of sometimes serious, sometimes funny segments, and make wacky and wildly inaccurate data-related predictions about 2012.

Special thanks to my guests Jarrett Goldfedder, who discusses Big Data, Nicola Askham, who discusses Data Governance, and Daragh O Brien, who discusses Data Privacy.  Additional thanks to Rich Murnane and Dylan Jones.  And Deep Thanks to that frood Douglas Adams, who always knew where his towel was, and who wrote The Hitchhiker’s Guide to the Galaxy.



The Data Cold War

One of the many things I love about Twitter is its ability to spark ideas via real-time conversations.  For example, while live-tweeting during last week’s episode of DM Radio, the topic of which was how to get started with data governance, I tweeted about the data silo challenges and corporate cultural obstacles being discussed.

I tweeted that data is an asset only if it is a shared asset, across the silos, across the corporate culture, and that, in order to be successful with data governance, organizations must replace the mantra “my private knowledge is my power” with “our shared knowledge empowers us all.”

“That’s very socialist thinking,” Mark Madsen responded.  “Soon we’ll be having arguments about capitalizing over socializing our data.”

To which I responded that the more socialized data is, the more capitalized data can become . . . just ask Google.

“Oh no,” Mark humorously replied, “decades of political rhetoric about socialism to be ruined by a discussion of data!”  And I quipped that discussions about data have been accused of worse, and decades of data rhetoric certainly haven’t proven very helpful in corporate politics.


Later, while ruminating on this light-hearted exchange, I wondered if we actually are in the midst of the Data Cold War.


The Data Cold War

The Cold War, which lasted approximately from 1946 to 1991, was the political, military, and economic competition between the Communist world, primarily the former Soviet Union, and the Western world, primarily the United States.  One of the major tenets of the Cold War was the conflicting ideologies of socialism and capitalism.

In enterprise data management, one of the most debated ideologies is whether or not data should be viewed as a corporate asset, especially by the for-profit corporations of capitalism, which was (even before the Cold War began), and will likely forever remain, the world’s dominant economic model.

My earlier remark that data is an asset only if it is a shared asset, across the silos, across the corporate culture, is indicative of the bounded socialist view of enterprise data.  In other words, almost no one in the enterprise data management space is suggesting that data should be shared beyond the boundary of the organization.  In this sense, advocates, including myself, of data governance are advocating socializing data within the enterprise so that data can be better capitalized as a true corporate asset.

This mindset makes sense because sharing data with the world, especially for free, couldn’t possibly be profitable — or could it?


The Master Data Management Magic Trick

The genius (and some justifiably ponder if it’s evil genius) of companies like Google and Facebook is they realized how to make money in a free world — by which I mean the world of Free: The Future of a Radical Price, the 2009 book by Chris Anderson.

By encouraging their users to freely share their own personal data, Google and Facebook ingeniously answer what David Loshin calls the most dangerous question in data management: What is the definition of customer?

How do Google and Facebook answer the most dangerous question?

A customer is a product.

This is the first step that begins what I call the Master Data Management Magic Trick.

Instead of trying to manage the troublesome master data domain of customer and link it, through sales transaction data, to the master data domain of product (products, by the way, have always been undeniably accepted as a corporate asset even though product data has not been), Google and Facebook simply eliminate the need for customers (and, by extension, eliminate the need for customer service because, since their product is free, it has no customers) by transforming what would otherwise be customers into the very product that they sell — and, in fact, the only “real” product that they have.

And since what their users perceive as their product is virtual (i.e., entirely Internet-based), it’s not really a product, but instead a free service, which can be discontinued at any time.  And if it were, who would you complain to?  And on what basis?

After all, you never paid for anything.

This is the second step that completes the Master Data Management Magic Trick — a product is a free service.

Therefore, Google and Facebook magically make both their customers and their products (i.e., master data) disappear, while simultaneously making billions of dollars (i.e., transaction data) appear in their corporate bank accounts.

(Yes, the personal data of their users is master data.  However, because it is used in an anonymized and aggregated format, it is not, nor does it need to be, managed like the master data we talk about in the enterprise data management industry.)


Google and Facebook have Capitalized Socialism

By “empowering” us with free services, Google and Facebook use the power of our own personal data against us — by selling it.

However, it’s important to note that they indirectly sell our personal data as anonymized and aggregated demographic data.

Although they do not directly sell our individually identifiable information (because, truthfully, it has very limited value, and almost no legal value, since selling it would amount to identity theft), Google and Facebook do occasionally get sued (mostly outside the United States) for violating data privacy and data protection laws.

However, it’s precisely because we freely give our personal data to them that, until or unless laws are changed to protect us from ourselves, it’s almost impossible to prove they are doing anything illegal (again, their undeniable genius is arguably evil genius).

Google and Facebook are the exact same kind of company — they are both Internet advertising agencies.

They both sell online advertising space to other companies, which are looking to demographically target prospective customers because those companies actually do view people as potential real customers for their own real products.

The irony is that if all of their users stopped using their free services, then not only would our personal data be more private and more secure, but the new revenue streams of Google and Facebook would eventually dry up because, specifically by design, they have neither real customers nor real products.  More precisely, their only real customers (other companies) would stop buying advertising from them because no one would ever see and (albeit, even now, only occasionally) click on their ads.

Essentially, companies like Google and Facebook are winning the Data Cold War because they have capitalized socialism.

In other words, the bottom line is Google and Facebook have socialized data in order to capitalize data as a true corporate asset.


Related Posts

Freemium is the future – and the future is now

The Age of the Platform

Amazon’s Data Management Brain

The Semantic Future of MDM

A Brave New Data World

Big Data and Big Analytics

A Farscape Analogy for Data Quality

Organizing For Data Quality

Sharing Data

Song of My Data

Data in the (Oscar) Wilde

The Most August Imagination

Once Upon a Time in the Data

The Idea of Order in Data

Hell is other people’s data

Tweet 2001: A Social Media Odyssey

HAL 9000: “I am putting myself to the fullest possible use, which is all I think that any conscious entity can ever hope to do.”

As I get closer and closer to my 2001st tweet on Twitter, I wanted to pause for some quiet reflection on my personal odyssey in social media – but then I decided to blog about it instead.


The Dawn of OCDQ

Except for LinkedIn, my epic drama of social media adventure and exploration started with my OCDQ blog.

In my Data Quality Pro article Blogging about Data Quality, I explained why I started this blog and discussed some of my thoughts on blogging.  Most importantly, I explained that I am neither a blogging expert nor a social media expert.

But now that I have been blogging and using social media for over six months, I feel more comfortable sharing my thoughts and personal experiences with social media without worrying about sounding like too much of an idiot (no promises, of course).



LinkedIn

My social media odyssey began in 2007 when I created my account on LinkedIn, which I admit, I initially viewed as just an online resume.  I put little effort into my profile, only made a few connections, and only joined a few groups.

Last year (motivated by the economic recession), I started using LinkedIn more extensively.  I updated my profile with a complete job history, asked my colleagues for recommendations, expanded my network with more connections, and joined more groups.  I also used LinkedIn applications (e.g. Reading List by Amazon and Blog Link) to further enhance my profile.

My favorite feature is the LinkedIn Groups, which not only provide an excellent opportunity to connect with other users, but also provide Discussions, News (including support for RSS feeds), and Job Postings.




Twitter

Shortly after launching my blog in March 2009, I created my Twitter account to help promote my blog content.  In blogging, content is king, but marketing is queen.  LinkedIn (via group news feeds) is my leading source of blog visitors from social media, but Twitter isn't far behind.

However, as Michele Goetz of Brain Vibe explained in her blog post Is Twitter an Effective Direct Marketing Tool?, Twitter has a click-through rate equivalent to direct mail.  Citing research from Pear Analytics, a “useful” tweet was found to have a shelf life of about one hour with about a 1% click-through rate on links.

In his blog post Is Twitter Killing Blogging?, Ajay Ohri of Decision Stats examined whether Twitter was a complement or a substitute for blogging.  I created a Data Quality on Twitter page on my blog in order to illustrate what I have found to be the complementary nature of tweeting and blogging. 

My ten blog posts receiving the most tweets (tracked using the Retweet Button from TweetMeme):

  1. The Nine Circles of Data Quality Hell – 13 Tweets
  2. Adventures in Data Profiling (Part 1) – 13 Tweets
  3. Fantasy League Data Quality – 12 Tweets
  4. Not So Strange Case of Dr. Technology and Mr. Business – 12 Tweets
  5. The Fragility of Knowledge – 11 Tweets
  6. The General Theory of Data Quality – 9 Tweets
  7. The Very True Fear of False Positives – 8 Tweets
  8. Data Governance and Data Quality – 8 Tweets
  9. Adventures in Data Profiling (Part 3) – 8 Tweets
  10. Data Quality: The Reality Show? – 7 Tweets

Most of my social networking is done using Twitter (with LinkedIn being a close second).  I have also found Twitter to be great for doing research, which I complement with RSS subscriptions to blogs.




Facebook

I also created my Facebook account shortly after launching my blog.  Although I almost exclusively use social media for professional purposes, I do use Facebook as a way to stay connected with family and friends.

I created a page for my blog to separate my professional and personal aspects of Facebook without the need to manage multiple accounts.  Additionally, this allows you to become a “fan” of my blog without requiring you to also become my “friend.”

A quick note on Facebook games, polls, and trivia: I do not play them.  With my obsessive-compulsive personality, I have to ignore them.  Therefore, please don't be offended if, for example, I have ignored your invitation to play Mafia Wars.



Additional Social Media Websites

Although LinkedIn, Twitter, and Facebook are my primary social media websites, I also have accounts on three of the most popular social bookmarking websites: Digg, StumbleUpon, and Delicious.

Social bookmarking can be a great promotional tool that can help blog content go viral.  However, niche content is almost impossible to get to go viral.  Data quality is not just a niche – if technology blogging were a Matryoshka (a.k.a. Russian nested) doll, then data quality would be the last, innermost doll.

This doesn't mean that data quality isn't an important subject – it just means that you will not see a blog post about data quality hitting the front pages of mainstream social bookmarking websites anytime soon.  Dylan Jones of Data Quality Pro created DQVote, which is a social bookmarking website dedicated to sharing data quality community content.

I also have an account on FriendFeed, which is an aggregator that can consolidate content from other social media websites, blogs, or anything providing an RSS feed.  My blog posts and my updates from other social media websites (except for Facebook) are automatically aggregated.  On Facebook, my personal page displays my FriendFeed content.


Social Media Tools and Services

Social media tools and services that I personally use (listed in no particular order):

  • Flock – The Social Web Browser Powered by Mozilla
  • TweetDeck – Connecting you with your contacts across Twitter, Facebook, MySpace and more
  • Digsby – Digsby = Instant Messaging (IM) + E-mail + Social Networks
  • – Update all of your social networks at once
  • HootSuite – The professional Twitter client
  • Twitterfeed – Feed your blog to Twitter
  • Google FeedBurner – Provide an e-mail subscription to your blog
  • TweetMeme – Add a Retweet Button to your blog
  • Squarespace Blog Platform – The secret behind exceptional websites


Social Media Strategy

As Darren Rowse of ProBlogger explained in his blog post How I use Social Media in My Blogging, Chris Brogan developed a social media strategy using the metaphor of a Home Base with Outposts.

“A home base,” explains Rowse, “is a place online that you own.”  For example, your home base could be your blog or your company's website.  “Outposts,” continues Rowse, “are places that you have an online presence out in other parts of the web that you might not own.”  For example, your outposts could be your LinkedIn, Twitter, and Facebook accounts.

According to Rowse, your Outposts will make your Home Base stronger by providing:

“Relationships, ideas, traffic, resources, partnerships, community and much more.”


Social Karma

An effective social media strategy is essential for both companies and individual professionals.  Using social media can help promote you, your expertise, your company and your products and services.

However, too many companies and individuals have a selfish social media strategy.

You should not use social media exclusively for self-promotion.  You should view social media as Social Karma.

If you can focus on helping others when you use social media, then you will get much more back than just a blog reader, a LinkedIn connection, a Twitter follower, a Facebook friend, or even a potential customer.

Yes, I use social media to promote myself and my blog content.  However, more than anything else, I use social media to listen, to learn, and to help others when I can.


Please Share Your Social Media Odyssey

As always, I am interested in hearing from you.  What have been your personal experiences with social media?