So Long 2009, and Thanks for All the . . .

Before I look ahead to the coming New Year and wonder what it may (or may not) bring, I wanted to pause, reflect, and share (in the following OCDQ Video) some of the many joys that 2009 brought me and for which I am thankful.

If you are having trouble viewing this video, then you can watch it on Vimeo by clicking on this link: OCDQ Video

 

Thank You

Thank you all—and I do mean every single one of you—thank you for everything.

Happy New Year!!!

Recently Read: December 21, 2009

Recently Read is an OCDQ regular segment.  Each entry provides links to blog posts, articles, books, and other material I found interesting enough to share.  Please note “recently read” is literal – therefore what I share wasn't necessarily recently published.

 

Data Quality

For simplicity, “Data Quality” also includes Data Governance, Master Data Management, and Business Intelligence.

  • Welcome to DQ Directions – In this blog post, Dylan Jones of Data Quality Pro formally announced the DQ Directions online conference, which will debut in Q2 2010, and will feature presentations from experts and industry thought leaders specializing in data quality, data governance, and master data management.

     

  • Ways to 'Communivate' your Data Issues – In her Purple Cow of a blog post, Jill Wanless (aka Sheezaredhead) explains that ‘Communivate’ is a combination of the words communicate and innovate, and it means to communicate in an innovative way, which she does regarding the importance of data quality.

     

  • ’Tis the Season for a Data Governance Carol – Part 1 and Part 2 – In his excellent two-part series, Rob Paller of Baseline Consulting uses a Dickensian framework to explain the importance of data governance and data quality – and the fact that there isn’t a simple framework to blindly follow for Data Governance.

     

  • The “Santa Intelligence” Team – An excellent Christmas-themed blog post from Paul Boal, in which we learn that Santa does indeed have a Business Intelligence team.

     

  • Data quality is for life not just for Christmas – In this Diary of a Marketing Insight Guy blog post, Simon Daniels reminds us data quality can be a gift that will keep on giving—if data quality management is built into the heart of an organization’s processes and operations.

     

  • Finding a home for MDM – In his second post on the DataFlux Community of Experts, Charles Blyth examines where master data management (MDM) fits within your overall enterprise architecture.

     

  • The Decade of Data: Seven Trends to Watch in 2010 – In his blog post on Informatica Perspectives, Joe McKendrick examines some up-and-coming trends that he predicts will shape the data management space in 2010.

     

  • Are we ready for all this data? – In his blog post, Rich Murnane uses some recent news stories to ponder whether even we experienced data geeks are really ready for the amount of data we're going to need to manage due to the unrelenting increases in data volumes.

 

Social Media

For simplicity, “Social Media” also includes Blogging, Writing, Social Networking, and Online Marketing.

 

Book Quotes

An eclectic list of quotes from some recently read (and/or simply my favorite) books.

  • From Crush It! by Gary Vaynerchuk – “Your business and your personal brand need to be one and the same...Your latest tweet and comment on Facebook and most recent blog post—that's your résumé now...It's a whole new world, build your personal brand and get ready for it.”

     

  • From A Whole New Mind by Daniel Pink – “Empathy is neither a deviation from intelligence nor the single route to it.  Sometimes we need detachment; many other times we need attachment.  The people who will thrive will be those who can toggle between the two.” 

     

  • From Connected by Nicholas Christakis and James Fowler – “Just as brains can do things that no single neuron can do, so can social networks do things that no single person can do...our connections to other people matter...most of all it is about what makes us uniquely human...To know who we are, we must understand how we are connected.”

Recently Read: November 28, 2009

Recently Read is an OCDQ regular segment.  Each entry provides links to blog posts, articles, books, and other material I found interesting enough to share.  Please note “recently read” is literal – therefore what I share wasn't necessarily recently published.

 

Data Quality Blog Posts

For simplicity, “Data Quality” also includes Data Governance, Master Data Management, and Business Intelligence.

 

Social Media Blog Posts

For simplicity, “Social Media” also includes Blogging, Social Networking, and Online Marketing.

 

Book Quotes

An eclectic list of quotes from some recently read (and/or simply my favorite) books.

  • From The Wisdom of Crowds by James Surowiecki – “Refuse to allow the merit of an idea to be determined by the status of the person advocating it.”

     

  • From Purple Cow by Seth Godin – “We mistakenly believe that criticism leads to failure.”

     

  • From How We Decide by Jonah Lehrer – “The best decision-makers don't despair.  Instead, they become students of error, determined to learn from what went wrong.”

     

  • From The Whuffie Factor by Tara Hunt – “Whuffie is the residual outcome—the currency—of your reputation.  You lose or gain it based on positive or negative actions, your contributions to the community, and what people think of you.”

     

  • From Trust Agents by Chris Brogan and Julien Smith – “You accrue social capital as a side benefit of doing good, but doing good by itself is its own reward.”

DQ-Tip: “Data quality is about more than just improving your data...”

Data Quality (DQ) Tips is an OCDQ regular segment.  Each DQ-Tip is a clear and concise data quality pearl of wisdom.

“Data quality is about more than just improving your data.

Ultimately, the goal is improving your organization.”

This DQ-Tip is from Tony Fisher's great book The Data Asset: How Smart Companies Govern Their Data for Business Success.

In the book, Fisher explains that one of the biggest mistakes organizations make is not viewing their data as a corporate asset.  This common misconception often prevents data quality from being rightfully viewed as a critical priority.

Data quality is misperceived to be an activity performed just for the sake of improving data when, in fact, it is an activity performed for the sake of improving business processes.

“Better data leads to better decisions,” explains Fisher, “which ultimately leads to better business.  Therefore, the very success of your organization is highly dependent on the quality of your data.”

 

Related Posts

DQ-Tip: “...Go talk with the people using the data”

DQ-Tip: “Data quality is primarily about context not accuracy...”

DQ-Tip: “Don't pass bad data on to the next person...”

Brevity is the Soul of Social Media

“Why day is day, night night, and time is time,
Were nothing but to waste night, day and time.
Therefore, since brevity is the soul of wit,
I will be brief ...”

Within the wide world of social media, one of the most common features is some form of social networking, microblogging, or short message service that allows users to share brief status updates.  Some social media sites are almost entirely built on only this feature (e.g., Twitter) whereas others (e.g., Facebook, LinkedIn) include it among a list of many other features.

Either way, these status updates have created a rather pithy platform many people argue is incompatible with meaningful communication, especially of a professional nature.  I must admit this was also my initial opinion of social media.

However, I now believe not only is it the soul of wit, brevity is the soul of social media – and, in fact, a very good soul.

 

Short Attention Span Theater

I doubt attention deficit will still be considered a disorder ten years from now.  We are living increasingly faster-paced lives in an increasingly faster-paced world.  The pervasiveness of the Internet and the rapid proliferation of powerful mobile technology are making our world a smaller and smaller place and our lives a more and more crowded space.

We have become so accustomed to multi-tasking that the very concept of focusing our attention on only one thing at a time somehow seems inherently wrong to us.  All the world's a stage within this short attention span theater.  And all of us are not merely players; we have been cast in several simultaneous roles.

Time management has always been important, but nowadays it is even more essential.  This is especially true when it comes to social media, which, if we can effectively and efficiently use it, has great personal and professional potential.  Amber Naslund recently provided an excellent blog series on social media time management that I highly recommend.

 

The Power of Pith

I admit I am a long-winded talker or, as a favorite (canceled) television show would say, “conversationally anal-retentive.”  In the past (slightly less now), I was also known for e-mail messages even Leo Tolstoy would declare to be far too long.

Therefore, it may be surprising to learn I am addicted to Twitter.  How could I possibly constrain myself to only 140 characters?  No, I don't use ellipses to extend my thoughts across multiple tweets (although I admit I am often tempted to do so). 

I wholeheartedly agree with Jennifer Blanchard, who explained how Twitter makes you a better writer.  When forced to be concise, you have to focus on exactly what you want to say, using as few words as possible. 

The power of pith means reducing your message to its bare essence.  In order to engage in effective dialogue on the stage of our short attention span theater, this is a required skill we all must master – and not just when we are on Twitter.

For those who argue this simply regresses human communication back to our days of monosyllabic grunting, I invite you to read the excellent recent blog post Is Twitter a Complex Adaptive System? written by Venessa Miemis.

Although you should read all of it, the point I need here will be found under Insight #4 toward the end of the post.  Miemis shares a study that reveals using Twitter can not only improve communication, but actually build intelligence. 

The collaborative communication enabled by social media platforms can actually contribute to a growing collective intelligence made up of all of us.  The power of pith is the wisdom of crowds.

 

Blogging with Brevity

Brevity is the soul of all social media and yes, this includes blogging as well.  Some view blogging as social media's last bastion of robust communication.  You can take your time and use all the words you want on your blog, right?  Sure, as long as you have no interest in anyone actually reading your blog.

Some bloggers get cranky with me when I emphasize the Three C’s – meaning your blog posts should be:

  1. Clear – Get to the point and stay on point
  2. Concise – No longer than necessary
  3. Consumable – Formatted to be easily read on a computer screen

Concise is usually the main cranky-causing culprit because everyone interprets it to mean “write really short posts.”

One blogger told me he has “never met a subordinate clause he didn't like,” thereby expressing his fondness for writing compound-complex sentences.  For the non-writers, this means really long (but grammatically correct) sentences oftentimes requiring you to read them three or four times before truly comprehending their full meaning.

Don't get me wrong.  This particular blogger is an incredibly gifted writer known for his absolutely brilliant blog posts.  My only true criticism of his writing style is it truly requires a significant time commitment.

Michelle Russell does a great job explaining how to write with a knife.  No, not literally.  Writing with a knife means writing for yourself, but editing for your readers.  Editing is the hardest part of writing, but also the most important. 

Blogging with brevity doesn't necessarily mean “write really short posts.”  Being concise simply means taking out anything that doesn't need to be included.  For example, you really didn't need to read the additional jokes and Shakespearean references included in the first draft of this post.

 

The Future of Brevity is Bright

Some predict the size limits of message service standards and status updates will be increased.  Others predict new social media platforms will be based on different paradigms.  Either way, innovation will eventually deliver an ability to be more verbose.

However, barring some major scientific breakthrough (or some major breakdown in the space-time continuum), there will still only be 24 hours in a day.  Therefore, no matter what happens, I am certain the future of brevity is bright.

Neither the world nor the people in it are likely to slow down.  Our attention spans will remain short.  Our time management skills will remain vigilant.  We will communicate through the power of pith, brevity will remain the soul of both wit and social media, and hopefully, we will all “live long and prosper.”

 

Related Posts

The Mullet Blogging Manifesto

Collablogaunity

Podcast: Your Blog, Your Voice

Collablogaunity

The meteoric rise of the Internet coupled with social media has created an amazing medium that is enabling people who are separated by vast distances and disparate cultures to come together, communicate, and collaborate in ways few would have thought possible just a few decades ago.  Blogging, especially when effectively integrated with social networking, can be one of the most powerful aspects of social media.

The great advantage to blogging as a medium, as opposed to books, newspapers, magazines, and even presentations, is that blogging is not just about broadcasting a message. 

This is not to say that books, newspapers, and magazines aren't useful (they certainly can be) or that presentations lack an interactive component (they certainly should not).  I simply believe that, when done well, blogging better facilitates effective communication by starting a conversation, encouraging collaboration, and fostering a true sense of community.

Mashing together the words collaboration, blog, and community, I use the term collablogaunity — which is pronounced “Call a Blog a Unity” — to describe how remarkable blogs do this remarkably well.

 

Conversation

Blogging is a conversation — with your readers. 

I love the sound of my own voice and I talk to myself all the time (even in public).  However, the two-way conversation that blogging provides via comments from my readers greatly improves the quality of my blog content —  because it helps me better appreciate the difference between what I know and what I only think I know.

Without comments, the conversation is only one way.  Engaging readers in dialogue and discussion allows some of your points to be made for you by those who take the time to comment as opposed to you just telling everyone how you see the world.

Blogging isn't about using the Internet as your own personal bullhorn for broadcasting your message.  In her wonderful book The Whuffie Factor, Tara Hunt explains that you really need to:

“Turn the bullhorn around: stop talking, start listening, and create continuous conversations.”

Respond to the comments you receive (but never feed the troll).  You don't have to respond immediately.  Sometimes, the conversation will go more smoothly without your involvement as your readers talk amongst themselves.  Other times, your response will help continue the conversation and encourage participation from others. 

Always demonstrate that feedback is both welcome and appreciated.  Make sure to never talk down to your readers (either in your blog post or your comment responses).  It is perfectly fine to disagree and debate, just don't denigrate.  

In a recent guest post on ProBlogger, Rob McPhillips explained: 

“If instead, you are all the time only seeking praise and approval from everyone, then there is nothing solid, consistent or certain about your blog and so ultimately it will never gather a sizeable core of die hard fans.  Only drive by readers who scan a post and never look back.” 

Collaboration

Blogging is a collaboration — with other bloggers.

While conversation is primarily between you and your readers, collaboration is primarily between you and other bloggers.  Although you may be inclined to view other bloggers as “the competition,” especially those within your own niche, this would be a mistake.  Yes, it is true that blogs are competing with each other for readers.  However, sustainable success is achieved through collaboration and friendly competition with your peers.

Brian Clark has explained in the past and continues to exemplify that strategic collaboration is the secret to 21st century success.  Clark has stated that if he had to reduce his recipe for success to just three ingredients, it would be content, copywriting, and collaboration.  And if he had to give up two of those, then he'd keep collaboration.

In their terrific book Trust Agents, Chris Brogan and Julien Smith explain that although people in most cultures view themselves as the central hero in their life's story, the reality is that you need to build an army because you can't do it all alone.

Collaboration between bloggers is mainly about networking and cross-promotion.  You should network with other bloggers, especially those within your own niche.  This can be accomplished a number of ways including e-mail introductions, Twitter direct messages (if the other blogger is following you), LinkedIn connection requests, or Facebook friend requests.

As with any networking, the most important thing is being genuine.  As Darren Rowse and Chris Garrett explained in their highly recommended ProBlogger book, when you network with other bloggers, keep it real, be specific, keep it brief without being rude, and explain why you are interested in connecting.  They rightfully emphasize the importance of that last point.

As we all know, although content may be king, marketing is queen.  Networking with other bloggers can help you get the word out about your brilliant blog and its penchant for publishing posts that everyone must read.  Adding other bloggers to your blogroll, linking to their posts when applicable to your content, and leaving meaningful comments on their posts are not only recommended best practices of netiquette, they are also just the right thing to do.

Too many bloggers have a selfish networking and marketing strategy.  They only promote their own content and then wonder why nobody reads their blog.  I am fond of referring to all social media as Social Karma.  Focus on helping other bloggers promote their content and they will likely be more willing to return the favor.  However, don't misunderstand this technique to be a pathetic peer pressure tactic (in other words, “I re-tweeted your blog post, why didn't you re-tweet my blog post?”).

One last point on collaboration is to set realistic expectations — for others and for yourself.  You should definitely try to help others when you can.  However, you simply can't help everyone.  Don't let people take advantage of your generosity. 

Politely, but firmly, say no when you need to say no.  Also extend the same courtesy to other people when they turn you down (or simply ignore you) when you try to connect with them or when you ask them for their help. 

Mean and selfish people definitely suck.  But let's face it, nobody's perfect — we all have bad days, we all occasionally say and do stupid things, and we all occasionally treat people worse than they deserve to be treated.  So don't be too hard on people when they disappoint you, because tomorrow it will probably be your turn to have a bad day.

 

Community

Blogging is a community service.

If you truly believe and actually practice the principles of both conversation and collaboration, then viewing blogging as a community service comes naturally.  You will truly be more interested in actually listening to what your readers have to say, and less interested in just broadcasting your message.  You will see your words as simply the catalyst that gets the conversation started, and when necessary, helps continue the discussion. 

You will see friends not foes when encountering your blogging peers.  You will help them celebrate their successes and quickly recover from their failures.  You will help others when you can and without worrying about what's in it for you.

As James Chartrand says, you will welcome people to your blog because you view blogging as a festival of people, a community strengthened by people, where everyone can speak up with great care and attention, sharing thoughts and views while openly accepting differing opinions.  Blogging is a community service providing a wealth of experience, thoughts and knowledge being shared by all sorts of participants.

In the closing keynote of this year's BlogWorld conference, Chris Brogan explained (from notes taken by David B. Thomas):

“Make it about them.  Stop looking at this as a cult of me. 

It has to be about your audience.  Turn them into a community. 

The difference between an audience and a community is the way you face the chairs. 

The difference between an audience and a community:

One will fall on its sword for you and the other will watch you fall.”

Collablogaunity

Pronounced: “Call a Blog a Unity”

There are literally millions of blogs on the Internet today.  Your blog (to quote Seth Godin) is “either remarkable or invisible.”

Remarkable blogs primarily do three things:

  1. Start conversations
  2. Encourage collaboration
  3. Foster a true sense of community

Remarkable blogs are collablogaunities.  Is your blog a collablogaunity?

 

Related Posts

The Mullet Blogging Manifesto

Brevity is the Soul of Social Media

Podcast: Your Blog, Your Voice

Beyond a “Single Version of the Truth”

This post is involved in a good-natured contest (i.e., a blog-bout) with two additional bloggers: Henrik Liliendahl Sørensen and Charles Blyth.  Our contest is a Blogging Olympics of sorts, with the United States, Denmark, and England competing for the Gold, Silver, and Bronze medals in an event we are calling “Three Single Versions of a Shared Version of the Truth.” 

Please take the time to read all three posts and then vote for who you think has won the debate (see poll below).  Thanks!

 

The “Point of View” Paradox

In the early 20th century, within his Special Theory of Relativity, Albert Einstein introduced the concept that space and time are interrelated entities forming a single continuum, and therefore the passage of time can vary for each individual observer.

One of the many brilliant insights of special relativity was that it could explain why different observers can make validly different observations – it was a scientifically justifiable matter of perspective. 

It was Einstein's apprentice, Obi-Wan Kenobi (to whom Albert explained “Gravity will be with you, always”), who stated:

“You're going to find that many of the truths we cling to depend greatly on our own point of view.”

The Data-Information Continuum

In the early 21st century, within his popular blog post The Data-Information Continuum, Jim Harris introduced the concept that data and information are interrelated entities forming a single continuum, and that speaking of oneself in the third person is the path to the dark side.

I use the Dragnet definition for data – it is “just the facts” collected as an abstract description of the real-world entities that the enterprise does business with (e.g., customers, vendors, suppliers).

Although a common definition for data quality is fitness for the purpose of use, the common challenge is that data has multiple uses – each with its own fitness requirements.  Viewing each intended use as the information that is derived from data, I define information as data in use or data in action.

Quality within the Data-Information Continuum has both objective and subjective dimensions.  Data's quality is objectively measured separate from its many uses, while information's quality is subjectively measured according to its specific use.

 

Objective Data Quality

Data quality standards provide a highest common denominator to be used by all business units throughout the enterprise as an objective data foundation for their operational, tactical, and strategic initiatives. 

In order to lay this foundation, raw data is extracted directly from its sources, profiled, analyzed, transformed, cleansed, documented and monitored by data quality processes designed to provide and maintain universal data sources for the enterprise's information needs. 

At this phase of the architecture, the manipulations of raw data must be limited to objective standards and not be customized for any subjective use.  From this perspective, data is now fit to serve (as at least the basis for) each and every purpose.

 

Subjective Information Quality

Information quality standards (starting from the objective data foundation) are customized to meet the subjective needs of each business unit and initiative.  This approach leverages a consistent enterprise understanding of data while also providing the information necessary for day-to-day operations.

But please understand: customization should not be performed simply for the sake of it.  You must always define your information quality standards by using the enterprise-wide data quality standards as your initial framework. 

Whenever possible, enterprise-wide standards should be enforced without customization.  The key word within the phrase “subjective information quality standards” is standards — as opposed to subjective, which can quite often be misinterpreted as “you can do whatever you want.”  Yes you can – just as long as you have justifiable business reasons for doing so.

This approach to implementing information quality standards has three primary advantages.  First, it reinforces a consistent understanding and usage of data throughout the enterprise.  Second, it requires each business unit and initiative to clearly explain exactly how they are using data differently from the rest of your organization, and more important, justify why.  Finally, all deviations from enterprise-wide data quality standards will be fully documented. 
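As a rough illustration only, the three advantages above could be sketched in Python.  Everything in this sketch is hypothetical: the rule names, the `Deviation` record, and the `resolve_standards` function are assumptions made for the example, not part of any real tool.  The enterprise-wide data quality standards serve as the initial framework, a business unit may customize a rule only by stating a business justification, and every deviation is documented.

```python
from dataclasses import dataclass

# Hypothetical enterprise-wide (objective) data quality standards.
ENTERPRISE_STANDARDS = {
    "postal_code_format": "5-digit",
    "customer_name_case": "upper",
    "duplicate_match_threshold": 0.90,
}


@dataclass(frozen=True)
class Deviation:
    """One documented departure from an enterprise-wide standard."""
    business_unit: str
    rule: str
    enterprise_value: object
    custom_value: object
    justification: str


def resolve_standards(business_unit, overrides):
    """Return (effective standards, documented deviations) for a business unit.

    `overrides` maps a rule name to a (custom_value, justification) pair.
    """
    effective = dict(ENTERPRISE_STANDARDS)  # start from the shared foundation
    deviations = []
    for rule, (custom_value, justification) in overrides.items():
        if rule not in ENTERPRISE_STANDARDS:
            raise KeyError(f"unknown enterprise standard: {rule}")
        if not justification.strip():
            raise ValueError(f"{business_unit} must justify customizing {rule}")
        if custom_value == ENTERPRISE_STANDARDS[rule]:
            continue  # same as the enterprise standard, nothing to document
        effective[rule] = custom_value
        deviations.append(
            Deviation(business_unit, rule, ENTERPRISE_STANDARDS[rule],
                      custom_value, justification)
        )
    return effective, deviations


# Example: a hypothetical Marketing unit loosens one rule and says why.
marketing_standards, marketing_deviations = resolve_standards(
    "Marketing",
    {"duplicate_match_threshold": (0.80, "Campaign lists tolerate looser matching")},
)
```

A business unit that supplies no overrides simply inherits the enterprise-wide standards unchanged, which mirrors the point that standards should be enforced without customization whenever possible.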

 

The “One Lie Strategy”

A common objection to separating quality standards into objective data quality and subjective information quality is the enterprise's significant interest in creating what is commonly referred to as a “Single Version of the Truth.”

However, in his excellent book Data Driven: Profiting from Your Most Important Business Asset, Thomas Redman explains:

“A fiendishly attractive concept is...'a single version of the truth'...the logic is compelling...unfortunately, there is no single version of the truth. 

For all important data, there are...too many uses, too many viewpoints, and too much nuance for a single version to have any hope of success. 

This does not imply malfeasance on anyone's part; it is simply a fact of life. 

Getting everyone to work from a single version of the truth may be a noble goal, but it is better to call this the 'one lie strategy' than anything resembling truth.”

Beyond a “Single Version of the Truth”

In the classic 1985 film Mad Max Beyond Thunderdome, the title character arrives in Bartertown, ruled by the evil Auntie Entity, where people living in the post-apocalyptic Australian outback go to trade for food, water, weapons, and supplies.  Auntie Entity forces Mad Max to fight her rival Master Blaster to the death within a gladiator-like arena known as Thunderdome, which is governed by one simple rule:

“Two men enter, one man leaves.”

I have always struggled with the concept of creating a “Single Version of the Truth.”  I imagine all of the key stakeholders from throughout the enterprise arriving in Corporatetown, ruled by the Machiavellian CEO known only as Veritas, where all business units and initiatives must go to request funding, staffing, and continued employment.  Veritas forces all of them to fight their Master Data Management rivals within a gladiator-like arena known as Meetingdome, which is governed by one simple rule:

“Many versions of the truth enter, a Single Version of the Truth leaves.”

For any attempted “version of the truth” to truly be successfully implemented within your organization, it must take into account both the objective and subjective dimensions of quality within the Data-Information Continuum. 

Both aspects of this shared perspective of quality must be incorporated into a “Shared Version of the Truth” that enforces a consistent enterprise understanding of data, but that also provides the information necessary to support day-to-day operations.

The Data-Information Continuum is governed by one simple rule:

“All validly different points of view must be allowed to enter,

In order for an all encompassing Shared Version of the Truth to be achieved.”

 

You are the Judge

This post is involved in a good-natured contest (i.e., a blog-bout) with two additional bloggers: Henrik Liliendahl Sørensen and Charles Blyth.  Our contest is a Blogging Olympics of sorts, with the United States, Denmark, and England competing for the Gold, Silver, and Bronze medals in an event we are calling “Three Single Versions of a Shared Version of the Truth.” 

Please take the time to read all three posts and then vote for who you think has won the debate.  A link to the same poll is provided on all three blogs.  Therefore, wherever you choose to cast your vote, you will be able to view an accurate tally of the current totals. 

The poll will remain open for one week, closing at midnight on November 19 so that the “medal ceremony” can be conducted via Twitter on Friday, November 20.  Additionally, please share your thoughts and perspectives on this debate by posting a comment below.  Your comment may be copied (with full attribution) into the comments section of all of the blogs involved in this debate.

 

Related Posts

Poor Data Quality is a Virus

The General Theory of Data Quality

The Data-Information Continuum

The Mullet Blogging Manifesto

Blogging is more art than science.  My personal blogging style can perhaps best be described as mullet blogging.  No, not the “business in the front, party in the back” haircut that I tried to rock back in the '80s (I couldn't pull it off, had to settle for a “tail” and had to cut that off because it made me look like an idiot – OK, more idiotic than usual).  By mullet blogging I mean:

“Take yourself and your blog seriously, but still have a sense of humor about both.”

As a mullet blogger, I hold the following truths to be self-evident, but I decided to write them down anyway.

 

Blogging is All about You

Not you meaning me, the blogger — you meaning you, the reader.

Blogging should always focus on the reader and provide them assistance with a specific problem, even if that problem is boredom or simply a need for entertainment.  Don't worry about your readers agreeing with you.  They will either thank you for your help or tell you that you're an idiot – either way, you have started a conversation, which should always be your blogging goal.

Brian Clark recently shared something to think about using the following quote from Robert McKee:

“When talented people write badly it’s generally for one of two reasons:

Either they’re blinded by an idea they feel compelled to prove,

Or they’re driven by an emotion they must express.

When talented people write well, it is generally for this reason:

They’re moved by a desire to touch the audience.”

B = U²C³

Blogging = Unique and Useful content that is Clear, Concise, and Consumable.

The conventional blogging wisdom is to be both Unique and Useful.  Although I normally like to defy conventions, I have to agree with the wise ones on these fundamentals.

One of the most important aspects of being unique is writing effective titles.  Most potential readers scan titles to determine whether or not they will click and read more.  There is obviously a delicate balance between effective titles and “baiting,” which will only alienate potential readers. 

If you write a compelling title that makes me click through to an interesting post, then “You Rock!”  However, if you write a “Shock and Awe” title followed by “Aw Shucks” content, then “You Suck!” 

Therefore, your content also has to be unique – your topic, position, voice, or a combination of all three.

One of the most important aspects of being useful is “infotainment” – that combination of information and entertainment that, when done well, can turn potential readers into raving fans.  Just don't forget about the previous section – your content has to be informative and entertaining to your readers.

The key to good blogging is to follow the Three C’s – Clear, Concise, Consumable

The attention span of a blog reader is not the same as a reader of books, newspapers (they still exist, right?), magazine articles, or the audience for presentations.  Most people only scan blogs, rarely read a full post and even more rarely leave a comment – regardless of how well the blog post is written. 

Write blog posts that get to the point and stay on point (i.e., clear), are no longer than they need to be (i.e., concise), and are formatted to be easy to read on a computer screen (i.e., consumable).

 

Laugh, Think, Comment

The three things that you want your readers to do.

Although it is not as blatantly formulaic as the title of the previous section, here is another method to my blogging madness:

  1. Open with a joke
  2. Say something thought provoking
  3. End with a call to action

It's as easy as 1-2-3!  In my defense, I didn't say open with a good joke.  But seriously, humor can be a great way to start a conversation and hold your readers' attention for those few precious additional seconds while you are getting to your point.  Obviously, there will be times when the seriousness of your subject would make comedy inappropriate, and if you are not naturally inclined to use humor, then you shouldn't try to force it.

Thought provoking content doesn't have to mean deep thoughts.  There is no need to channel Jean-Paul Sartre, for example.  However, to paraphrase Sartre: “Hell is other people's boring blogs.”

Obviously, comments are not the only type of call to action.  However, blogging is a conversation facilitated by the dialogue and discussion provided via comments from your readers.  Without comments, the conversation is only one way. 

I love the sound of my own voice and I talk to myself all the time (even in public).  However, the two-way conversation provided via comments not only greatly improves the quality of my blog content — much more importantly, it helps me better appreciate the difference between what I know and what I only think I know.

As Darren Rowse and Chris Garrett explained in their highly recommended ProBlogger book: “even the most popular blogs tend to attract only about a 1 percent commenting rate.”  Therefore, don't be too disappointed if you are not getting many comments.  Take that statistic as a challenge to motivate you to write blog posts that your readers simply cannot resist commenting on. 

Respond to the comments you do receive.  This continues the two-way conversation and encourages comments from other readers.  Make sure to never talk down to your readers (either in your blog post or your comment responses).  It is perfectly fine to disagree and debate, just don't denigrate. 

Obviously, you should block all spam (leading argument for using comment moderation) and never feed the troll.

 

Stories and Metaphors and Analogies!  Oh, my!

I've a feeling we're not in Kansas anymore.  Especially me, since I live in Iowa.

Darren Rowse recently shared some great tips about why stories are an effective communication tool for your blog, including a list of some of the different types of stories you can tell.

My blog uses a lot of metaphors and analogies (and sometimes just plain silliness) in an attempt to make my posts more interesting.  This is necessary because I write about a niche topic, which although important, is also rather dull.

James Chartrand uses the term Method Blogging as (yes, you guessed it) a metaphor for blogging by comparing it to method acting.  Try experimenting with different styles like an actor experimenting with different types of roles and movie genres. 

Oftentimes, using stories, metaphors, and analogies in my content works very well.  But I admit, sometimes it simply sucks. 

However, I have never been afraid to look like an idiot.  After all, we idiots are important members of society – we make everyone else look smart by comparison.

 

The King, Queen, and Crown Prince of Blogging

Meet the Blogging Royal Family: Content, Marketing, and Context.

Content is King.  The primary reason that people are (or aren't) reading your blog is because of your content.

Marketing is Queen.  “If you blog it, they will read.”  Ah, no they won't – this ain't Field of Dreams.  Some of the best written blogs on the Series of Tubes get hardly any love because they get hardly any marketing.  In addition to providing RSS and e-mail feeds, I use social media (e.g., Twitter, Facebook, LinkedIn) to promote my blog content.

However, too many bloggers have a selfish social media strategy.  Don't use it exclusively for self-promotion.  View social media as Social Karma.  Focus on helping others and you will get much more back than just a blog reader, a LinkedIn connection, a Twitter follower, or a Facebook friend.  In addition to blog promotion (which is important), I use social media to listen, to learn, and to help others when I can.

Larry Brooks recently explained that although content may still be king, at the very least, you must pay homage to the new Crown Prince — Context.  To paraphrase Brooks, context comes from clarity about your blogging goals, juxtaposed against the expectations and tolerances of your readers.  Basically, this above all: to thine own readers be true.

 

Emerson on Blogging

“Nothing can bring you peace but yourself.”

One of my favorite writers is Ralph Waldo Emerson.  The quote that started this section was pure Emerson.  What follows is a slight paraphrasing of one of my all-time favorite passages, which comes from his essay on Self-Reliance:

“What I must do is all that concerns me, not what the people think.  This rule, equally arduous in real and in online life, may serve for the whole distinction between greatness and meanness.  It is the harder because you will always find those who think they know what is your duty better than you know it.  It is easy in the world to live after the world's opinion; it is easy in solitude to live after our own; but the great blogger is one who in the midst of the blogosphere, keeps with perfect sweetness the independence of solitude.”

Bottom line — BE YOURSELF — Let your own personality shine through.  Make people feel like they are having a conversation with a real person and not just someone who is blogging what they think people want to read.

I hope that you found at least some of this manifesto helpful.  I also hope to see more of you around the blogosphere.

I'll be the balding blogger who used to almost have a mullet...

 

Related Posts

Collablogaunity

Brevity is the Soul of Social Media

Podcast: Your Blog, Your Voice

Customer Incognita

Many enterprise information initiatives are launched in order to unravel that riddle, wrapped in a mystery, inside an enigma, that great unknown, also known as...Customer.

Centuries ago, cartographers used the Latin phrase terra incognita (meaning “unknown land”) to mark regions on a map not yet fully explored.  In this century, companies simply cannot afford to use the phrase customer incognita to indicate what information about their existing (and prospective) customers they don't currently have or don't properly understand.

 

What is a Customer?

First things first, what exactly is a customer?  Those happy people who give you money?  Those angry people who yell at you on the phone or say really mean things about your company on Twitter and Facebook?  Why do they have to be so mean? 

Mean people suck.  However, companies who don't understand their customers also suck.  And surely you don't want to be one of those companies, do you?  I didn't think so.

Getting back to the question, here are some insights from the Data Quality Pro discussion forum topic What is a customer?:

  • Someone who purchases products or services from you.  The word “someone” is key because it’s not the role of a “customer” that forms the real problem, but the precision of the term “someone” that causes challenges when we try to link other and more specific roles to that “someone.”  These other roles could be contract partner, payer, receiver, user, owner, etc.
  • Customer is a role assigned to a legal entity in a complete and precise picture of the real world.  The role is established when the first purchase is accepted from this real-world entity.  Of course, the main challenge is whether or not the company can establish and maintain a complete and precise picture of the real world.

These working definitions were provided by fellow blogger and data quality expert Henrik Liliendahl Sørensen, who recently posted 360° Business Partner View, which further examines the many different ways a real-world entity can be represented, including when, instead of a customer, the real-world entity represents a citizen, patient, member, etc.
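To make the “customer is a role, not the entity itself” idea a little more concrete, here is a minimal sketch in Python.  The class names, the role list, and the accept_first_purchase method are all hypothetical illustrations of the working definitions above, not a model taken from the discussion forum or from any particular product.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    """Roles a real-world entity can play (an illustrative subset)."""
    CUSTOMER = "customer"
    PAYER = "payer"
    RECEIVER = "receiver"
    USER = "user"
    OWNER = "owner"


@dataclass
class Party:
    """The "someone": a real-world entity, independent of the roles it plays."""
    name: str
    roles: set = field(default_factory=set)

    def accept_first_purchase(self):
        # Per the second working definition, the customer role is
        # established when the first purchase is accepted.
        self.roles.add(Role.CUSTOMER)


acme = Party("Acme Ltd")          # hypothetical entity
acme.roles.add(Role.PAYER)        # a payer is not necessarily a customer yet
was_customer_before = Role.CUSTOMER in acme.roles
acme.accept_first_purchase()
is_customer_after = Role.CUSTOMER in acme.roles
```

Keeping the entity separate from its roles is a design choice: the same “someone” can be a payer on one contract and a customer on another without being stored twice.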

A critical first step for your company is to develop your definition of a customer.  Don't underestimate either the importance or the difficulty of this process.  And don't assume it is simply a matter of semantics.

Some of my consulting clients have indignantly told me: “We don't need to define it, everyone in our company knows exactly what a customer is.”  I usually respond: “I have no doubt that everyone in your company uses the word customer, however I will work for free if everyone defines the word customer in exactly the same way.”  So far, I haven't had to work for free.  

 

How Many Customers Do You Have?

You have done the due diligence and developed your definition of a customer.  Excellent!  Nice work.  Your next challenge is determining how many customers you have.  Hopefully, you are not going to try using any of these techniques:

  • SELECT COUNT(*) AS "We have this many customers" FROM Customers
  • SELECT COUNT(DISTINCT Name) AS "No wait, we really have this many customers" FROM Customers
  • Middle-Square or Blum Blum Shub methods (i.e. random number generation)
  • Magic 8-Ball says: “Ask again later”

One of the most common and challenging data quality problems is the identification of duplicate records, especially redundant representations of the same customer information within and across systems throughout the enterprise.  The need for a solution to this specific problem is one of the primary reasons that companies invest in data quality software and services.
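To see why the two queries in the list above both give misleading answers, here is a small sketch using Python's built-in sqlite3 module.  The five sample rows are made up for illustration, and I am assuming that records 1 through 3 describe the same person while record 5 is a different person who happens to share a name.

```python
import sqlite3

# Hypothetical sample data: records 1-3 are the same real-world customer,
# record 5 is a different person with an identical name.
rows = [
    (1, "John Smith"),
    (2, "JOHN SMITH"),
    (3, "Smith, John"),
    (4, "Jane Doe"),
    (5, "John Smith"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (Id INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany("INSERT INTO Customers VALUES (?, ?)", rows)

# Counts every record, including the redundant representations.
record_count = conn.execute(
    "SELECT COUNT(*) FROM Customers"
).fetchone()[0]

# Counts exact name strings: spelling variants inflate the number,
# while two different people with the same name collapse into one.
distinct_names = conn.execute(
    "SELECT COUNT(DISTINCT Name) FROM Customers"
).fetchone()[0]

conn.close()

print(record_count)    # 5
print(distinct_names)  # 4
```

Under the stated assumption there are actually three customers, so neither 5 nor 4 is right.  Getting to the real answer takes matching rules, not just a different COUNT.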

Earlier this year on Data Quality Pro, I published a five part series of articles on identifying duplicate customers, which focused on the methodology for defining your business rules and illustrated some of the common data matching challenges.

Topics covered in the series:

  • Why a symbiosis of technology and methodology is necessary when approaching this challenge
  • How performing a preliminary analysis on a representative sample of real data prepares effective examples for discussion
  • Why using a detailed, interrogative analysis of those examples is imperative for defining your business rules
  • How both false negatives and false positives illustrate the highly subjective nature of this problem
  • How to document your business rules for identifying duplicate customers
  • How to set realistic expectations about application development
  • How to foster a collaboration of the business and technical teams throughout the entire project
  • How to consolidate identified duplicates by creating a “best of breed” representative record

To read the series, please follow these links:

To download the associated presentation (no registration required), please follow this link: OCDQ Downloads

 

Conclusion

“Knowing the characteristics of your customers,” stated Jill Dyché and Evan Levy in the opening chapter of their excellent book, Customer Data Integration: Reaching a Single Version of the Truth, “who they are, where they are, how they interact with your company, and how to support them, can shape every aspect of your company's strategy and operations.  In the information age, there are fewer excuses for ignorance.”

For companies of every size and within every industry, customer incognita is a crippling condition that must be replaced with customer cognizance in order for the company to continue to remain competitive in a rapidly changing marketplace.

Do you know your customers?  If not, then they likely aren't your customers anymore.

Poor Quality Data Sucks

Fenway Park 2008 Home Opener

Over the last few months on his Information Management blog, Steve Miller has been writing posts inspired by a great 2008 book that we both highly recommend: The Drunkard's Walk: How Randomness Rules Our Lives by Leonard Mlodinow.

In his most recent post The Demise of the 2009 Boston Red Sox: Super-Crunching Takes a Drunkard's Walk, Miller takes on my beloved Boston Red Sox and the less than glorious conclusion to their 2009 season. 

For those readers who are not baseball fans, the Los Angeles Angels of Anaheim swept the Red Sox out of the playoffs.  I will let Miller's words describe their demise: “Down two to none in the best of five series, the Red Sox took a 6-4 lead into the ninth inning, turning control over to impenetrable closer Jonathan Papelbon, who hadn't allowed a run in 26 postseason innings.  The Angels, within one strike of defeat on three occasions, somehow managed a miracle rally, scoring 3 runs to take the lead 7-6, then holding off the Red Sox in the bottom of the ninth for the victory to complete the shocking sweep.”

 

Baseball and Data Quality

What, you may be asking, does baseball have to do with data quality?  Beyond simply being two of my all-time favorite topics, quite a lot actually.  Baseball data is mostly transaction data describing the statistical events of games played.

Statistical analysis has been a beloved pastime even longer than baseball has been America's Pastime.  Number-crunching is far more than just a quantitative exercise in counting.  The qualitative component of statistics – discerning what the numbers mean, analyzing them to discover predictive patterns and trends – is the very basis of data-driven decision making.

“The Red Sox,” as Miller explained, “are certainly exemplars of the data and analytic team-building methodology” chronicled in Moneyball: The Art of Winning an Unfair Game, the 2003 book by Michael Lewis.  Red Sox General Manager Theo Epstein has always been an advocate of the so-called evidence-based baseball, or baseball analytics, pioneered by Bill James, the baseball writer, historian, statistician, current Red Sox consultant, and founder of Sabermetrics.

In another book that Miller and I both highly recommend, Super Crunchers, author Ian Ayres explained that “Bill James challenged the notion that baseball experts could judge talent simply by watching a player.  James's simple but powerful thesis was that data-based analysis in baseball was superior to observational expertise.  James's number-crunching approach was particular anathema to scouts.” 

“James was baseball's herald,” continues Ayres, “of data-driven decision making.”

 

The Drunkard's Walk

As Mlodinow explains in the prologue: “The title The Drunkard's Walk comes from a mathematical term describing random motion, such as the paths molecules follow as they fly through space, incessantly bumping, and being bumped by, their sister molecules.  The surprise is that the tools used to understand the drunkard's walk can also be employed to help understand the events of everyday life.”

Later in the book, Mlodinow describes the hidden effects of randomness by discussing how to build a mathematical model for the probability that a baseball player will hit a home run: “The result of any particular at bat depends on the player's ability, of course.  But it also depends on the interplay of many other factors: his health, the wind, the sun or the stadium lights, the quality of the pitches he receives, the game situation, whether he correctly guesses how the pitcher will throw, whether his hand-eye coordination works just perfectly as he takes his swing, whether that brunette he met at the bar kept him up too late, or the chili-cheese dog with garlic fries he had for breakfast soured his stomach.”

“If not for all the unpredictable factors,” continues Mlodinow, “a player would either hit a home run on every at bat or fail to do so.  Instead, for each at bat all you can say is that he has a certain probability of hitting a home run and a certain probability of failing to hit one.  Over the hundreds of at bats he has each year, those random factors usually average out and result in some typical home run production that increases as the player becomes more skillful and then eventually decreases owing to the same process that etches wrinkles in his handsome face.  But sometimes the random factors don't average out.  How often does that happen, and how large is the aberration?”

 

Conclusion

I have heard some (not Mlodinow or anyone else mentioned in this post) argue that data quality is an irrelevant issue.  The basis of their argument is that poor quality data are simply random factors that, in any data set of statistically significant size, will usually average out and therefore have a negligible effect on any data-based decisions. 

However, the random factors don't always average out.  It is important not only to measure exactly how often poor quality data occur, but also to acknowledge how large an aberration poor quality data are, especially in data-driven decision making.
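Here is one way to picture the difference, as a small sketch in Python.  The numbers are entirely made up, and the contrast between errors that cancel and defects that all push in one direction is my own simplifying assumption rather than anything taken from Mlodinow's book.

```python
from statistics import mean

# Hypothetical: ten transactions whose true amount is 100 each.
true_values = [100.0] * 10

# Errors that happen to cancel: every overstatement has a matching understatement.
symmetric_errors = [5, -5, 3, -3, 8, -8, 1, -1, 2, -2]
noisy = [value + error for value, error in zip(true_values, symmetric_errors)]

# A one-directional defect: two missing amounts were defaulted to zero.
defaulted = true_values[:8] + [0.0, 0.0]

noisy_mean = mean(noisy)
defaulted_mean = mean(defaulted)

print(noisy_mean)      # 100.0 (the errors average out)
print(defaulted_mean)  # 80.0  (the defects do not)
```

In this toy example, only two bad records out of ten move the average by twenty percent, which is hardly a negligible effect on a data-based decision.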

As every citizen of Red Sox Nation is taught from birth, the only acceptable opinion of our American League East Division rivals, the New York Yankees, is encapsulated in the chant heard throughout the baseball season (and not just at Fenway Park):

“Yankees Suck!”

From their inception, the day-to-day business decisions of every organization are based on its data.  This decision-critical information drives the operational, tactical, and strategic initiatives essential to the enterprise's mission to survive and thrive in today's highly competitive and rapidly evolving marketplace. 

It doesn't quite roll off the tongue as easily, but a chant heard throughout these enterprise information initiatives is:

“Poor Quality Data Sucks!”

Books Recommended by Red Sox Nation

Mind Game: How the Boston Red Sox Got Smart, Won a World Series, and Created a New Blueprint for Winning

Feeding the Monster: How Money, Smarts, and Nerve Took a Team to the Top

Theology: How a Boy Wonder Led the Red Sox to the Promised Land

Now I Can Die in Peace: How The Sports Guy Found Salvation Thanks to the World Champion (Twice!) Red Sox

Adventures in Data Profiling (Part 7)

In Part 6 of this series:  You completed your initial analysis of the Account Number and Tax ID fields. 

Previously during your adventures in data profiling, you have looked at customer name within the context of other fields.  In Part 2, you looked at the associated customer names during drill-down analysis on the Gender Code field while attempting to verify abbreviations as well as assess NULL and numeric values.  In Part 6, you investigated customer names during drill-down analysis for the Account Number and Tax ID fields while assessing the possibility of duplicate records. 

In Part 7 of this award-eligible series, you will complete your initial analysis of this data source with direct investigation of the Customer Name 1 and Customer Name 2 fields.

 

Previously, the data profiling tool provided you with the following statistical summaries for customer names:

Customer Name Summary

As we discussed when we looked at the E-mail Address field (in Part 3) and the Postal Address Line fields (in Part 5), most data profiling tools will provide the capability to analyze fields using formats that are constructed by parsing and classifying the individual values within the field.

Customer Name 1 and Customer Name 2 are additional examples of the necessity of this analysis technique.  Not only is the cardinality of these fields very high, but they also have a very high Distinctness (i.e. the exact same field value rarely occurs on more than one record).
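For readers who want to see these two statistics computed, here is a minimal sketch in Python.  The profile_field function and the sample values are hypothetical, and I am assuming that Distinctness is the cardinality divided by the number of records with an actual value; your data profiling tool may define it slightly differently.

```python
def profile_field(values):
    """Return basic profiling statistics for a single field."""
    # Treat None and blank strings as records without an actual value.
    populated = [v for v in values if v is not None and v.strip() != ""]
    cardinality = len(set(populated))  # number of distinct actual values
    distinctness = cardinality / len(populated) if populated else 0.0
    return {
        "count": len(populated),
        "cardinality": cardinality,
        "distinctness": distinctness,
    }


# Hypothetical sample of a customer name field.
names = ["Harris Edward James", "Mary Jones", "Mary Jones", None, "Acme Corp", ""]
stats = profile_field(names)

print(stats)  # {'count': 4, 'cardinality': 3, 'distinctness': 0.75}
```

When Distinctness is close to 1.0, a simple frequency list of field values is nearly as long as the field itself, which is why analyzing field formats becomes the more practical approach.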

 

Customer Name 1

The data profiling tool has provided you the following drill-down “screen” for Customer Name 1:

Field Formats for Customer Name 1 

Please Note: The differentiation between given and family names has been based on our fictional data profiling tool using probability-driven non-contextual classification of the individual field values. 

For example, Harris, Edward, and James are three of the most common names in the English language, and although they can also be family names, they are more frequently given names.  Therefore, “Harris Edward James” is assigned “Given-Name Given-Name Given-Name” for a field format.  For this particular example, how do we determine the family name?

The top twenty most frequently occurring field formats for Customer Name 1 collectively account for over 80% of the records with an actual value in this field for this data source.  All of these field formats appear to be common potentially valid structures.  Obviously, more than one sample field value would need to be reviewed using more drill-down analysis. 
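Here is a rough sketch in Python of how a field format could be built by classifying each word without looking at its neighbors.  The tiny lookup tables, the function names, and the “given name wins” rule are all hypothetical stand-ins for the probability-driven classification our fictional tool performs; a real tool would rely on far larger reference data.

```python
from collections import Counter

# Hypothetical reference data (a real tool would use frequency tables).
GIVEN_NAMES = {"harris", "edward", "james", "mary", "john"}
FAMILY_NAMES = {"smith", "jones", "doe"}


def classify_token(token):
    """Classify one word on its own, ignoring the words around it."""
    word = token.strip(".,").lower()
    if word in GIVEN_NAMES:      # simplification: given names take precedence
        return "Given-Name"
    if word in FAMILY_NAMES:
        return "Family-Name"
    if len(word) == 1 and word.isalpha():
        return "Initial"
    return "Unknown"


def field_format(value):
    """Build the field format for one field value."""
    return " ".join(classify_token(token) for token in value.split())


def format_coverage(values, top_n):
    """Share of values covered by the top_n most frequent field formats."""
    if not values:
        return 0.0
    formats = Counter(field_format(value) for value in values)
    covered = sum(count for _, count in formats.most_common(top_n))
    return covered / len(values)


sample = ["Harris Edward James", "Mary Jones", "John Smith", "Mary J. Smith"]

print(field_format("Harris Edward James"))  # Given-Name Given-Name Given-Name
print(format_coverage(sample, top_n=1))     # 0.5
```

Because each word is classified in isolation, the sketch reproduces the “Harris Edward James” problem described above: nothing in the field format tells you which of the three words is the family name.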

What conclusions, assumptions, and questions do you have about the Customer Name 1 field?

 

Customer Name 2

The data profiling tool has provided you the following drill-down “screen” for Customer Name 2:

Field Formats for Customer Name 2 

The top ten most frequently occurring field formats for Customer Name 2 collectively account for over 50% of the records with an actual value in this sparsely populated field for this data source.  Some of these field formats show common potentially valid structures.  Again, more than one sample field value would need to be reviewed using more drill-down analysis.

What conclusions, assumptions, and questions do you have about the Customer Name 2 field?

 

The Challenges of Person Names

Not that business names don't have their own challenges, but person names present special challenges.  Many data quality initiatives include the business requirement to parse, identify, verify, and format a “valid” person name.  However, unlike postal addresses where country-specific postal databases exist to support validation, no such “standards” exist for person names.

In his excellent book Viral Data in SOA: An Enterprise Pandemic, Neal A. Fishman explains that “a person's name is a concept that is both ubiquitous and subject to regional variations.  For example, the cultural aspects of an individual's name can vary.  In lieu of last name, some cultures specify a clan name.  Others specify a paternal name followed by a maternal name, or a maternal name followed by a paternal name; other cultures use a tribal name, and so on.  Variances can be numerous.”

“In addition,” continues Fishman, “a name can be used in multiple contexts, which might affect what parts should or could be communicated.  An organization reporting an employee's tax contributions might report the name by using the family name and just the first letter (or initial) of the first name (in that sequence).  The same organization mailing a solicitation might choose to use just a title and a family name.”

However, it is not a simple task to identify what part of a person's name is the family name or the first given name (as some of the above data profiling sample field values illustrate).  Again, regional, cultural, and linguistic variations can greatly complicate what at first may appear to be a straightforward business request (e.g. formatting a person name for a mailing label).

As Fishman cautions, “many regions have cultural name profiles bearing distinguishing features for words, sequences, word frequencies, abbreviations, titles, prefixes, suffixes, spelling variants, gender associations, and indications of life events.”

If you know of any useful resources for dealing with the challenges of person names, then please share them by posting a comment below.  Additionally, please share your thoughts and experiences regarding the challenges (as well as useful resources) associated with business names.

 

What other analysis do you think should be performed for customer names?

 

In Part 8 of this series:  We will conclude the adventures in data profiling with a summary of the lessons learned.

 

Related Posts

Adventures in Data Profiling (Part 1)

Adventures in Data Profiling (Part 2)

Adventures in Data Profiling (Part 3)

Adventures in Data Profiling (Part 4)

Adventures in Data Profiling (Part 5)

Adventures in Data Profiling (Part 6)

Getting Your Data Freq On

DQ-Tip: “...Go talk with the people using the data”

Data Quality (DQ) Tips is an OCDQ regular segment.  Each DQ-Tip is a clear and concise data quality pearl of wisdom.

“In order for your data quality initiative to be successful, you must:

Walk away from the computer and go talk with the people using the data.”

This DQ-Tip came from the TDWI World Conference Chicago 2009 presentation Modern Data Quality Techniques in Action by Gian Di Loreto from Loreto Services and Technologies.

As I blogged about in Data Gazers (borrowing that excellent phrase from Arkady Maydanchik), within cubicles randomly dispersed throughout the sprawling office space of companies large and small, there exist countless unsung heroes of data quality initiatives.  Although their job titles might be labeling them as a Business Analyst, Programmer Analyst, Account Specialist or Application Developer, their true vocation is a far more noble calling.  They are Data Gazers.

A most bizarre phenomenon (that I have witnessed too many times) is that as a data quality initiative “progresses” it tends to get further and further away from the people who use the data on a daily basis.

Please follow the excellent advice of Gian and Arkady — go talk with your users. 

Trust me — everyone on your data quality initiative will be very happy that you did.

 

Related Posts

DQ-Tip: “Data quality is primarily about context not accuracy...”

DQ-Tip: “Don't pass bad data on to the next person...”

Poor Data Quality is a Virus

“A storm is brewing—a perfect storm of viral data, disinformation, and misinformation.” 

These cautionary words (written by Timothy G. Davis, an Executive Director within the IBM Software Group) are from the foreword of the remarkable new book Viral Data in SOA: An Enterprise Pandemic by Neal A. Fishman.

“Viral data,” explains Fishman, “is a metaphor used to indicate that business-oriented data can exhibit qualities of a specific type of human pathogen: the virus.  Like a virus, data by itself is inert.  Data requires software (or people) for the data to appear alive (or actionable) and cause a positive, neutral, or negative effect.”

“Viral data is a perfect storm,” because as Fishman explains, it is “a perfect opportunity to miscommunicate with ubiquity and simultaneity—a service-oriented pandemic reaching all corners of the enterprise.”

“The antonym of viral data is trusted information.”

Data Quality

“Quality is a subjective term,” explains Fishman, “for which each person has his or her own definition.”  Fishman goes on to quote from many of the published definitions of data quality, including a few of my personal favorites:

  • David Loshin: “Fitness for use—the level of data quality determined by data consumers in terms of meeting or beating expectations.”
  • Danette McGilvray: “The degree to which information and data can be a trusted source for any and/or all required uses.  It is having the right set of correct information, at the right time, in the right place, for the right people to use to make decisions, to run the business, to serve customers, and to achieve company goals.”
  • Thomas Redman: “Data are of high quality if those who use them say so.  Usually, high-quality data must be both free of defects and possess features that customers desire.”

Data quality standards provide a highest common denominator to be used by all business units throughout the enterprise as an objective data foundation for their operational, tactical, and strategic initiatives.  Starting from this foundation, information quality standards are customized to meet the subjective needs of each business unit and initiative.  This approach leverages a consistent enterprise understanding of data while also providing the information necessary for day-to-day operations.

However, the enterprise-wide data quality standards must be understood as dynamic.  Therefore, enforcing strict conformance to data quality standards can be self-defeating.  On this point, Fishman quotes Joseph Juran: “conformance by its nature relates to static standards and specification, whereas quality is a moving target.”

Defining data quality is both an essential and challenging exercise for every enterprise.  “While a succinct and holistic single-sentence definition of data quality may be difficult to craft,” explains Fishman, “an axiom that appears to be generally forgotten when establishing a definition is that in business, data is about things that transpire during the course of conducting business.  Business data is data about the business, and any data about the business is metadata.  First and foremost, the definition as to the quality of data must reflect the real-world object, concept, or event to which the data is supposed to be directly associated.”

 

Data Governance

“Data governance can be used as an overloaded term,” explains Fishman, and he quotes Jill Dyché and Evan Levy to explain that “many people confuse data quality, data governance, and master data management.” 

“The function of data governance,” explains Fishman, “should be distinct and distinguishable from normal work activities.” 

For example, although knowledge workers and subject matter experts are necessary to define the business rules for preventing viral data, according to Fishman, these are data quality tasks and not acts of data governance. 

However, these data quality tasks must “subsequently be governed to make sure that all the requisite outcomes comply with the appropriate controls.”

Therefore, according to Fishman, “data governance is a function that can act as an oversight mechanism and can be used to enforce controls over data quality and master data management, but also over data privacy, data security, identity management, risk management, or be accepted in the interpretation and adoption of regulatory requirements.”

 

Conclusion

“There is a line between trustworthy information and viral data,” explains Fishman, “and that line is very fine.”

Poor data quality is a viral contaminant that will undermine the operational, tactical, and strategic initiatives essential to the enterprise's mission to survive and thrive in today's highly competitive and rapidly evolving marketplace. 

Left untreated or unchecked, this infectious agent will negatively impact the quality of business decisions.  As the pathogen replicates, more and more decision-critical enterprise information will be compromised.

According to Fishman, enterprise data quality requires a multidisciplinary effort and a lifetime commitment to:

“Prevent viral data and preserve trusted information.”

Books Referenced in this Post

Viral Data in SOA: An Enterprise Pandemic by Neal A. Fishman

Enterprise Knowledge Management: The Data Quality Approach by David Loshin

Executing Data Quality Projects: Ten Steps to Quality Data and Trusted Information by Danette McGilvray

Data Quality: The Field Guide by Thomas Redman

Juran on Quality by Design: The New Steps for Planning Quality into Goods and Services by Joseph Juran

Customer Data Integration: Reaching a Single Version of the Truth by Jill Dyché and Evan Levy

 

Related Posts

DQ-Tip: “Don't pass bad data on to the next person...”

The Only Thing Necessary for Poor Data Quality

Hyperactive Data Quality (Second Edition)

The General Theory of Data Quality

Data Governance and Data Quality

DQ-Tip: “Don't pass bad data on to the next person...”

Data Quality (DQ) Tips is a new regular segment.  Each DQ-Tip is a clear and concise data quality pearl of wisdom.

“Don't pass bad data on to the next person.  And don't accept bad data from the previous person.”

This DQ-Tip is from Thomas Redman's excellent book Data Driven: Profiting from Your Most Important Business Asset.

In the book, Redman explains that this advice is a rewording of his favorite data quality policy of all time.

Assuming that it is someone else's responsibility is a fundamental root cause of enterprise data quality problems.  One of the primary goals of a data quality initiative must be to define the roles and responsibilities for data ownership and data quality.
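
For readers who think in code, here is a minimal sketch of how this DQ-Tip might look inside a single step of a data pipeline.  The Customer record, the two business rules, and the function names are all hypothetical and are not taken from Redman's book; real rules would come from your knowledge workers and subject matter experts.  The step checks what it receives and checks again what it hands off:

```python
from dataclasses import dataclass


class BadDataError(ValueError):
    """Raised when a record fails this step's data quality checks."""


@dataclass
class Customer:
    name: str
    postal_code: str


def check(record: Customer) -> None:
    # Hypothetical rules for illustration only: a non-blank name and
    # a five-digit postal code.
    if not record.name.strip():
        raise BadDataError("name is blank")
    if not (record.postal_code.isdigit() and len(record.postal_code) == 5):
        raise BadDataError(f"invalid postal code: {record.postal_code!r}")


def standardize(record: Customer) -> Customer:
    # Don't accept bad data from the previous person.
    check(record)
    cleaned = Customer(record.name.strip().title(), record.postal_code)
    # Don't pass bad data on to the next person.
    check(cleaned)
    return cleaned
```

Under these assumptions, calling standardize(Customer("  jane doe ", "02134")) returns a cleaned record, while a record with a postal code of "ABC" is rejected with a BadDataError instead of being passed along.  The point is not the specific rules, but that each step owns both the data it accepts and the data it delivers.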

In sports, it is common for inspirational phrases to be posted above every locker room exit door.  Players acknowledge and internalize the inspirational phrase by reaching up and touching it as they head out onto the playing field.

Perhaps you should post this DQ-Tip above every break room exit door throughout your organization?

 

Related Posts

The Only Thing Necessary for Poor Data Quality

Hyperactive Data Quality (Second Edition)

Data Governance and Data Quality

 

Additional Resources

Who is responsible for data quality?

DQ Problems? Start a Data Quality Recognition Program!

Starting Your Own Personal Data Quality Crusade

The Fragility of Knowledge

In his excellent book The Black Swan: The Impact of the Highly Improbable, Nassim Nicholas Taleb explains:

“What you don’t know is far more relevant than what you do know.”

Our tendency is to believe the opposite.  After we have accumulated the information required to be considered knowledgeable in our field, we believe that what we have learned and experienced (i.e. what we know) is far more relevant than what we don’t know.  We are all proud of our experience, which we believe is the path that separates knowledge from wisdom.

“We tend to treat our knowledge as personal property to be protected and defended,” explains Taleb.  “It is an ornament that allows us to rise in the pecking order.  We take what we know a little too seriously.”

However, our complacency is all too often upset by the unexpected.  Some new evidence is discovered that disproves our working theory of how things work.  Or something that we have repeatedly verified in the laboratory of our extensive experience suddenly doesn’t produce the usual results.

Taleb cautions that this “illustrates a severe limitation to our learning from experience and the fragility of our knowledge.”

I have personally encountered this many times throughout my career in data quality.  At first, it seemed like a cruel joke or some bizarre hazing ritual.  Every time I thought that I had figured it all out, that I had learned all the rules, something I didn’t expect would come along and smack me upside the head.

“We do not spontaneously learn,” explains Taleb, “that we don’t learn that we don’t learn.  The problem lies in the structure of our minds: we don’t learn rules, just facts, and only facts.”

Facts are important.  Facts are useful.  However, sometimes our facts are really only theories.  Mistaking a theory for a fact can be very dangerous.  What you don’t know can hurt you. 

However, as Taleb explains, “what you know cannot really hurt you.”  Therefore, we tend to only “look at what confirms our knowledge, not our ignorance.”  This is unfortunate, because “there are so many things we can do if we focus on antiknowledge, or what we do not know.”

This is why, as a data quality consultant, when I begin an engagement with a new client, I usually open with the statement (completely without sarcasm):

“Tell me something that I don’t know.” 

Related Posts

Hailing Frequencies Open