Recently Read: December 21, 2009

Recently Read is an OCDQ regular segment.  Each entry provides links to blog posts, articles, books, and other material I found interesting enough to share.  Please note “recently read” is literal – therefore what I share wasn't necessarily recently published.

 

Data Quality

For simplicity, “Data Quality” also includes Data Governance, Master Data Management, and Business Intelligence.

  • Welcome to DQ Directions – In this blog post, Dylan Jones of Data Quality Pro formally announced the DQ Directions online conference, which will debut in Q2 2010, and will feature presentations from experts and industry thought leaders specializing in data quality, data governance, and master data management.

     

  • Ways to 'Communivate' your Data Issues – In her Purple Cow of a blog post, Jill Wanless (aka Sheezaredhead) explains that ‘Communivate’ is a combination of the words communicate and innovate, and it means to communicate in an innovative way, which she does regarding the importance of data quality.

     

  • ’Tis the Season for a Data Governance Carol – Part 1 and Part 2 – In his excellent two-part series, Rob Paller of Baseline Consulting uses a Dickensian framework to explain the importance of data governance and data quality – and the fact that there isn’t a simple framework to blindly follow for Data Governance.

     

  • The “Santa Intelligence” Team – An excellent Christmas-themed blog post from Paul Boal, in which we learn that Santa does indeed have a Business Intelligence team.

     

  • Data quality is for life not just for Christmas – In this Diary of a Marketing Insight Guy blog post, Simon Daniels reminds us data quality can be a gift that will keep on giving—if data quality management is built into the heart of an organization’s processes and operations.

     

  • Finding a home for MDM – In his second post on the DataFlux Community of Experts, Charles Blyth examines where master data management (MDM) fits within your overall enterprise architecture.

     

  • The Decade of Data: Seven Trends to Watch in 2010 – In his blog post on Informatica Perspectives, Joe McKendrick examines some up-and-coming trends that he predicts will shape the data management space in 2010.

     

  • Are we ready for all this data? – In his blog post, Rich Murnane uses some recent news stories to ponder whether even we experienced data geeks are really ready for the amount of data we're going to need to manage due to the unrelenting increases in data volumes.

 

Social Media

For simplicity, “Social Media” also includes Blogging, Writing, Social Networking, and Online Marketing.

 

Book Quotes

An eclectic list of quotes from some recently read (and/or simply my favorite) books.

  • From Crush It! by Gary Vaynerchuk – “Your business and your personal brand need to be one and the same...Your latest tweet and comment on Facebook and most recent blog post—that's your résumé now...It's a whole new world, build your personal brand and get ready for it.”

     

  • From A Whole New Mind by Daniel Pink – “Empathy is neither a deviation from intelligence nor the single route to it.  Sometimes we need detachment; many other times we need attachment.  The people who will thrive will be those who can toggle between the two.” 

     

  • From Connected by Nicholas Christakis and James Fowler – “Just as brains can do things that no single neuron can do, so can social networks do things that no single person can do...our connections to other people matter...most of all it is about what makes us uniquely human...To know who we are, we must understand how we are connected.”

Podcast: Stand-Up Data Quality

December—the last month of the year when we hustle and bustle to finish our work, while visions of sugar-plums dance in our holiday shopping heads.  During this time of year, little attention (and rightfully so) is paid to the blogosphere—especially the neither naughty nor nice, but simply niche-y corners of the blogosphere.

As I have often joked, data quality is not just a niche – if technology blogging was a Matryoshka (a.k.a. Russian nested) doll, then data quality would be the last, innermost doll.  This doesn't mean that data quality isn't an important subject – it just means its extra-niche-y-ness all but guarantees December (and usually January and most of February too) will be a very cold month – when all niche blogs struggle to rub two random RSS readers together in order to start a cozy fire, keeping them warm until their blogging hope springs eternal once again come springtime.

Niche blogs can either shut down during this blogging lull, or use it as an opportunity to experiment.  I have chosen the latter, which explains why four of my last six blog posts have used either a Podcast or a Video.

Not to worry though, I haven't given up writing more “traditional” blog posts.  I simply plan to use more podcasts and videos in 2010 as a way to add more variety (and more of a personal touch) to my blog content.  They may not appear as frequently as they have recently, but more is to come in the new year.  For now, I am experimenting with how best to produce them.

 

Stand-Up Data Quality

In this OCDQ Podcast, I discuss using humor to enliven a niche topic, and revisit some of the stand-up comedy aspects of some of my favorite written-down blog posts from earlier this year.

Humor can be a great way to start a conversation and hold your readers' attention for those few precious additional seconds while you are getting to your point.  Obviously, there will be times when the seriousness of your subject would make comedy inappropriate, and if you are not naturally inclined to use humor, then you shouldn't try to force it.

 

You can also download this podcast (MP3 file) by clicking on this link: Stand-Up Data Quality

 

Related Posts

The Tell-Tale Data

Data Quality: The Reality Show?

Data Quality is People!

All I Really Need To Know About Data Quality I Learned In Kindergarten

The Mullet Blogging Manifesto

Video: The DQ General's Song

In this OCDQ Video, I revisit The Very Model of a Modern DQ General, which was the second post ever published on this blog.

Using The Major-General's Song from The Pirates of Penzance by Gilbert and Sullivan as a framework, I encapsulated into lyrics some of the knowledge I have accumulated from over 15 years of experience in the data quality profession.  The intended result was a comical delivery of serious insight.

I recorded a video and not simply a podcast so that you could follow along with the lyrics.  However, my budget couldn't afford the inclusion of the “follow the bouncing ball” technology I enjoyed in many of my favorite childhood cartoons. 

Sparing you the pain of listening to me actually sing, I instead offer for your amusement, my recital of The DQ General's Song:

 

If you are reading this blog post via e-mail or a feed reader, then to view this video, please click on this link: OCDQ Video

 

Related Posts

The Very Model of a Modern DQ General

Imagining the Future of Data Quality

Data Quality is Sexy

‘Twas Two Weeks Before Christmas

‘Twas two weeks before Christmas, and all about the data warehouse,
Every employee was stirring, busy clicking their mouse;
The stockings were hung on our cubicle walls with care,
In hopes that year-end bonus checks soon would be there.

The data were nestled all snug in their test beds,
While visions of sugar-plums danced in DBAs' heads;
Working together, the Business and IT, for collaboration is best,
All had just settled in, for a winter night's long, pre-production test.

When out in the parking lot there arose such a clatter,
We all sprang from our desk chairs to see what was the matter;
Away to the window we flew like a flash,
Tore open the shutters and threw up the sash.

The moon on the crest of the new-fallen snow,
Gave the luster of mid-day to objects below;
When, what to our wondering eyes should appear?

The Big Boss Man dressed up as Santa,
Carrying eight tiny candles, to Light the Menorah.

We descended the stairs to the lobby, so lively and quick,
We wanted to know in mere moments, if this was some trick;
The Big Boss Man greeted us, as into the lobby we all did file,
He whistled, and shouted, then gave us a big grinning smile.

He was dressed all in faux fur, from his head to his toes,
And his clothes were well-tailored with buttons and bows;
A bundle of bonus checks he had flung on his back,
We were as giddy as young children as he opened the sack.

His eyes—how they twinkled, his dimples how merry!
His cheeks were like roses, his nose like a cherry!
His droll little mouth was drawn up like a bow,
And the beard of his chin was as white as the snow.

The stump of a pipe he held tight in his teeth,
And the smoke it encircled his head like a wreath;
He had a broad face and a little round belly,
That shook when he laughed, like a bowlful of jelly.

He was chubby and plump, a right jolly old elf,
And we laughed when we saw him, in spite of ourselves;
A wink of his eye and a twist of his head,
Soon gave us to know, we had nothing to dread.

And these were the words that carefully he said:

“Whether you celebrate Christmas or Hanukkah, Kwanzaa or Festivus,
Whether for you, these are Holy Days or holidays, or simply a rest for us,
My words are the same, and they are just as bright:

Peace, Love, and Happiness to All,
And to all—A Good Night.”

To you and yours, from the entire OCDQ Blog family.

Video: Twitter Search Tutorial

In this OCDQ Video, I provide a brief tutorial on Twitter Search.

Key points about Twitter Search covered in the video tutorial:

  • Unlike other social networking sites (e.g., Facebook, LinkedIn), you don't need an account for read access to Twitter content
  • This is a safe way for you or your company to start leveraging Twitter for “listening purposes only”
  • You can save Twitter Search queries as RSS feeds (e.g., for viewing within Google Reader)

 

If you are reading this blog post via e-mail or a feed reader, then to view this video, please click on this link: OCDQ Video

 

For more help finding data quality content on Twitter, click on this link: Data Quality on Twitter

 

Related Posts

Live-Tweeting: Data Governance

Brevity is the Soul of Social Media

If you tweet away, I will follow

Tweet 2001: A Social Media Odyssey

Recently Read: December 7, 2009

Recently Read is an OCDQ regular segment.  Each entry provides links to blog posts, articles, books, and other material I found interesting enough to share.  Please note “recently read” is literal – therefore what I share wasn't necessarily recently published.

 

Data Quality

For simplicity, “Data Quality” also includes Data Governance, Master Data Management, and Business Intelligence.

  • Data Quality Blog Roundup - November 2009 Edition – Dylan Jones at Data Quality Pro always provides a great collection of the previous month's best blog posts, which covers most of my “recently reads” for data quality.

     

  • The value of Christmas cards – In this Data Value Talk blog post from Human Inference, we learn about how sending Christmas cards can optimize your data quality.

     

  • Santa Quality – Yes, Virginia, there is a Santa Claus—as well as a Saint Nicholas, a Père Noël, a Weihnachtsmann, and a Julemand.  In this blog post, Henrik Liliendahl Sørensen explains some ho-ho-holiday data quality issues.

     

  • Some TLC for Your Data – Data really needs some tender loving care.  Daniel Gent explains in his latest blog post.

     

  • Determining data quality is the first key step – In the second part of a blog series on data migration, James Standen explains that a data migration project will be required to actually improve data quality at the same time, and therefore it is really two projects in one.  The post contains the great line: “data quality sense tingling.”

     

  • Data Chaos and Five Truisms of Data Quality – In his debut post on the DataFlux Community of Experts, my good friend Phil Simon provides a quick case study and five universal truths of data quality.

 

Social Media

For simplicity, “Social Media” also includes Blogging, Writing, Social Networking, and Online Marketing.

 

Awesome Stuff

An eclectic list of articles, blog posts, and other “non-data quality, non-social media, but still awesome” stuff.

  • The Greatest Book Of All Time? – Josh Hanagarne (a.k.a. the “World’s Strongest Librarian”) recently reviewed a book he received from Ethan.  Josh has a simple philosophy of life — “Don’t make anyone’s day worse” — if you are having a bad day (like I was the day I found this), then check this out.

     

  • Cute Apple parody from The Sun – Rob Beschizza on Boing Boing shares a great one minute video of a recent commercial from The Sun about “The UK's best handheld for 40 years.”


Podcast: Your Blog, Your Voice

In this OCDQ Podcast, I discuss the importance of blogging in your own voice. 

The best way to produce unique content is to let your blogging style reflect your personality.  Make your readers feel like they are having a conversation with a real person – not just someone who is blogging what they think people want to read.

Your Blog, Your Voice

 

You can also download this podcast (MP3 file) by clicking on this link: Your Blog, Your Voice

 

Related Posts

The Mullet Blogging Manifesto

Collablogaunity

Brevity is the Soul of Social Media

Live-Tweeting: Data Governance

The term “live-tweeting” describes using Twitter to provide near real-time reporting from an event.  I live-tweet from the sessions I attend at industry conferences as well as interesting webinars.

Recently, I live-tweeted Successful Data Stewardship Through Data Governance, which was a data governance webinar featuring Marty Moseley of Initiate Systems and Jill Dyché of Baseline Consulting.

Instead of writing a blog post summarizing the webinar, I thought I would list my tweets with brief commentary.  My goal is to provide an example of this particular use of Twitter so you can decide its value for yourself.

 

As the webinar begins, Marty Moseley and Jill Dyché provide some initial thoughts on data governance:

Live-Tweets 1

 

Jill Dyché provides a great list of data governance myths and facts:

Live-Tweets 2

 

Jill Dyché provides some data stewardship insights:

Live-Tweets 3

 

As the webinar ends, Marty Moseley and Jill Dyché provide some closing thoughts about data governance and data quality:

Live-Tweets 4

 

Please Share Your Thoughts

If you attended the webinar, then you know additional material was presented.  Did my tweets do the webinar justice?  Did you follow along on Twitter during the webinar?  If you did not attend the webinar, then are these tweets helpful?

What are your thoughts in general regarding the pros and cons of live-tweeting? 

 

Related Posts

The following three blog posts are conference reports based largely on my live-tweets from the events:

Enterprise Data World 2009

TDWI World Conference Chicago 2009

DataFlux IDEAS 2009

Data Quality is Sexy

 


I am sick and tired of hearing people talk about how data quality (DQ) is not sexy.

I was talking with my friend J.T. the other day and he told me I simply needed to remind people data quality has always been sexy.  Sometimes, people just have a tendency to forget. 

J.T. told me:

“You know what you gotta do J.H.?  You gotta bring DQ Sexy back.”

True dat, J.T.

 

I'm Bringing DQ Sexy Back

 


 

I’m bringing DQ Sexy back

All you naysayers, watch how I attack

I think your data’s special, why does your quality lack?

Grant me some access, and I’ll pick up the slack

 

 


 

Dirty data – you see the problems everywhere

Let me be your data cleanser, and baby, I'll be there

We'll whip the Business Process if it misbehaves

But just remember – trying to be perfect – it's not the way

 

 


I’m bringing DQ Sexy back

Them non-team players don’t know how to act

Let our collaboration get us back on track

Working together, we'll make the right impact

 

 


 

Look at that data – it's your 'prise asset 
Treat it well, and all your business needs will be met

Understanding it will really make you smile 
To get started, you really need to profile

There's no need for you to be afraid – come on 
Go ahead – get your data freak on

 


I’m bringing DQ Sexy back

Any non-believers left?  Don't make me give you a smack

If you have data, you'd better watch out for what it lacks

'Cause quality is what it needs – and that’s a fact

 

 

Data Quality is Sexy


That’s right. 

Data Quality is Sexy. 

Always has been. 

Always will be.

True dat, J.H.

Fo real!

 

Adventures in Data Profiling (Part 8)

Understanding your data is essential to using it effectively and improving its quality – and to achieve these goals, there is simply no substitute for data analysis.  This post is the conclusion of a vendor-neutral series on the methodology of data profiling.

Data profiling can help you perform essential analysis such as:

  • Provide a reality check for the perceptions and assumptions you may have about the quality of your data
  • Verify your data matches the metadata that describes it
  • Identify different representations for the absence of data (i.e., NULL and other missing values)
  • Identify potential default values
  • Identify potential invalid values
  • Check data formats for inconsistencies
  • Prepare meaningful questions to ask subject matter experts

Data profiling can also help you with many of the other aspects of domain, structural and relational integrity, as well as determining functional dependencies, identifying redundant storage, and other important data architecture considerations.

 

Adventures in Data Profiling

This series was carefully designed as guided adventures in data profiling in order to provide the necessary framework for demonstrating and discussing the common functionality of data profiling tools and the basic methodology behind using one to perform preliminary data analysis.

In order to narrow the scope of the series, the scenario used was that a customer data source for a new data quality initiative had been made available to an external consultant with no prior knowledge of the data or its expected characteristics.  Additionally, business requirements had not yet been documented, and subject matter experts were not currently available.

This series did not attempt to cover every possible feature of a data profiling tool or even every possible use of the features that were covered.  Both the data profiling tool and data used throughout the series were fictional.  The “screen shots” were customized to illustrate concepts and were not modeled after any particular data profiling tool.

This post summarizes the lessons learned throughout the series, and is organized under three primary topics:

  1. Counts and Percentages
  2. Values and Formats
  3. Drill-down Analysis

 

Counts and Percentages

One of the most basic features of a data profiling tool is the ability to provide counts and percentages for each field that summarize its content characteristics:

 Data Profiling Summary

  • NULL – count of the number of records with a NULL value 
  • Missing – count of the number of records with a missing value (i.e., non-NULL absence of data, e.g., character spaces) 
  • Actual – count of the number of records with an actual value (i.e., non-NULL and non-Missing) 
  • Completeness – percentage calculated as Actual divided by the total number of records 
  • Cardinality – count of the number of distinct actual values 
  • Uniqueness – percentage calculated as Cardinality divided by the total number of records 
  • Distinctness – percentage calculated as Cardinality divided by Actual
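As a minimal sketch of how these statistics relate to each other, the Python below computes them for a single field.  The function name profile_field, the treatment of whitespace-only strings as Missing, and the sample Tax ID values are all assumptions made for illustration; they are not modeled after any particular data profiling tool.

```python
def profile_field(values):
    """Compute the summary statistics listed above for one field.

    Assumption: None represents NULL, and an empty or whitespace-only
    string represents a Missing value.
    """
    def is_missing(v):
        return isinstance(v, str) and v.strip() == ""

    total = len(values)
    null_count = sum(1 for v in values if v is None)
    missing_count = sum(1 for v in values if is_missing(v))
    actual = [v for v in values if v is not None and not is_missing(v)]
    cardinality = len(set(actual))

    def pct(part, whole):
        # Guard against division by zero for empty fields.
        return round(100.0 * part / whole, 2) if whole else 0.0

    return {
        "null": null_count,
        "missing": missing_count,
        "actual": len(actual),
        "completeness": pct(len(actual), total),
        "cardinality": cardinality,
        "uniqueness": pct(cardinality, total),
        "distinctness": pct(cardinality, len(actual)),
    }


# Hypothetical Tax ID values: one NULL, one Missing, one repeated value.
tax_ids = ["111", "222", "222", None, "   ", "333"]
summary = profile_field(tax_ids)
```

With these six sample records, four have an actual value and three of those values are distinct, so distinctness falls below 100% because "222" occurs on more than one record.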

Completeness and uniqueness are particularly useful in evaluating potential key fields and especially a single primary key, which should be both 100% complete and 100% unique.  In Part 2, Customer ID provided an excellent example.

Distinctness can be useful in evaluating the potential for duplicate records.  In Part 6, Account Number and Tax ID were used as examples.  Both fields were less than 100% distinct (i.e., some distinct actual values occurred on more than one record).  The implied business meaning of these fields made this an indication of possible duplication.

Data profiling tools generate other summary statistics including: minimum/maximum values, minimum/maximum field sizes, and the number of data types (based on analyzing the values, not the metadata).  Throughout the series, several examples were provided, especially in Part 3 during the analysis of Birth Date, Telephone Number and E-mail Address.

 

Values and Formats

In addition to counts, percentages, and other summary statistics, a data profiling tool generates frequency distributions for the unique values and formats found within the fields of your data source.

A frequency distribution of unique values is useful for:

  • Fields with an extremely low cardinality, indicating potential default values (e.g., Country Code in Part 4)
  • Fields with a relatively low cardinality (e.g., Gender Code in Part 2)
  • Fields with a relatively small number of known valid values (e.g., State Abbreviation in Part 4)

A frequency distribution of unique formats is useful for:

  • Fields expected to contain a single data type and/or length (e.g., Customer ID in Part 2)
  • Fields with a relatively limited number of known valid formats (e.g., Birth Date in Part 3)
  • Fields with free-form values and a high cardinality (e.g., Customer Name 1 and Customer Name 2 in Part 7)
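The two kinds of frequency distribution described above can be sketched in a few lines of Python.  The masking convention used here (letters become "A", digits become "9", everything else is kept as-is) is one common choice that I am assuming for illustration, and the function names and sample Birth Date values are hypothetical.

```python
from collections import Counter


def to_format(value):
    """Mask a value: letters become 'A', digits become '9', others are kept."""
    return "".join(
        "A" if ch.isalpha() else "9" if ch.isdigit() else ch
        for ch in value
    )


def frequency_distribution(items):
    """Return (item, count) pairs, most frequent first."""
    return Counter(items).most_common()


# Hypothetical Birth Date values recorded in mixed formats.
birth_dates = ["1975-03-14", "03/14/1975", "1980-07-01", "July 1, 1980"]
value_counts = frequency_distribution(birth_dates)
format_counts = frequency_distribution(to_format(v) for v in birth_dates)
```

Here every value is distinct, so the value distribution says little, while the format distribution collapses the four values into three formats and shows which format is the most common.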

Cardinality can play a major role in deciding whether you want to be shown values or formats since it is much easier to review all of the values when there are not very many of them.  Alternatively, the review of high cardinality fields can also be limited to the most frequently occurring values, as we saw throughout the series (e.g., Telephone Number in Part 3).

Some fields can also be analyzed using partial values (e.g., in Part 3, Birth Year was extracted from Birth Date) or a combination of values and formats (e.g., in Part 6, Account Number had an alpha prefix followed by all numbers).

Free-form fields are often easier to analyze as formats constructed by parsing and classifying the individual values within the field.  This analysis technique is often necessary since not only is the cardinality of free-form fields usually very high, but they also tend to have a very high distinctness (i.e., the exact same field value rarely occurs on more than one record). 

Additionally, the most frequently occurring formats for free-form fields will often collectively account for a large percentage of the records with an actual value in the field.  Examples of free-form field analysis were the focal points of Part 5 and Part 7.

We also saw examples of how valid values in a valid format can have an invalid context (e.g., in Part 3, Birth Date values set in the future), as well as how valid field formats can conceal invalid field values (e.g., Telephone Number in Part 3).
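The first of these cases, a valid value in a valid format with an invalid context, can be sketched as a simple check.  This is only an illustration under stated assumptions: the function name birth_date_issue is hypothetical, dates are assumed to be in ISO format (YYYY-MM-DD), and "today" is passed in explicitly so the example is repeatable.

```python
from datetime import date


def birth_date_issue(text, today):
    """Classify a birth date string; return None when no issue is found.

    Assumption: dates are expected in ISO format (YYYY-MM-DD).
    """
    try:
        parsed = date.fromisoformat(text)
    except ValueError:
        return "invalid format or value"
    if parsed > today:
        return "valid format, invalid context (future date)"
    return None


today = date(2009, 12, 21)  # fixed reference date for the example
samples = ["1975-03-14", "2075-03-14", "1975-02-30"]
issues = {text: birth_date_issue(text, today) for text in samples}
```

The second sample parses cleanly and would pass a format check, yet it cannot be a real birth date; the third looks well formed but is not a real calendar date.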

Part 3 also provided examples (in both Telephone Number and E-mail Address) of how you should not mistake completeness (which as a data profiling statistic indicates a field is populated with an actual value) for an indication the field is complete in the sense that its value contains all of the sub-values required to be considered valid. 

 

Drill-down Analysis

A data profiling tool will also provide the capability to drill down on its statistical summaries and frequency distributions in order to perform a more detailed review of records of interest.  Drill-down analysis will often provide useful data examples to share with subject matter experts.

Performing a preliminary analysis on your data prior to engaging in these discussions better facilitates meaningful dialogue because real-world data examples better illustrate actual data usage.  As stated earlier, understanding your data is essential to using it effectively and improving its quality.

Various examples of drill-down analysis were used throughout the series.  However, drilling all the way down to the record level was shown in Part 2 (Gender Code), Part 4 (City Name), and Part 6 (Account Number and Tax ID).
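Drilling down to the record level can be sketched as grouping records by a field and keeping only the values that occur on more than one record.  The function name drill_down_duplicates, the field names, and the customer records below are all invented for illustration.

```python
from collections import defaultdict


def drill_down_duplicates(records, field):
    """Group records by a field and keep only values shared by 2+ records."""
    groups = defaultdict(list)
    for record in records:
        value = record.get(field)
        # Skip NULL and Missing values; only actual values are grouped.
        if value is not None and str(value).strip():
            groups[value].append(record)
    return {value: recs for value, recs in groups.items() if len(recs) > 1}


# Hypothetical customer records; names and numbers are invented.
customers = [
    {"customer_id": 1, "name": "Acme Corp", "account_number": "AB1001"},
    {"customer_id": 2, "name": "ACME Corporation", "account_number": "AB1001"},
    {"customer_id": 3, "name": "Globex", "account_number": "CD2002"},
    {"customer_id": 4, "name": "Initech", "account_number": None},
]
suspects = drill_down_duplicates(customers, "account_number")
```

The result is the kind of real-world data example worth bringing to a subject matter expert: two records share an Account Number but have different names, which may or may not be a duplicate.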

 

Conclusion

Fundamentally, this series posed the following question: What can just your analysis of data tell you about it?

Data profiling is typically one of the first tasks performed on a data quality initiative.  I am often told to delay data profiling until business requirements are documented and subject matter experts are available to answer my questions. 

I always disagree – and begin data profiling as soon as possible.

I can do a better job of evaluating business requirements and preparing for meetings with subject matter experts after I have spent some time looking at data from a starting point of blissful ignorance and curiosity.

Ultimately, I believe the goal of data profiling is not to find answers, but instead, to discover the right questions.

Discovering the right questions is a critical prerequisite for effectively discussing data usage, relevancy, standards, and the metrics for measuring and improving quality, all of which are necessary in order to progress from just profiling your data to performing a full data quality assessment (which I will cover in a future series on this blog).

A data profiling tool can help you by automating some of the grunt work needed to begin your analysis.  However, it is important to remember that the analysis itself cannot be automated – you need to review the statistical summaries and frequency distributions generated by the data profiling tool and, more importantly, translate your analysis into meaningful reports and questions to share with the rest of your team.

Always remember that well-performed data profiling is both a highly interactive and a very iterative process.

 

Thank You

I want to thank you for providing your feedback throughout this series. 

As my fellow Data Gazers, you provided excellent insights and suggestions via your comments. 

The primary reason I published this series on my blog, as opposed to simply writing a whitepaper or a presentation, was because I knew our discussions would greatly improve the material.

I hope this series proves to be a useful resource for your actual adventures in data profiling.

 

The Complete Series


Recently Read: November 28, 2009

Recently Read is an OCDQ regular segment.  Each entry provides links to blog posts, articles, books, and other material I found interesting enough to share.  Please note “recently read” is literal – therefore what I share wasn't necessarily recently published.

 

Data Quality Blog Posts

For simplicity, “Data Quality” also includes Data Governance, Master Data Management, and Business Intelligence.

 

Social Media Blog Posts

For simplicity, “Social Media” also includes Blogging, Social Networking, and Online Marketing.

 

Book Quotes

An eclectic list of quotes from some recently read (and/or simply my favorite) books.

  • From The Wisdom of Crowds by James Surowiecki – “Refuse to allow the merit of an idea to be determined by the status of the person advocating it.”

     

  • From Purple Cow by Seth Godin – “We mistakenly believe that criticism leads to failure.”

     

  • From How We Decide by Jonah Lehrer – “The best decision-makers don't despair.  Instead, they become students of error, determined to learn from what went wrong.”

     

  • From The Whuffie Factor by Tara Hunt – “Whuffie is the residual outcome—the currency—of your reputation.  You lose or gain it based on positive or negative actions, your contributions to the community, and what people think of you.”

     

  • From Trust Agents by Chris Brogan and Julien Smith – “You accrue social capital as a side benefit of doing good, but doing good by itself is its own reward.”

Commendable Comments (Part 4)

Thanksgiving

Photo via Flickr (Creative Commons License) by: ella_marie 

Today is Thanksgiving Day, which is a United States holiday with a long and varied history.  The most consistent themes remain family and friends gathering together to share a large meal and express their gratitude.

This is the fourth entry in my ongoing series for expressing my gratitude to my readers for their truly commendable comments on my blog posts.  Receiving comments is the most rewarding aspect of my blogging experience.  Although I am truly grateful to all of my readers, I am most grateful to my commenting readers. 

 

Commendable Comments

On Days Without A Data Quality Issue, Steve Sarsfield commented:

“Data quality issues probably occur on some scale in most companies every day.  As long as you qualify what is and isn't a data quality issue, this gets back to what the company thinks is an acceptable level of data quality.

I've always advocated aggregating data quality scores to form business metrics.  For example, what data quality metrics would you combine to ensure that customers can always be contacted in case of an upgrade, recall or new product offering?  If you track the aggregation, it gives you more of a business feel.”

On Customer Incognita, Daragh O Brien commented:

“Back when I was with the phone company I was (by default) the guardian of the definition of a 'Customer'.  Basically I think they asked for volunteers to step forward and I was busy tying my shoelace when the other 11,000 people in the company as one entity took a large step backwards.

I found that the best way to get a definition of a customer was to lock the relevant stakeholders in a room and keep asking 'What' and 'Why'. 

My 'data modeling' methodology was simple.  Find out what the things were that were important to the business operation, define each thing in English without a reference to itself, and then we played the 'Yes/No Game Show' to figure out how that entity linked to other things and what the attributes of that thing were.

Much to IT's confusion, I insisted that the definition needed to be a living thing, not carved in two stone tablets we'd lug down from on top of the mountain. 

However, because of the approach that had been taken we found that when new requirements were raised (27 from one stakeholder), the model accommodated all of them either through an expansion of a description or the addition of a piece of reference data to part of the model.

Fast-forward a few months from the modeling exercise.  I was asked by IT to demo the model to a newly acquired subsidiary.  It was a significantly different business.  I played the 'Yes/No Game Show' with them for a day.  The model fitted their needs with just a minor tweak. 

The IT team from the subsidiary wanted to know how I had gone about normalizing the data to come up with the model, which is kind of like cutting up a perfectly good apple pie to find out what an apple is and how to make pastry.

What I found about the 'Yes/No Game Show' approach was that it made people open up their thinking a bit, but it took some discipline and perseverance on my part to keep asking what and why.  Luckily, having spent most of the previous few years trying to get these people to think seriously about data quality they already thought I was a moron so they were accommodating to me.

A key learning for me out of the whole thing is that, even if you are doing a data management exercise for a part of a larger business, you need to approach it in a way that can be evolved and continuously improved to ensure quality across the entire organization. 

Also, it highlighted the fallacy of assuming that a company can only have one kind of customer.”

On The Once and Future Data Quality Expert, Dylan Jones commented:

“I recently attended a conference and sat in on a panel that discussed some of the future trends, such as cloud computing.  It was a great discussion, highly polarized, and as I came home I thought about how far we've come as a profession but more importantly, how much more there is to do.

The reality is that the world is changing, the volumes of data held by businesses are immense and growing exponentially, our desire for new forms of information delivery insatiable, and the opportunities for innovation boundless.

I really believe we're not innovating as an industry anything like we should be.  The cloud, as an example, offers massive opportunities for a range of data quality services but I've certainly not read anything in the media or press that indicates someone is capitalizing on this.

There are a few recent data quality technology innovations which have caught my eye, but I also think there is so much more vendors should be doing.

On the personal side of the profession, I think online education is where we're headed.  The concept of localized training is now being replaced by online learning.  With the Internet you can now train people on every continent, so why aren't more people going down this route?

I find it incredibly ironic when I speak to data quality specialists who admit that 'they don't have the first clue about all this social media stuff.'  This is the next generation of information management, it's here right now, they should be embracing it.  I think if you're a 'guru' author, trainer or consultant you need to think of new ways to engage with your clients/trainees using the tools available.

What worries me is that the growth of information doesn't match the maturity and growth of our profession.  For example, we really need more people who can articulate the value of what we can offer. 

Ted Friedman made a great point on Twitter recently when he talked about how people should stop moaning about executives that 'don't get it' and instead focus on improving ways to demonstrate the value of data quality improvement.

Just because we've come a long way doesn't mean we know it all, there is still a hell of a long way to go.”

Thanks for giving your comments

Thank you very much for giving your comments and sharing your perspectives with our collablogaunity.  Since there have been so many commendable comments, please don't be offended if your commendable comment hasn't been featured yet. 

Please keep on commenting and stay tuned for future entries in the series. 

 

Related Posts

Commendable Comments (Part 1)

Commendable Comments (Part 2)

Commendable Comments (Part 3)

DQ-Tip: “Data quality is about more than just improving your data...”

Data Quality (DQ) Tips is an OCDQ regular segment.  Each DQ-Tip is a clear and concise data quality pearl of wisdom.

“Data quality is about more than just improving your data.

Ultimately, the goal is improving your organization.”

This DQ-Tip is from Tony Fisher's great book The Data Asset: How Smart Companies Govern Their Data for Business Success.

In the book, Fisher explains that one of the biggest mistakes organizations make is not viewing their data as a corporate asset.  This common misconception often prevents data quality from being rightfully viewed as a critical priority. 

Data quality is misperceived to be an activity performed just for the sake of improving data, when in fact it is an activity performed for the sake of improving business processes.

“Better data leads to better decisions,” explains Fisher, “which ultimately leads to better business.  Therefore, the very success of your organization is highly dependent on the quality of your data.”

 

Related Posts

DQ-Tip: “...Go talk with the people using the data”

DQ-Tip: “Data quality is primarily about context not accuracy...”

DQ-Tip: “Don't pass bad data on to the next person...”

Brevity is the Soul of Social Media

“Why day is day, night night, and time is time,
Were nothing but to waste night, day and time.
Therefore, since brevity is the soul of wit,
I will be brief ...”

Within the wide world of social media, one of the most common features is some form of social networking, microblogging, or short message service that allows users to share brief status updates.  Some social media sites are almost entirely built on only this feature (e.g., Twitter) whereas others (e.g., Facebook, LinkedIn) include it among a list of many other features. 

Either way, these status updates have created a rather pithy platform many people argue is incompatible with meaningful communication, especially of a professional nature.  I must admit this was also my initial opinion of social media.

However, I now believe not only is it the soul of wit, brevity is the soul of social media – and, in fact, a very good soul.

 

Short Attention Span Theater

I doubt attention deficit will still be considered a disorder ten years from now.  We are living increasingly faster-paced lives in an increasingly faster-paced world.  The pervasiveness of the Internet and the rapid proliferation of powerful mobile technology are making our world a smaller and smaller place and our lives a more and more crowded space. 

We have become so accustomed to multi-tasking that the very concept of focusing our attention on only one thing at a time somehow seems inherently wrong to us.  All the world's a stage within this short attention span theater.  And all of us are not merely players, we have been cast in several simultaneous roles.

Time management has always been important, but nowadays it is even more essential.  This is especially true when it comes to social media, which, if we can effectively and efficiently use it, has great personal and professional potential.  Amber Naslund recently provided an excellent blog series on social media time management that I highly recommend.

 

The Power of Pith

I admit I am a long-winded talker or, as a favorite (canceled) television show would say, “conversationally anal-retentive.”  In the past (slightly less now), I was also known for e-mail messages even Leo Tolstoy would declare to be far too long.

Therefore, it may be surprising to learn I am addicted to Twitter.  How could I possibly constrain myself to only 140 characters?  No, I don't use ellipses to extend my thoughts across multiple tweets (although I admit I am often tempted to do so). 

I wholeheartedly agree with Jennifer Blanchard, who explained how Twitter makes you a better writer.  When forced to be concise, you have to focus on exactly what you want to say, using as few words as possible. 

The power of pith means reducing your message to its bare essence.  In order to engage in effective dialogue on the stage of our short attention span theater, this is a required skill we all must master – and not just when we are on Twitter.

For those who argue this simply regresses human communication back to our days of monosyllabic grunting, I invite you to read the excellent recent blog post Is Twitter a Complex Adaptive System? written by Venessa Miemis.

Although you should read all of it, the point I need here will be found under Insight #4 toward the end of the post.  Miemis shares a study that reveals using Twitter can not only improve communication, but actually build intelligence. 

The collaborative communication enabled by social media platforms can actually contribute to a growing collective intelligence made up of all of us.  The power of pith is the wisdom of crowds.

 

Blogging with Brevity

Brevity is the soul of all social media and yes, this includes blogging as well.  Some view blogging as social media's last bastion of robust communication.  You can take your time and use all the words you want on your blog, right?  Sure, as long as you have no interest in anyone actually reading your blog.

Some bloggers get cranky with me when I emphasize the Three C’s – meaning your blog posts should be:

  1. Clear – Get to the point and stay on point
  2. Concise – No longer than necessary
  3. Consumable – Formatted to be easily read on a computer screen

Concise is usually the main cranky-causing culprit because everyone interprets it to mean “write really short posts.” 

One blogger told me he has “never met a subordinate clause he didn't like,” thereby expressing his fondness for writing compound-complex sentences.  For the non-writers, this means really long (but grammatically correct) sentences oftentimes requiring you to read them three or four times before truly comprehending their full meaning.

Don't get me wrong.  This particular blogger is an incredibly gifted writer known for his absolutely brilliant blog posts.  My only true criticism of his writing style is it truly requires a significant time commitment.

Michelle Russell does a great job explaining how to write with a knife.  No, not literally.  Writing with a knife means writing for yourself, but editing for your readers.  Editing is the hardest part of writing, but also the most important. 

Blogging with brevity doesn't necessarily mean “write really short posts.”  Being concise simply means taking out anything that doesn't need to be included.  For example, you really didn't need to read the additional jokes and Shakespearean references included in the first draft of this post.

 

The Future of Brevity is Bright

Some predict the size limits of message service standards and status updates will be increased.  Others predict new social media platforms will be based on different paradigms.  Either way, innovation will eventually deliver an ability to be more verbose.

However, barring some major scientific breakthrough (or some major breakdown in the space-time continuum), there will still only be 24 hours in a day.  Therefore, no matter what happens, I am certain the future of brevity is bright.

Neither the world nor the people in it are likely to slow down.  Our attention spans will remain short.  Our time management skills will remain vigilant.  We will communicate through the power of pith, brevity will remain the soul of both wit and social media, and hopefully, we will all “live long and prosper.”

 

Related Posts

The Mullet Blogging Manifesto

Collablogaunity

Podcast: Your Blog, Your Voice

Collablogaunity

The meteoric rise of the Internet coupled with social media has created an amazing medium that is enabling people who are separated by vast distances and disparate cultures to come together, communicate, and collaborate in ways few would have thought possible just a few decades ago.  Blogging, especially when effectively integrated with social networking, can be one of the most powerful aspects of social media.

The great advantage to blogging as a medium, as opposed to books, newspapers, magazines, and even presentations, is that blogging is not just about broadcasting a message. 

This is not to say that books, newspapers, and magazines aren't useful (they certainly can be) or that presentations lack an interactive component (they certainly should not).  I simply believe that, when done well, blogging better facilitates effective communication by starting a conversation, encouraging collaboration, and fostering a true sense of community.

Mashing together the words collaboration, blog, and community, I use the term collablogaunity — which is pronounced “Call a Blog a Unity” — to describe how remarkable blogs do this remarkably well.

 

Conversation

Blogging is a conversation — with your readers. 

I love the sound of my own voice and I talk to myself all the time (even in public).  However, the two-way conversation that blogging provides via comments from my readers greatly improves the quality of my blog content —  because it helps me better appreciate the difference between what I know and what I only think I know.

Without comments, the conversation is only one way.  Engaging readers in dialogue and discussion allows some of your points to be made for you by those who take the time to comment as opposed to you just telling everyone how you see the world.

Blogging isn't about using the Internet as your own personal bullhorn for broadcasting your message.  In her wonderful book The Whuffie Factor, Tara Hunt explains that you really need to:

“Turn the bullhorn around: stop talking, start listening, and create continuous conversations.”

Respond to the comments you receive (but never feed the troll).  You don't have to respond immediately.  Sometimes, the conversation will go more smoothly without your involvement as your readers talk amongst themselves.  Other times, your response will help continue the conversation and encourage participation from others. 

Always demonstrate that feedback is both welcome and appreciated.  Make sure to never talk down to your readers (either in your blog post or your comment responses).  It is perfectly fine to disagree and debate, just don't denigrate.  

In a recent guest post on ProBlogger, Rob McPhillips explained: 

“If instead, you are all the time only seeking praise and approval from everyone, then there is nothing solid, consistent or certain about your blog and so ultimately it will never gather a sizeable core of die hard fans.  Only drive by readers who scan a post and never look back.” 

Collaboration

Blogging is a collaboration — with other bloggers.

While conversation is primarily between you and your readers, collaboration is primarily between you and other bloggers.  Although you may be inclined to view other bloggers as “the competition,” especially those within your own niche, this would be a mistake.  Yes, it is true that blogs are competing with each other for readers.  However, sustainable success is achieved through collaboration and friendly competition with your peers.

Brian Clark has explained in the past and continues to exemplify that strategic collaboration is the secret to 21st century success.  Clark has stated that if he had to reduce his recipe for success to just three ingredients, it would be content, copywriting, and collaboration.  And if he had to give up two of those, then he'd keep collaboration.

In their terrific book Trust Agents, Chris Brogan and Julien Smith explain that although people in most cultures view themselves as the central hero in their life's story, the reality is that you need to build an army because you can't do it all alone.

Collaboration between bloggers is mainly about networking and cross-promotion.  You should network with other bloggers, especially those within your own niche.  This can be accomplished in a number of ways, including e-mail introductions, Twitter direct messages (if the other blogger is following you), LinkedIn connection requests, or Facebook friend requests.

As with any networking, the most important thing is being genuine.  As Darren Rowse and Chris Garrett explained in their highly recommended ProBlogger book, when you network with other bloggers, keep it real, be specific, keep it brief without being rude, and explain why you are interested in connecting.  They rightfully emphasize the importance of that last point.

As we all know, although content may be king, marketing is queen.  Networking with other bloggers can help you get the word out about your brilliant blog and its penchant for publishing posts that everyone must read.  Adding other bloggers to your blogroll, linking to their posts when applicable to your content, and leaving meaningful comments on their posts are not only recommended best practices of netiquette, they are also just the right thing to do.

Too many bloggers have a selfish networking and marketing strategy.  They only promote their own content and then wonder why nobody reads their blog.  I am fond of referring to all social media as Social Karma.  Focus on helping other bloggers promote their content and they will likely be more willing to return the favor.  However, don't misunderstand this technique to be a pathetic peer pressure tactic (in other words: “I re-tweeted your blog post, why didn't you re-tweet my blog post?”).

One last point on collaboration is to set realistic expectations — for others and for yourself.  You should definitely try to help others when you can.  However, you simply can't help everyone.  Don't let people take advantage of your generosity. 

Politely, but firmly, say no when you need to say no.  Also extend the same courtesy to other people when they turn you down (or simply ignore you) when you try to connect with them or when you ask them for their help. 

Mean and selfish people definitely suck.  But let's face it, nobody's perfect — we all have bad days, we all occasionally say and do stupid things, and we all occasionally treat people worse than they deserve to be treated.  So don't be too hard on people when they disappoint you, because tomorrow it will probably be your turn to have a bad day.

 

Community

Blogging is a community service.

If you truly believe and actually practice the principles of both conversation and collaboration, then viewing blogging as a community service comes naturally.  You will truly be more interested in actually listening to what your readers have to say, and less interested in just broadcasting your message.  You will see your words as simply the catalyst that gets the conversation started, and when necessary, helps continue the discussion. 

You will see friends not foes when encountering your blogging peers.  You will help them celebrate their successes and quickly recover from their failures.  You will help others when you can and without worrying about what's in it for you.

As James Chartrand says, you will welcome people to your blog because you view blogging as a festival of people, a community strengthened by people, where everyone can speak up with great care and attention, sharing thoughts and views while openly accepting differing opinions.  Blogging is a community service providing a wealth of experience, thoughts and knowledge being shared by all sorts of participants.

In the closing keynote of this year's BlogWorld conference, Chris Brogan explained (from notes taken by David B. Thomas):

“Make it about them.  Stop looking at this as a cult of me. 

It has to be about your audience.  Turn them into a community. 

The difference between an audience and a community is the way you face the chairs. 

The difference between an audience and a community:

One will fall on its sword for you and the other will watch you fall.”

Collablogaunity

Pronounced: “Call a Blog a Unity”

There are literally millions of blogs on the Internet today.  Your blog (to quote Seth Godin) is “either remarkable or invisible.”

Remarkable blogs primarily do three things:

  1. Start conversations
  2. Encourage collaboration
  3. Foster a true sense of community

Remarkable blogs are collablogaunities.  Is your blog a collablogaunity?

 

Related Posts

The Mullet Blogging Manifesto

Brevity is the Soul of Social Media

Podcast: Your Blog, Your Voice