Commendable Comments (Part 3)

In a July 2008 blog post on Men with Pens (one of the Top 10 Blogs for Writers 2009), James Chartrand explained:

“Comment sections are communities strengthened by people.”

“Building a blog community creates a festival of people” where everyone can, as Chartrand explained, “speak up with great care and attention, sharing thoughts and views while openly accepting differing opinions.”

I agree with James (and not just because of his cool first name) – my goal for this blog is to foster an environment in which a diversity of viewpoints is freely shared without bias.  Everyone is invited to get involved in the discussion and have an opportunity to hear what others have to offer.  This blog's comment section has become a community strengthened by your contributions.

This is the third entry in my ongoing series celebrating my heroes – my readers.

 

Commendable Comments

On The Fragility of Knowledge, Andy Lunn commented:

“In my field of Software Development, you simply cannot rest and rely on what you know.  The technology you master today will almost certainly evolve over time and this can catch you out.  There's no point being an expert in something no one wants any more!  This is not always the case, but don't forget to come up for air and look around for what's changing.

I've lost count of the number of organizations I've seen who have stuck with a technology that was fresh 15 years ago and a huge stagnant pot of data, who are now scrambling to come up to speed with what their customers expect.  Throwing endless piles of cash at the problem, hoping to catch up.

What am I getting at?  The secret I've learned is to adapt.  This doesn't mean jump on every new fad immediately, but be aware of it.  Follow what's trending, where the collective thinking is heading and most importantly, what do your customers want?

I just wish more organizations would think like this and realize that the systems they create, the data they hold, and the customers they have are in a constant state of flux.  They are all projects that need care and attention.  All subject to change, there's no getting away from it, but small, well planned changes are a lot less painful, trust me.”

On DQ-Tip: “Data quality is primarily about context not accuracy...”, Stephen Simmonds commented:

“I have to agree with Rick about data quality being in the eye of the beholder – and with Henrik on the several dimensions of quality.

A theme I often return to is 'what does the business want/expect from data?' – and when you hear them talk about quality, it's not just an issue of accuracy.  The business stakeholder cares – more than many seem to notice – about a number of other issues that are squarely BI concerns:

– Timeliness ('WHEN I want it')
– Format ('how I want to SEE it') – visualization, delivery channels
– Usability ('how I want to then make USE of it') – being able to extract information from a report (say) for other purposes
– Relevance ('I want HIGHLIGHTED the information that is meaningful to me')

And so on.  Yes, accuracy is important, and it messes up your effectiveness when delivering inaccurate information.  But that's not the only thing a business stakeholder can raise when discussing issues of quality.  A report can be rejected as poor quality if it doesn't adequately meet business needs in a far more general sense.  That is the constant challenge for a BI professional.”

On Mistake Driven Learning, Ken O'Connor commented:

“There is a Chinese proverb that says:

'Tell me and I'll forget; Show me and I may remember; Involve me and I'll understand.'

I have found the above to be very true, especially when seeking to brief a large team on a new policy or process.  Interaction with the audience generates involvement and a better understanding.

The challenge facing books, whitepapers, blog posts etc. is that they usually 'Tell us,' they often 'Show us,' but they seldom 'Involve us.'

Hence, we struggle to remember, and struggle even more to understand.  We learn best by 'doing' and by making mistakes.”

You Are Awesome

Thank you very much for your comments.  For me, the best part of blogging is the dialogue and discussion provided by interactions with my readers.  Since there have been so many commendable comments, please don't be offended if your commendable comment hasn't been featured yet.  Please keep on commenting and stay tuned for future entries in the series.

By the way, even if you have never posted a comment on my blog, you are still awesome — feel free to tell everyone I said so.

 

Related Posts

Commendable Comments (Part 1)

Commendable Comments (Part 2)

Tweet 2001: A Social Media Odyssey

HAL 9000: “I am putting myself to the fullest possible use, which is all I think that any conscious entity can ever hope to do.”

As I get closer and closer to my 2001st tweet on Twitter, I wanted to pause for some quiet reflection on my personal odyssey in social media – but then I decided to blog about it instead.

 

The Dawn of OCDQ

Except for LinkedIn, my epic drama of social media adventure and exploration started with my OCDQ blog.

In my Data Quality Pro article Blogging about Data Quality, I explained why I started this blog and discussed some of my thoughts on blogging.  Most importantly, I explained that I am neither a blogging expert nor a social media expert.

But now that I have been blogging and using social media for over six months, I feel more comfortable sharing my thoughts and personal experiences with social media without worrying about sounding like too much of an idiot (no promises, of course).

 

LinkedIn

My social media odyssey began in 2007 when I created my account on LinkedIn, which, I admit, I initially viewed as just an online resume.  I put little effort into my profile, made only a few connections, and joined only a few groups.

Last year (motivated by the economic recession), I started using LinkedIn more extensively.  I updated my profile with a complete job history, asked my colleagues for recommendations, expanded my network with more connections, and joined more groups.  I also used LinkedIn applications (e.g. Reading List by Amazon and Blog Link) to further enhance my profile.

My favorite feature is the LinkedIn Groups, which not only provide an excellent opportunity to connect with other users, but also provide Discussions, News (including support for RSS feeds), and Job Postings.

By no means a comprehensive list, here are some LinkedIn Groups that you may be interested in:

For more information about LinkedIn features and benefits, check out the following posts on the LinkedIn Blog:

 

Twitter

Shortly after launching my blog in March 2009, I created my Twitter account to help promote my blog content.  In blogging, content is king, but marketing is queen.  LinkedIn (via group news feeds) is my leading source of blog visitors from social media, but Twitter isn't far behind. 

However, as Michele Goetz of Brain Vibe explained in her blog post Is Twitter an Effective Direct Marketing Tool?, Twitter has a click-through rate equivalent to direct mail.  Citing research from Pear Analytics, a “useful” tweet was found to have a shelf life of about one hour with about a 1% click-through rate on links.

In his blog post Is Twitter Killing Blogging?, Ajay Ohri of Decision Stats examined whether Twitter was a complement or a substitute for blogging.  I created a Data Quality on Twitter page on my blog in order to illustrate what I have found to be the complementary nature of tweeting and blogging. 

My ten blog posts receiving the most tweets (tracked using the Retweet Button from TweetMeme):

  1. The Nine Circles of Data Quality Hell 13 Tweets
  2. Adventures in Data Profiling (Part 1) 13 Tweets
  3. Fantasy League Data Quality 12 Tweets
  4. Not So Strange Case of Dr. Technology and Mr. Business 12 Tweets 
  5. The Fragility of Knowledge 11 Tweets
  6. The General Theory of Data Quality 9 Tweets
  7. The Very True Fear of False Positives 8 Tweets
  8. Data Governance and Data Quality 8 Tweets
  9. Adventures in Data Profiling (Part 3) 8 Tweets
  10. Data Quality: The Reality Show? 7 Tweets

Most of my social networking is done using Twitter (with LinkedIn being a close second).  I have also found Twitter to be great for doing research, which I complement with RSS subscriptions to blogs.

To search Twitter for data quality content:

If you are new to Twitter, then I would recommend reading the following blog posts:

 

Facebook

I also created my Facebook account shortly after launching my blog.  Although I almost exclusively use social media for professional purposes, I do use Facebook as a way to stay connected with family and friends. 

I created a page for my blog to separate my professional and personal aspects of Facebook without the need to manage multiple accounts.  Additionally, this allows you to become a “fan” of my blog without requiring you to also become my “friend.”

A quick note on Facebook games, polls, and trivia – I do not play them.  With my obsessive-compulsive personality, I have to ignore them.  Therefore, please don't be offended if, for example, I have ignored your invitation to play Mafia Wars.

By no means a comprehensive list, here are some Facebook Pages or Groups that you may be interested in:

 

Additional Social Media Websites

Although LinkedIn, Twitter, and Facebook are my primary social media websites, I also have accounts on three of the most popular social bookmarking websites: Digg, StumbleUpon, and Delicious.

Social bookmarking can be a great promotional tool that can help blog content go viral.  However, niche content is almost impossible to get to go viral.  Data quality is not just a niche – if technology blogging was a Matryoshka (a.k.a. Russian nested) doll, then data quality would be the last, innermost doll. 

This doesn't mean that data quality isn't an important subject – it just means that you will not see a blog post about data quality hitting the front pages of mainstream social bookmarking websites anytime soon.  Dylan Jones of Data Quality Pro created DQVote, which is a social bookmarking website dedicated to sharing data quality community content.

I also have an account on FriendFeed, which is an aggregator that can consolidate content from other social media websites, blogs, or anything providing an RSS feed.  My blog posts and my updates from other social media websites (except for Facebook) are automatically aggregated.  On Facebook, my personal page displays my FriendFeed content.

 

Social Media Tools and Services

Social media tools and services that I personally use (listed in no particular order):

  • Flock – The Social Web Browser Powered by Mozilla
  • TweetDeck – Connecting you with your contacts across Twitter, Facebook, MySpace and more
  • Digsby – Digsby = Instant Messaging (IM) + E-mail + Social Networks
  • Ping.fm – Update all of your social networks at once
  • HootSuite – The professional Twitter client
  • Twitterfeed – Feed your blog to Twitter
  • Google FeedBurner – Provide an e-mail subscription to your blog
  • TweetMeme – Add a Retweet Button to your blog
  • Squarespace Blog Platform – The secret behind exceptional websites

 

Social Media Strategy

As Darren Rowse of ProBlogger explained in his blog post How I use Social Media in My Blogging, Chris Brogan developed a social media strategy using the metaphor of a Home Base with Outposts.

“A home base,” explains Rowse, “is a place online that you own.”  For example, your home base could be your blog or your company's website.  “Outposts,” continues Rowse, “are places that you have an online presence out in other parts of the web that you might not own.”  For example, your outposts could be your LinkedIn, Twitter, and Facebook accounts.

According to Rowse, your Outposts will make your Home Base stronger by providing:

“Relationships, ideas, traffic, resources, partnerships, community and much more.”

Social Karma

An effective social media strategy is essential for both companies and individual professionals.  Using social media can help promote you, your expertise, your company and your products and services.

However, too many companies and individuals have a selfish social media strategy.

You should not use social media exclusively for self-promotion.  You should view social media as Social Karma.

If you can focus on helping others when you use social media, then you will get much more back than just a blog reader, a LinkedIn connection, a Twitter follower, a Facebook friend, or even a potential customer.

Yes, I use social media to promote myself and my blog content.  However, more than anything else, I use social media to listen, to learn, and to help others when I can.

 

Please Share Your Social Media Odyssey

As always, I am interested in hearing from you.  What have been your personal experiences with social media?

Commendable Comments (Part 2)

In a recent guest post on ProBlogger, Josh Hanagarne “quoted” Jane Austen:

“It is a truth universally acknowledged, that a blogger in possession of a good domain must be in want of some worthwhile comments.”

“The most rewarding thing has been that comments,” explained Hanagarne, “led to me meeting some great people I possibly never would have known otherwise.”  I wholeheartedly echo that sentiment. 

This is the second entry in my ongoing series celebrating my heroes – my readers.

 

Commendable Comments

Proving that comments are the best part of blogging, on The Data-Information Continuum, Diane Neville commented:

“This article is intriguing. I would add more still.

A most significant quote:  'Data could be considered a constant while Information is a variable that redefines data for each specific use.'

This tells us that Information draws from a snapshot of a Data store.  I would state further that the very Information [specification] is – in itself – a snapshot.

The earlier quote continues:  'Data is not truly a constant since it is constantly changing.'

Similarly, it is a business reality that 'Information is not truly a constant since it is constantly changing.'

The article points out that 'The Data-Information Continuum' implies a many-to-many relationship between the two.  This is a sensible CONCEPTUAL model.

Enterprise Architecture is concerned as well with its responsibility for application quality in service to each Business Unit/Initiative.

For example, in the interest of quality design in Application Architecture, an additional LOGICAL model must be maintained between a then-current Information requirement and the particular Data (snapshots) from which it draws.  [Snapshot: generally understood as captured and frozen – and uneditable – at a particular point in time.]  Simply put, Information Snapshots have a PARENT RELATIONSHIP to the Data Snapshots from which they draw.

Analyzing this further, refer to this further piece of quoted wisdom (from section 'Subjective Information Quality'):  '...business units and initiatives must begin defining their Information...by using...Data...as a foundation...necessary for the day-to-day operation of each business unit and initiative.'

From logically-related snapshots of Information to the Data from which it draws, we can see from this quote that yet another PARENT/CHILD relationship exists...that from Business Unit/Initiative Snapshots to the Information Snapshots that implement whatever goals are the order of the day.  But days change.

If it is true that 'Data is not truly a constant since it is constantly changing,' and if we can agree that Information is not truly a constant either, then we can agree to take a rational and profitable leap to the truth that neither is a Business Unit/Initiative...since these undergo change as well, though they represent more slowly-changing dimensions.

Enterprises have an increasing responsibility for regulatory/compliance/archival systems that will qualitatively reproduce the ENTIRE snapshot of a particular operational transaction at any given point in time.

Thus, the Enterprise Architecture function has before it a daunting task:  to devise a holistic process that can SEAMLESSLY model the correct relationship of snapshots between Data (grandchild), Information (parent) and Business Unit/Initiative (grandparent).

There need be no conversion programs or redundant, throw-away data structures contrived to bridge the present gap.  The ability to capture the activities resulting from the undeniable point-in-time hierarchy among these entities is where tremendous opportunities lie.”

On Missed It By That Much, Vish Agashe commented:

“My favorite quote is 'Instead of focusing on the exceptions – focus on the improvements.'

I think that it is really important to define incremental goals for data quality projects and track the progress through percentage improvement over a period of time.

I think it is also important to manage the expectations that the goal is not necessarily to reach 100% (which will be extremely difficult if not impossible) clean data but the goal is to make progress to a point where the purpose for cleaning the data can be achieved in much better way than had the original data been used.

For example, if marketing wanted to use the contact data to create a campaign for those contacts which have a certain ERP system installed on-site.  But if the ERP information on the contact database is not clean (it is free text, in some cases it is absent etc...) then any campaign run on this data will reach only X% contacts at best (assuming only X% of contacts have ERP which is clean)...if the data quality project is undertaken to clean this data, one needs to look at progress in terms of % improvement.  How many contacts now have their ERP field cleaned and legible compared to when we started etc...and a reasonable goal needs to be set based on how much marketing and IT is willing to invest in these issues (which in turn could be based on ROI of the campaign based on increased outreach).”

Proving that my readers are way smarter than I am, on The General Theory of Data Quality, John O'Gorman commented:

“My theory of the data, information, knowledge continuum is more closely related to the element, compound, protein, structure arc.

In my world, there is no such thing as 'bad' data, just as there is no 'bad' elements.  Data is either useful or not: the larger the audience that agrees that a string is representative of something they can use, the more that string will be of value to me.

By dint of its existence in the world of human communication and in keeping with my theory, I can assign every piece of data to one of a fixed number of classes, each with characteristics of their own, just like elements in the periodic table.  And, just like the periodic table, those characteristics do not change.  The same 109 usable elements in the periodic table are found and are consistent throughout the universe, and our ability to understand that universe is based on that stability.

Information is simply data in a given context, like a molecule of carbon in flour.  The carbon retains all of its characteristics but the combination with other elements allows it to partake in a whole class of organic behavior. This is similar to the word 'practical' occurring in a sentence: Jim is a practical person or the letter 'p' in the last two words.

Where the analogue bends a bit is a cause of a lot of information management pain, but can be rectified with a slight change in perspective.  Computers (and almost all indexes) have a hard time with homographs: strings that are identical but that mean different things.  By creating fixed and persistent categories of data, my model suffers no such pain.

Take the word 'flies' in the following: 'Time flies like an arrow.' and 'Fruit flies like a pear.'  The data 'flies' can be permanently assigned to two different places, and their use determines which instance is relevant in the context of the sentence.  One instance is a verb, the other a plural noun.

Knowledge, in my opinion, is the ability to recognize, predict and synthesize patterns of information for past, present and future use, and more importantly to effectively communicate those patterns in one or more contexts to one or more audiences.

On one level, the model for information management that I use makes no apparent distinction between the data: we all use nouns, adjectives, verbs and sometimes scalar objects to communicate.  We may compress those into extremely compact concepts but they can all be unraveled to get at elemental components. At another level every distinction is made to insure precision.

The difference between information and knowledge is experiential and since experience is an accumulative construct, knowledge can be layered to appeal to common knowledge, special knowledge and unique knowledge.

Common being the most easily taught and widely applied; Special being related to one or more disciplines and/or special functions; and, Unique to individuals who have their own elevated understanding of the world and so have a need for compact and purpose-built semantic structures.

Going back to the analogue, knowledge is equivalent to the creation by certain proteins of cartilage, the use to which that cartilage is put throughout a body, and the specific shape of the cartilage that forms my nose as unique from the one on my wife's face.

To me, the most important part of the model is at the element level.  If I can convince a group of people to use a fixed set of elemental categories and to reference those categories when they create information, it's amazing how much tension disappears in the design, creation and deployment of knowledge.”

 

Tá mé buíoch díot

Daragh O Brien recently taught me the Irish Gaelic phrase Tá mé buíoch díot, which translates as I am grateful to you.

I am very grateful to all of my readers.  Since there have been so many commendable comments, please don't be offended if your commendable comment hasn't been featured yet.  Please keep on commenting and stay tuned for future entries in the series.

 

Related Posts

Commendable Comments (Part 1)

Commendable Comments (Part 3)

Commendable Comments (Part 1)

Six months ago today, I launched this blog by asking: Do you have obsessive-compulsive data quality (OCDQ)?

As of September 10, here are the monthly traffic statistics provided by my blog platform:

OCDQ Blog Traffic Overview

 

It Takes a Village (Idiot)

In my recent Data Quality Pro article Blogging about Data Quality, I explained why I started this blog.  Blogging provides me a way to demonstrate my expertise.  It is one thing for me to describe myself as an expert and another to back up that claim by allowing you to read my thoughts and decide for yourself.

In general, I have always enjoyed sharing my experiences and insights.  A great aspect of doing this via a blog (as opposed to only via whitepapers and presentations) is the dialogue and discussion provided via comments from my readers.

This two-way conversation not only greatly improves the quality of the blog content, but much more importantly, it helps me better appreciate the difference between what I know and what I only think I know. 

Even an expert's opinions are biased by the practical limits of their personal experience.  Having spent most of my career working with what is now mostly IBM technology, I sometimes have to pause and consider if some of that yummy Big Blue Kool-Aid is still swirling around in my head (since I “think with my gut,” I have to “drink with my head”).

Don't get me wrong – “You're my boy, Blue!” – but there are many other vendors and all of them also offer viable solutions driven by impressive technologies and proven methodologies.

Data quality isn't exactly the most exciting subject for a blog.  Data quality is not just a niche – if technology blogging was a Matryoshka (a.k.a. Russian nested) doll, then data quality would be the last, innermost doll. 

This doesn't mean that data quality isn't an important subject – it just means that you will not see a blog post about data quality hitting the front page of Digg anytime soon.

All blogging is more art than science.  My personal blogging style can perhaps best be described as mullet blogging – not “business in the front, party in the back” but “take your subject seriously, but still have a sense of humor about it.”

My blog uses a lot of metaphors and analogies (and sometimes just plain silliness) to try to make an important (but dull) subject more interesting.  Sometimes it works and sometimes it sucks.  However, I have never been afraid to look like an idiot.  After all, idiots are important members of society – they make everyone else look smart by comparison.

Therefore, I view my blog as a Data Quality Village.  And as the Blogger-in-Chief, I am the Village Idiot.

 

The Rich Stuff of Comments

Earlier this year in an excellent IT Business Edge article by Ann All, David Churbuck of Lenovo explained:

“You can host focus groups at great expense, you can run online surveys, you can do a lot of polling, but you won’t get the kind of rich stuff (you will get from blog comments).”

How very true.  But before we get to the rich stuff of our village, let's first take a look at a few more numbers:

  • Not counting this one, I have published 44 posts on this blog
  • Those blog posts have collectively received a total of 185 comments
  • Only 5 blog posts received no comments
  • 30 comments were actually me responding to my readers
  • 45 comments were from LinkedIn groups (23), SmartData Collective re-posts (17), or Twitter re-tweets (5)

The ten blog posts receiving the most comments:

  1. The Two Headed Monster of Data Matching 11 Comments
  2. Adventures in Data Profiling (Part 4) 9 Comments
  3. Adventures in Data Profiling (Part 2) 9 Comments
  4. You're So Vain, You Probably Think Data Quality Is About You 8 Comments
  5. There are no Magic Beans for Data Quality 8 Comments
  6. The General Theory of Data Quality 8 Comments
  7. Adventures in Data Profiling (Part 1) 8 Comments
  8. To Parse or Not To Parse 7 Comments
  9. The Wisdom of Failure 7 Comments
  10. The Nine Circles of Data Quality Hell 7 Comments

 

Commendable Comments

This post will be the first in an ongoing series celebrating my heroes – my readers.

As Darren Rowse and Chris Garrett explained in their highly recommended ProBlogger book: “even the most popular blogs tend to attract only about a 1 percent commenting rate.” 

Therefore, I am completely in awe of my blog's current 88 percent commenting rate.  Sure, I get my fair share of the simple and straightforward comments like “Great post!” or “You're an idiot!” but I decided to start this series because I am consistently amazed by the truly commendable comments that I regularly receive.
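Here is a quick back-of-the-envelope check of the numbers above. It assumes the commenting rate here means the share of posts that received at least one comment, which may not be the same measure as the 1 percent in the ProBlogger quote. The variable names are purely illustrative:

```python
# Figures taken from the statistics listed earlier in this post.
# Assumption: "commenting rate" = share of posts with at least one comment.
total_posts = 44
posts_without_comments = 5
total_comments = 185
offsite_comments = {"LinkedIn groups": 23, "SmartData Collective": 17, "Twitter": 5}

posts_with_comments = total_posts - posts_without_comments
commented_share = posts_with_comments / total_posts
offsite_total = sum(offsite_comments.values())
average_per_post = total_comments / total_posts

print(f"Posts with comments: {posts_with_comments} of {total_posts}")
print(f"Share of posts with comments: {commented_share:.1%}")
print(f"Off-site comments: {offsite_total}")
print(f"Average comments per post: {average_per_post:.1f}")
```

Under that reading, 39 of 44 posts is roughly 88.6 percent, which lines up with the 88 percent figure once the decimals are dropped.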

On The Data Quality Goldilocks Zone, Daragh O Brien commented:

“To take (or stretch) your analogy a little further, it is also important to remember that quality is ultimately defined by the consumers of the information.  For example, if you were working on a customer data set (or 'porridge' in Goldilocks terms) you might get it to a point where Marketing thinks it is 'just right' but your Compliance and Risk management people might think it is too hot and your Field Sales people might think it is too cold.  Declaring 'Mission Accomplished' when you have addressed the needs of just one stakeholder in the information can often be premature.

Also, one of the key learnings that we've captured in the IAIDQ over the past 5 years from meeting with practitioners and hosting our webinars is that, just like any Change Management effort, information quality change requires you to break the challenge into smaller deliverables so that you get regular delivery of 'just right' porridge to the various stakeholders rather than boiling the whole thing up together and leaving everyone with a bad taste in their mouths.  It also means you can more quickly see when you've reached the Goldilocks zone.”

On Data Quality Whitepapers are Worthless, Henrik Liliendahl Sørensen commented:

“Bashing in blogging must be carefully balanced.

As we all tend to find many things from gurus to tools in our own country, I have also found one of my favourite sayings from Søren Kirkegaard:

If One Is Truly to Succeed in Leading a Person to a Specific Place, One Must First and Foremost Take Care to Find Him Where He is and Begin There.

This is the secret in the entire art of helping.

Anyone who cannot do this is himself under a delusion if he thinks he is able to help someone else.  In order truly to help someone else, I must understand more than he–but certainly first and foremost understand what he understands.

If I do not do that, then my greater understanding does not help him at all.  If I nevertheless want to assert my greater understanding, then it is because I am vain or proud, then basically instead of benefiting him I really want to be admired by him.

But all true helping begins with a humbling.

The helper must first humble himself under the person he wants to help and thereby understand that to help is not to dominate but to serve, that to help is not to be the most dominating but the most patient, that to help is a willingness for the time being to put up with being in the wrong and not understanding what the other understands.”

On All I Really Need To Know About Data Quality I Learned In Kindergarten, Daniel Gent commented:

“In kindergarten we played 'Simon Says...'

I compare it as a way of following the requirements or business rules.

Simon says raise your hands.

Simon says touch your nose.

Touch your feet.

With that final statement you learned very quickly in kindergarten that you can be out of the game if you are not paying attention to what is being said.

Just like in data quality, to have good accurate data and to keep the business functioning properly you need to pay attention to what is being said, what the business rules are.

So when Simon says touch your nose, don't be touching your toes, and you'll stay in the game.”

Since there have been so many commendable comments, I could only list a few of them in the series debut.  Therefore, please don't be offended if your commendable comment didn't get featured in this post.  Please keep on commenting and stay tuned for future entries in the series.

 

Because of You

As Brian Clark of Copyblogger explains, The Two Most Important Words in Blogging are “You” and “Because.”

I wholeheartedly agree, but prefer to paraphrase it as: Blogging is “because of you.” 

Not you meaning me, the blogger – you meaning you, the reader.

Thank You.

 

Related Posts

Commendable Comments (Part 2)

Commendable Comments (Part 3)


TDWI World Conference Chicago 2009

Founded in 1995, TDWI (The Data Warehousing Institute™) is the premier educational institute for business intelligence and data warehousing that provides education, training, certification, news, and research for executives and information technology professionals worldwide.  TDWI conferences always offer a variety of full-day and half-day courses taught in an objective, vendor-neutral manner.  The courses taught are designed for professionals and taught by in-the-trenches practitioners who are well known in the industry.

 

TDWI World Conference Chicago 2009 was held May 3-8 in Chicago, Illinois at the Hyatt Regency Hotel and was a tremendous success.  I attended as a Data Quality Journalist for the International Association for Information and Data Quality (IAIDQ).

I used Twitter to provide live reporting from the conference.  Here are my notes from the courses I attended: 

 

BI from Both Sides: Aligning Business and IT

Jill Dyché, CBIP, is a partner and co-founder of Baseline Consulting, a management and technology consulting firm that provides data integration and business analytics services.  Jill is responsible for delivering industry and client advisory services, is a frequent lecturer and writer on the business value of IT, and writes the excellent Inside the Biz blog.  She is the author of acclaimed books on the business value of information: e-Data: Turning Data Into Information With Data Warehousing and The CRM Handbook: A Business Guide to Customer Relationship Management.  Her latest book, written with Evan Levy, is Customer Data Integration: Reaching a Single Version of the Truth.

Course Quotes from Jill Dyché:

  • Five Critical Success Factors for Business Intelligence (BI):
    1. Organization - Build organizational structures and skills to foster a sustainable program
    2. Processes - Align both business and IT development processes that facilitate delivery of ongoing business value
    3. Technology - Select and build technologies that deploy information cost-effectively
    4. Strategy - Align information solutions to the company's strategic goals and objectives
    5. Information - Treat data as an asset by separating data management from technology implementation
  • Three Different Requirement Categories:
    1. What is the business need, pain, or problem?  What business questions do we need to answer?
    2. What data is necessary to answer those business questions?
    3. How do we need to use the resulting information to answer those business questions?
  • “Data warehouses are used to make business decisions based on data – so data quality is critical”
  • “Even companies with mature enterprise data warehouses still have data silos - each business area has its own data mart”
  • “Instead of pushing a business intelligence tool, just try to get people to start using data”
  • “Deliver a usable system that is valuable to the business and not just a big box full of data”

 

TDWI Data Governance Summit

Philip Russom is the Senior Manager of Research and Services at TDWI, where he oversees many of TDWI’s research-oriented publications, services, and events.  Prior to joining TDWI in 2005, he was an industry analyst covering BI at Forrester Research, as well as a contributing editor with Intelligent Enterprise and Information Management (formerly DM Review) magazines.

Summit Quotes from Philip Russom:

  • “Data Governance usually boils down to some form of control for data and its usage”
  • “Four Ps of Data Governance: People, Policies, Procedures, Process”
  • “Three Pillars of Data Governance: Compliance, Business Transformation, Business Integration”
  • “Two Foundations of Data Governance: Business Initiatives and Data Management Practices”
  • “Cross-functional collaboration is a requirement for successful Data Governance”

 

Becky Briggs, CBIP, CMQ/OE, is a Senior Manager and Data Steward for Airlines Reporting Corporation (ARC) and has 25 years of experience in data processing and IT - the last 9 in data warehousing and BI.  She leads the program team responsible for product, project, and quality management, business line performance management, and data governance/stewardship.

Summit Quotes from Becky Briggs:

  • “Data Governance is the act of managing the organization's data assets in a way that promotes business value, integrity, usability, security and consistency across the company”
  • Five Steps of Data Governance:
    1. Determine what data is required
    2. Evaluate potential data sources (internal and external)
    3. Perform data profiling and analysis on data sources
    4. Data Services - Definition, modeling, mapping, quality, integration, monitoring
    5. Data Stewardship - Classification, access requirements, archiving guidelines
  • “You must realize and accept that Data Governance is a program and not just a project”
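
Step 3 in the list above (data profiling and analysis) can be illustrated with a minimal Python sketch.  This is an illustration only, not material from the summit: the `profile_column` helper and the sample rows are hypothetical, and the statistics shown (missing, distinct, and most common values) are a small subset of what a profiling tool would report.

```python
from collections import Counter

def profile_column(rows, column):
    """Return basic profile statistics for one column of a list of dict rows."""
    values = [row.get(column) for row in rows]
    # Treat None and blank strings as missing values.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0] if counts else None,
    }

# Hypothetical sample data, for illustration only.
customers = [
    {"id": 1, "state": "IL"},
    {"id": 2, "state": "IL"},
    {"id": 3, "state": ""},
    {"id": 4, "state": "Illinois"},
    {"id": 5, "state": None},
]

state_profile = profile_column(customers, "state")
print(state_profile)
```

Even this tiny profile raises the kind of question a data steward would want to ask: are "IL" and "Illinois" the same value, and why are two of five rows missing a state?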

 

Barbara Shelby is a Senior Software Engineer for IBM with over 25 years of experience holding positions of technical specialist, consultant, and line management.  Her global management and leadership positions encompassed network authentication, authorization application development, corporate business systems data architecture, and database development.

Summit Quotes from Barbara Shelby:

  • Four Common Barriers to Data Governance:
    1. Information - Existence of information silos and inconsistent data meanings
    2. Organization - Lack of end-to-end data ownership and organization cultural challenges
    3. Skill - Difficulty shifting resources from operational to transformational initiatives
    4. Technology - Business data locked in large applications and slow deployment of new technology
  • Four Key Decision Making Bodies for Data Governance:
    1. Enterprise Integration Team - Oversees the execution of CIO funded cross enterprise initiatives
    2. Integrated Enterprise Assessment - Responsible for the success of transformational initiatives
    3. Integrated Portfolio Management Team - Responsible for making ongoing business investment decisions
    4. Unit Architecture Review - Responsible for the IT architecture compliance of business unit solutions

 

Lee Doss is a Senior IT Architect for IBM with over 25 years of information technology experience.  He holds a patent for a process of aligning strategic capability for business transformation, and he has held various positions including strategy, design, development, and customer support for IBM networking software products.

Summit Quotes from Lee Doss:

  • Five Data Governance Best Practices:
    1. Create a sense of urgency that the organization can rally around
    2. Start small, grow fast...pick a few visible areas to set an example
    3. Sunset legacy systems (application, data, tools) as new ones are deployed
    4. Recognize the importance of organization culture…this will make or break you
    5. Always, always, always – Listen to your customers

 

Kevin Kramer is a Senior Vice President and Director of Enterprise Sales for UMB Bank and is responsible for development of sales strategy, sales tool development, and implementation of enterprise-wide sales initiatives.

Summit Quotes from Kevin Kramer:

  • “Without Data Governance, multiple sources of customer information can produce multiple versions of the truth”
  • “Data Governance helps break down organizational silos and shares customer data as an enterprise asset”
  • “Data Governance provides a roadmap that translates into best practices throughout the entire enterprise”

 

Kanon Cozad is a Senior Vice President and Director of Application Development for UMB Bank and is responsible for overall technical architecture strategy and oversees information integration activities.

Summit Quotes from Kanon Cozad:

  • “Data Governance identifies business process priorities and then translates them into enabling technology”
  • “Data Governance provides direction and Data Stewardship puts direction into action”
  • “Data Stewardship identifies and prioritizes applications and data for consolidation and improvement”

 

Jill Dyché, CBIP, is a partner and co-founder of Baseline Consulting, a management and technology consulting firm that provides data integration and business analytics services.  (For Jill's complete bio, please see above.)

Summit Quotes from Jill Dyché:

  • “The hard part of Data Governance is the data”
  • “No data will be formally sanctioned unless it meets a business need”
  • “Data Governance focuses on policies and strategic alignment”
  • “Data Management focuses on translating defined policies into executable actions”
  • “Entrench Data Governance in the development environment”
  • “Everything is customer data – even product and financial data”

 

Data Quality Assessment - Practical Skills

Arkady Maydanchik is a co-founder of Data Quality Group, a recognized practitioner, author, and educator in the field of data quality and information integration.  Arkady's data quality methodology and breakthrough ARKISTRA technology were used to provide services to numerous organizations.  Arkady is the author of the excellent book Data Quality Assessment, a frequent speaker at various conferences and seminars, and a contributor to many journals and online publications.  Data quality curriculum by Arkady Maydanchik can be found at eLearningCurve.

Course Quotes from Arkady Maydanchik:

  • “Nothing is worse for data quality than desperately trying to fix it during the last few weeks of an ETL project”
  • “Quality of data after conversion is in direct correlation with the amount of knowledge about actual data”
  • “Data profiling tools do not do data profiling - it is done by data analysts using data profiling tools”
  • “Data Profiling does not answer any questions - it helps us ask meaningful questions”
  • “Data quality is measured by its fitness to the purpose of use – it's essential to understand how data is used”
  • “When data has multiple uses, there must be data quality rules for each specific use”
  • “Effective root cause analysis requires not stopping after the answer to your first question - Keep asking: Why?”
  • “The central product of a Data Quality Assessment is the Data Quality Scorecard”
  • “Data quality scores must be both meaningful to a specific data use and be actionable”
  • “Data quality scores must estimate both the cost of bad data and the ROI of data quality initiatives”
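
The idea that each data use needs its own data quality rules, and its own score, can be sketched in a few lines of Python.  This is an illustration under stated assumptions, not the scorecard method taught in the course: the `score_by_use` function, the two data uses, the rules, and the sample records are all hypothetical.

```python
def score_by_use(records, rules_by_use):
    """Return, per data use, the fraction of records that pass all of that use's rules."""
    scorecard = {}
    for use, rules in rules_by_use.items():
        passed = sum(1 for record in records if all(rule(record) for rule in rules))
        scorecard[use] = passed / len(records) if records else 0.0
    return scorecard

# Hypothetical human resources records, for illustration only.
employees = [
    {"id": 1, "email": "a@example.com", "hire_date": "2004-03-01", "salary": 52000},
    {"id": 2, "email": "", "hire_date": "2006-07-15", "salary": 61000},
    {"id": 3, "email": "c@example.com", "hire_date": None, "salary": 0},
    {"id": 4, "email": None, "hire_date": "2008-01-10", "salary": 48000},
]

# Each data use gets its own rules: payroll does not care about email,
# and the newsletter does not care about salary.
rules_by_use = {
    "payroll": [
        lambda r: r["salary"] > 0,
        lambda r: r["hire_date"] is not None,
    ],
    "newsletter": [
        lambda r: "@" in (r["email"] or ""),
    ],
}

scorecard = score_by_use(employees, rules_by_use)
print(scorecard)
```

The same four records score differently for each use, which is the point: a single overall score would hide the fact that the data is more fit for payroll than for the newsletter.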

 

Modern Data Quality Techniques in Action - A Demonstration Using Human Resources Data

Gian Di Loreto formed Loreto Services and Technologies in 2004 from the client services division of Arkidata Corporation.  Loreto Services provides data cleansing and integration consulting services to Fortune 500 companies.  Gian is a classically trained scientist - he received his PhD in elementary particle physics from Michigan State University.

Course Quotes from Gian Di Loreto:

  • “Data Quality is rich with theory and concepts – however it is not an academic exercise, it has real business impact”
  • “To do data quality well, you must walk away from the computer and go talk with the people using the data”
  • “Undertaking a data quality initiative demands developing a deeper knowledge of the data and the business”
  • “Some essential data quality rules are ‘hidden’ and can only be discovered by ‘clicking around’ in the data”
  • “Data quality projects are not about systems working together - they are about people working together”
  • “Sometimes, data quality can be ‘good enough’ for source systems but not when integrated with other systems”
  • “Unfortunately, no one seems to care about bad data until they have it”
  • “Data quality projects are only successful when you understand the problem before trying to solve it”

 

Mark Your Calendar

TDWI World Conference San Diego 2009 - August 2-7, 2009.

TDWI World Conference Orlando 2009 - November 1-6, 2009.

TDWI World Conference Las Vegas 2010 - February 21-26, 2010.

Enterprise Data World 2009

Formerly known as the DAMA International Symposium and Wilshire MetaData Conference, Enterprise Data World 2009 was held April 5-9 in Tampa, Florida at the Tampa Convention Center.

 

Enterprise Data World is the business world’s most comprehensive vendor-neutral educational event about data and information management.  This year’s program was bigger than ever before, with more sessions, more case studies, and more can’t-miss content.  With 200 hours of in-depth tutorials, hands-on workshops, practical sessions and insightful keynotes, the conference was a tremendous success.  Congratulations and thanks to Tony Shaw, Maya Stosskopf and the entire Wilshire staff.

 

I attended Enterprise Data World 2009 as a member of the Iowa Chapter of DAMA and as a Data Quality Journalist for the International Association for Information and Data Quality (IAIDQ).

I used Twitter to provide live reporting from the sessions that I was attending.

I wish that I could have attended every session, but here are some highlights from ten of my favorites:

 

8 Ways Data is Changing Everything

Keynote by Stephen Baker from BusinessWeek

His article Math Will Rock Your World inspired his excellent book The Numerati.  Additionally, check out his blog: Blogspotting.

Quotes from the keynote:

  • "Data is changing how we understand ourselves and how we understand our world"
  • "Predictive data mining is about the mathematical modeling of humanity"
  • "Anthropologists are looking at social networking (e.g. Twitter, Facebook) to understand the science of friendship"

 

Master Data Management: Proven Architectures, Products and Best Practices

Tutorial by David Loshin from Knowledge Integrity.

The tutorial included material from his excellent book Master Data Management.  Additionally, check out his blog: David Loshin.

Quotes from the tutorial:

  • "Master Data are the core business objects used in the different applications across the organization, along with their associated metadata, attributes, definitions, roles, connections and taxonomies"
  • "Master Data Management (MDM) provides a unified view of core data subject areas (e.g. Customers, Products)"
  • "With MDM, it is important not to over-invest and under-implement - invest in and implement only what you need"

 

Master Data Management: Ignore the Hype and Keep the Focus on Data

Case Study by Tony Fisher from DataFlux and Jeff Grayson from Equinox Fitness.

Quotes from the case study:

  • "The most important thing about Master Data Management (MDM) is improving business processes"
  • "80% of any enterprise implementation should be the testing phase"
  • "MDM Data Quality (DQ) Challenge: Any % wrong means you’re 100% certain you’re not always right"
  • "MDM DQ Solution: Re-design applications to ensure the ‘front-door’ protects data quality"
  • "Technology is critical, however thinking through the operational processes is more important"

 

A Case of Usage: Working with Use Cases on Data-Centric Projects

Case Study by Susan Burk from IBM.

Quotes from the case study:

  • "Use Case is a sequence of actions performed to yield a result of observable business value"
  • "The primary focus of data-centric projects is data structure, data delivery and data quality"
  • "Don’t like use cases? – ok, call them business acceptance criteria – because that’s what a use case is"

 

Crowdsourcing: People are Smart, When Computers are Not

Session by Sharon Chiarella from Amazon Web Services.

Quotes from the session:

  • "Crowdsourcing is outsourcing a task typically performed by employees to a general community of people"
  • "Crowdsourcing eliminates over-staffing, lowers costs and reduces work turnaround time"
  • "An excellent example of crowdsourcing is open source software development (e.g. Linux)"

 

Improving Information Quality using Lean Six Sigma Methodology

Session by Atul Borkar and Guillermo Rueda from Intel.

Quotes from the session:

  • "Information Quality requires a structured methodology in order to be successful"
  • Lean Six Sigma Framework: DMAIC – Define, Measure, Analyze, Improve, Control:
    • Define = Describe the challenge, goal, process and customer requirements
    • Measure = Gather data about the challenge and the process
    • Analyze = Use hypothesis and data to find root causes
    • Improve = Develop, implement and refine solutions
    • Control = Plan for stability and measurement

 

Universal Data Quality: The Key to Deriving Business Value from Corporate Data

Session by Stefanos Damianakis from Netrics.

Quotes from the session:

  • "The information stored in databases is NEVER perfect, consistent and complete – and it never can be!"
  • "Gartner reports that 25% of critical data within large businesses is somehow inaccurate or incomplete"
  • "Gartner reports that 50% of implementations fail due to lack of attention to data quality issues"
  • "A powerful approach to data matching is the mathematical modeling of human decision making"
  • "The greatest advantage of mathematical modeling is that there are no data matching rules to build and maintain"
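
To give a feel for matching without hand-written rules, here is a minimal Python sketch that scores string similarity instead of applying rules such as "drop the letter h" or "ignore middle initials".  It is not the mathematical model described in the session: it simply uses `difflib.SequenceMatcher` from the Python standard library as a stand-in, and the `best_match` helper and the sample names are hypothetical.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a similarity ratio between 0.0 and 1.0, ignoring letter case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def best_match(name, candidates):
    """Return the (candidate, score) pair with the highest similarity to name."""
    scored = [(candidate, similarity(name, candidate)) for candidate in candidates]
    return max(scored, key=lambda pair: pair[1])

# Hypothetical customer names, for illustration only.
existing_customers = ["John Smith", "Joan Smythe", "Mary Jones"]

match, score = best_match("Jon Smith", existing_customers)
print(match, round(score, 3))
```

A similarity score still needs a threshold to decide what counts as a match, so this approach trades rule maintenance for threshold tuning.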

 

Defining a Balanced Scorecard for Data Management

Seminar by C. Lwanga Yonke, a founding member of the International Association for Information and Data Quality (IAIDQ).

Quotes from the seminar:

  • "Entering the same data multiple times is like paying the same invoice multiple times"
  • "Good metrics help start conversations and turn strategy into action"
  • Good metrics have the following characteristics:
    • Business Relevance
    • Clarity of Definition
    • Trending Capability (i.e. metric can be tracked over time)
    • Easy to aggregate and roll-up to a summary
    • Easy to drill-down to the details that comprised the measurement
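
The last three characteristics (trending, roll-up, and drill-down) can be sketched in Python.  This is an illustration only, not material from the seminar: the error-rate metric, the `roll_up` and `drill_down` helpers, and the sample measurements are hypothetical.

```python
from collections import defaultdict

# Hypothetical detail measurements, for illustration only:
# (month, business_unit, records_checked, records_failed)
measurements = [
    ("2009-01", "Sales", 1000, 50),
    ("2009-01", "Finance", 500, 10),
    ("2009-02", "Sales", 1200, 36),
    ("2009-02", "Finance", 500, 5),
]

def error_rate(checked, failed):
    """Return the fraction of checked records that failed."""
    return failed / checked if checked else 0.0

def roll_up(rows):
    """Aggregate detail rows into one error rate per month (the trend)."""
    totals = defaultdict(lambda: [0, 0])
    for month, _unit, checked, failed in rows:
        totals[month][0] += checked
        totals[month][1] += failed
    return {month: error_rate(c, f) for month, (c, f) in sorted(totals.items())}

def drill_down(rows, month):
    """Return the per-unit error rates behind one month's summary."""
    return {unit: error_rate(c, f) for m, unit, c, f in rows if m == month}

trend = roll_up(measurements)
detail = drill_down(measurements, "2009-02")
print(trend)
print(detail)
```

Storing the underlying counts rather than precomputed percentages is what makes the metric easy to aggregate: counts can be summed across business units, whereas averaging percentages would weight a small unit the same as a large one.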

 

Closing Panel: Data Management’s Next Big Thing!

Quotes from Panelist Peter Aiken from Data Blueprint:

  • Capability Maturity Levels:
    1. Initial
    2. Repeatable
    3. Defined
    4. Managed
    5. Optimized
  • "Most companies are at a capability maturity level of (1) Initial or (2) Repeatable"
  • "Data should be treated as a durable asset"

Quotes from Panelist Noreen Kendle from Burton Group:

  • "A new age for data and data management is on the horizon – a perfect storm is coming"
  • "The perfect storm is being caused by massive data growth and software as a service (i.e. cloud computing)"
  • "Always remember that you can make lemonade from lemons – the bad in life can be turned into something good"

Quotes from Panelist Karen Lopez from InfoAdvisors:

  • "If you keep using the same recipe, then you keep getting the same results"
  • "Our biggest problem is not technical in nature - we simply need to share our knowledge"
  • "Don’t be a dinosaur! Adopt a ‘go with what is’ philosophy and embrace the future!"

Quotes from Panelist Eric Miller from Zepheira:

  • "Applications should not be ON The Web, but OF The Web"
  • "New Acronym: LED – Linked Enterprise Data"
  • "Semantic Web is the HTML of DATA"

Quotes from Panelist Daniel Moody from University of Twente:

  • "Unified Modeling Language (UML) was the last big thing in software engineering"
  • "The next big thing will be ArchiMate, which is a unified language for enterprise architecture modeling"

 

Mark Your Calendar

Enterprise Data World 2010 will take place in San Francisco, California at the Hilton San Francisco on March 14-18, 2010.