#FollowFriday and Re-Tweet-Worthiness

There is perhaps no better example of the peer pressure aspects of social networking than FollowFriday, the day when Twitter users recommend other users to follow (i.e., “I recommended you, why didn’t you recommend me?”).

However, re-tweeting (forwarding another user’s Twitter status update, aka tweet) happens every day of the week.  Many bloggers (such as myself) use Twitter to promote their content by tweeting links to their new blog posts, and therefore most re-tweets are attempts, made by other members of the blogger’s collablogaunity, to help share meaningful content.

But I would be willing to wager that a considerable amount of re-tweeting is based on the act of reciprocity—and not based on evaluating the Re-Tweet-Worthiness of the shared content.  In other words, I believe that many people (myself included) sometimes don’t read what they re-tweet, but simply share content from a previously determined re-tweet-worthy source, or a source that they hope will reciprocate in the future (i.e., “I re-tweeted your blog post, why didn’t you re-tweet my blog post?”).

 

How do YOU determine Re-Tweet-Worthiness?

 

#FollowFriday Recommendations

By no means a comprehensive list, and listed in no particular order whatsoever, here are some great tweeps whom I follow, especially for their truly re-tweet-worthy tweets about Data Quality, Data Governance, Master Data Management, and Business Intelligence:

 

PLEASE NOTE: No offense is intended to any of my tweeps not listed above.  However, if you feel that I have made a glaring omission of an obviously Twitterific Tweep, then please feel free to post a comment below and add them to the list.  Thanks!

I hope that everyone has a great FollowFriday and an even greater weekend.  See you all around the Twittersphere.

 

Related Posts

Data Quality and #FollowFriday the 13th

Twitter, Data Governance, and a #ButteredCat #FollowFriday

#FollowFriday and The Three Tweets

Dilbert, Data Quality, Rabbits, and #FollowFriday

Twitter, Meaningful Conversations, and #FollowFriday

The Fellowship of #FollowFriday

Demystifying Social Media

Social Karma

The Challenging Gift of Social Media

The Wisdom of the Social Media Crowd

The Good Data

Photo via Flickr (Creative Commons License) by: Philip Fibiger

When I was growing up, my family had a cabinet filled with “the good dishes” that were reserved for use on special occasions, i.e., the plates, bowls, and cups that would only be used for holiday dinners like Thanksgiving or Christmas.  The rest of the year, we used “the everyday dishes,” a random assortment of various sets of dishes collected over the years.

Meals using the everyday dishes would seldom have matching plates, bowls, and cups, and if these dishes ever had a pattern on them, it was mostly, if not completely, worn down by repeated use and constant washing.  Whenever we actually got to use the good dishes, it made the meal seem more special, more fancy; perhaps it even made the food taste a little bit better.

Some organizations have a database filled with “the good data” that is reserved for special occasions, in other words, data prepared for specific business uses such as regulatory compliance and reporting.  Meanwhile, the rest of the time, and perhaps in support of daily operations, the organization uses “the everyday data,” which is often a random collection of various data sets.

Business activities using the everyday data would seldom use a single source, but instead mash up data from several sources, perhaps even storing the results in a spreadsheet or a private database—otherwise known by the more nefarious term: data silo.

Most of the time, when organizations discuss their enterprise data management strategy, they focus on building and maintaining the good data.  However, unlike the good dishes, the organization tries to force everyone to use the good data even for everyday business activities, essentially forcing everyone to throw away the everyday data—to eliminate all those data silos.

But there is a time and a place for both the good dishes and the everyday dishes, as well as paper plates and plastic cups.  And yes, even eating with your hands has a time and a place, too.

The same is true for data.  Yes, you should build and maintain the good data to be used to support as many business activities as possible.  And yes, you should minimize the special occasions where customized data and/or data silos are truly necessary.

But you should also accept that, with so much data available to the enterprise and so many business uses for it, forcing everyone to use only the good data might prevent your organization from maximizing the full potential of its data.

 

Related Posts

To Our Data Perfectionists

DQ-View: From Data to Decision

The Data-Decision Symphony

Is your data complete and accurate, but useless to your business?

You Can’t Always Get the Data You Want

A Confederacy of Data Defects

One of my favorite novels is A Confederacy of Dunces by John Kennedy Toole.  The novel tells the tragicomic tale of Ignatius J. Reilly, described in the foreword by Walker Percy as a “slob extraordinary, a mad Oliver Hardy, a fat Don Quixote, and a perverse Thomas Aquinas rolled into one.”

The novel was written in the 1960s, before the age of computer filing systems, so one of the jobs Ignatius has is working as a paper filing clerk in a clothing factory.  His employer is initially impressed with his job performance, since the disorderly mess of invoices and other paperwork slowly begins to disappear, resulting in the orderly appearance of a well-organized and efficiently managed office space.

However, Ignatius is fired after he reveals the secret to his filing system—instead of filing the paperwork away into the appropriate file cabinets, he has simply been throwing all of the paperwork into the trash.

This scene reminds me of how data quality issues (aka data defects) are often perceived.  Many organizations acknowledge the importance of data quality, but don’t believe that data defects occur very often because the data made available to end users in dashboards and reports often passes through many processes that cleanse or otherwise sanitize the data before it reaches them.

ETL processes that extract source data for a data warehouse load will often perform basic data quality checks.  However, a fairly standard practice for “resolving” a data defect is to substitute a NULL value (e.g., a date stored in a text field in a source system that cannot be converted into a valid date value is usually loaded into the target relational database as a NULL value).
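
To make that concealment concrete, here is a minimal sketch (hypothetical Python, not the behavior of any particular ETL tool) of how the NULL substitution typically plays out:

```python
from datetime import datetime

def load_order_date(raw_value):
    """Parse a date stored as free text in a source system.

    A common (and quietly lossy) ETL practice: if the text cannot be
    converted into a valid date, load NULL (None) into the target
    instead of flagging the record as a data defect.
    """
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            return datetime.strptime(raw_value.strip(), fmt).date()
        except (ValueError, AttributeError):
            continue
    return None  # the defect disappears into a NULL value

print(load_order_date("2010-12-25"))     # 2010-12-25
print(load_order_date("Christmas Day"))  # None -- defect concealed
```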

When postal address validation software generates a valid mailing address, it often does so by removing what it considers to be “extraneous” information from the input address fields, which may include valid data that was accidentally entered into the wrong field or that lacked an input field of its own (e.g., an e-mail address entered into an address field gets deleted from the output mailing address).
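
Here is an equally illustrative sketch of that behavior (a toy example, and not how any specific address validation product actually works):

```python
import re

def standardize_address(input_lines):
    """Toy address 'validation' that keeps only lines resembling street
    or city/state/ZIP information and silently discards the rest,
    including valid data typed into the wrong field, such as an e-mail
    address that had no input field of its own.
    """
    output = []
    for line in input_lines:
        if re.search(r"\S+@\S+", line):  # looks like an e-mail address
            continue                     # valid data, wrong field: gone
        output.append(line.strip().upper())
    return output

print(standardize_address(["123 Main St", "jsmith@example.com", "Springfield, IL 62704"]))
# ['123 MAIN ST', 'SPRINGFIELD, IL 62704'] -- the e-mail address is lost
```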

And some reporting processes intentionally filter out “bad records” or eliminate “outlier values.”  This happens most frequently when preparing highly summarized reports, especially those intended for executive management.
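
And a final sketch of that filtering (again hypothetical, with an arbitrary revenue cap standing in for whatever outlier rule a given report might use):

```python
REVENUE_CAP = 1_000_000  # arbitrary ceiling, chosen only for illustration

def summarize_revenue(amounts):
    """Prepare a highly summarized revenue figure for an executive report.

    'Bad records' (negative amounts) and 'outlier values' (implausibly
    large amounts) are intentionally filtered out, so the report looks
    clean while the underlying data defects go unexamined.
    """
    kept = [x for x in amounts if 0 <= x <= REVENUE_CAP]
    return sum(kept), len(amounts) - len(kept)

total, concealed = summarize_revenue([120.0, -95.0, 100.0, 99999999.0])
print(f"Reported total: {total}, records silently excluded: {concealed}")
# Reported total: 220.0, records silently excluded: 2
```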

These are just a few examples of common practices that can create the orderly appearance of a high quality data environment, but that conceal a confederacy of data defects about which the organization may remain blissfully (and dangerously) ignorant.

Do you suspect that your organization may be concealing A Confederacy of Data Defects?

Can Data Quality avoid the Dustbin of History?

After reading two blog posts about the 2011 predictions for data management by Steve Sarsfield and Henrik Liliendahl Sørensen, I was pondering writing a 2011 prediction post of my own—and then I read this recent Dilbert comic strip.

What if Dogbert is right and the only things that matter are social networks, games, and phones?  What implications does this have for the data management industry, and more specifically, the data quality profession?  How can data quality practitioners avoid being cast into the Dustbin of History in 2011 and beyond?

Perhaps we need to create a social network for data?  Let’s call it DataTweetBook.  Although we would be allowed to follow any data with a public profile, data would have to approve our friend requests—you know, in order to respect data’s privacy.

(Quick Sidebar Question: Do you think that your organization’s data would accept your friend request—or block you?)

Next, we would partner with Zynga and create DataVille and Data Quality Wars, which would be online games exclusive to the DataTweetBook platform.  These games would include fun challenges, like “consolidate duplicates in your contact database” and “design a user interface that prevents data quality issues from happening.”  You and your data can even ask other people and data in your social network for help with completing tasks, such as “ask postal reference data to validate your mailing addresses.”

Of course, we would then need to create iPhone and Android apps for DataTweetBook, DataVille, and Data Quality Wars, so that everyone can access the new social network and games on their mobile phones.  And eventually, we would start a bidding war between Apple and Google over the exclusive rights to make an integrated mobile device, either iDataPad or DataGoogler.

So that’s my 2011 prognostication for the data quality industry—it’s going to be all about social networks, games, and phones.

 

Related Posts

Dilbert, Data Quality, Rabbits, and #FollowFriday

Comic Relief: Dilbert on Project Management

Comic Relief: Dilbert to the Rescue

‘Tis the Season for Data Quality

‘Tis the season for getting holiday greeting cards, and not only from family and friends, since many companies also like to mail season’s greetings to their customers, employees, and business partners.

I do appreciate the sentiment, but I mostly just check the envelopes for data quality issues with the name and/or postal address.

I have never made it through an entire holiday season without receiving at least one incorrectly addressed greeting card, and this year was no exception.  In the above image, I have highlighted that I apparently live in the town of Ankely, Pennsylvania.

I actually live in the town of Ankeny, Iowa.

The United States postal abbreviations for Pennsylvania and Iowa are PA and IA, respectively.  Additionally, the town name is only off by one character (L instead of N in the fifth position of a six-character string).  Therefore, the data matching algorithms provided by most data quality tools would consider these relatively minor discrepancies to be highly probable matches.
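
Even Python’s standard library can illustrate why (using difflib rather than a commercial matching engine, so treat the scores as merely indicative):

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a rough 0..1 similarity ratio between two strings."""
    return SequenceMatcher(None, a.upper(), b.upper()).ratio()

print(similarity("Ankely", "Ankeny"))          # ~0.83: one character off in six
print(similarity("PA", "IA"))                  # 0.5:   one character off in two
print(similarity("Ankely, PA", "Ankeny, IA"))  # ~0.80: a highly probable match
```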

And although Pennsylvania and Iowa are approximately 900 miles away from each other, since my street address and ZIP code (both intentionally blurred out in the image) were correct, the post office was able to successfully deliver the greeting card to me.

However, the really funny thing is that this greeting card was sent to me by a . . . (wait for it) . . . data quality tool vendor!

So apparently ‘tis the season for data quality . . . data quality issues, that is :-)

 

‘Tis the Season for Sharing Data Quality Stories

Have you encountered any seasonal data quality issues?  If so, please share your story by posting a comment below.

Does your organization have a Calumet Culture?

In my previous post, I once again blogged about how the key to success for most, if not all, organizational initiatives is the willingness of people all across the enterprise to embrace collaboration.

However, what happens when an organization’s corporate culture doesn’t foster an environment of collaboration?

Sometimes as a result of rapid business growth, an organization trades effectiveness for efficiency, prioritizes short-term tactics over long-term strategy, and even encourages “friendly” competition amongst its relatively autonomous business units.

However, when the need for a true enterprise-wide initiative such as data governance becomes (perhaps painfully) obvious, the organization decides to bring representatives from all of its different “tribes” together to discuss the complexities of the business, data, technical, and (most important) people-related issues that would shape the realities of a truly collaborative environment.

“Calumet Culture” is the term I like using (and not just because of my affinity for alliteration) to describe the disingenuous way that I have occasionally witnessed these organizational stakeholder gathering “ceremonies” carried out.

Calumet was the Norman-French word used by French Canadian settlers to describe the “peace pipes” they witnessed the people of the First Nations (referred to as Native Americans in the United States) using at ceremonies marking a treaty between previously combative factions.

Simply gathering everyone together around the campfire (or the conference room table) is an empty gesture, similar in many ways to non-Native Americans mimicking a “peace pipe ceremony” and using one of their own words (calumet) to describe what was in fact a deeply spiritual object used to convey true significance to the event.

When collaboration is discussed at strategic planning meetings with great pomp and circumstance, but after the meetings end, the organization returns to its non-collaborative status quo, then little, if any, true collaboration should be expected to happen.

Does your organization have a Calumet Culture?

In other words, does your organization have a corporate culture that talks the talk of collaboration, but doesn’t walk the walk?

If so, how have you attempted to overcome this common barrier to success?

Data Governance and the Social Enterprise

In his blog post Socializing Software, Michael Fauscette explained that in order “to create a next generation enterprise, businesses need to take two concepts from the social web and apply them across all business functions: community and content.”

“Traditional enterprise software,” according to Fauscette, “was built on the concept of managing through rigid business processes and controlled workflow.  With process at the center of the design, people-based collaboration was not possible.”

Peter Sondergaard, the global head of research at Gartner, explained at a recent conference that “the rigid business processes which dominate enterprise organizational architectures today are well suited for routine, predictable business activities.  But they are poorly suited to support people whose jobs require discovery, interpretation, negotiation and complex decision-making.”

“Social computing,” according to Sondergaard, “not Facebook, or Twitter, or LinkedIn, but the technologies and principles behind them will be implemented across and between all organizations, and it will unleash yet to be realized productivity growth.”

Since the importance of collaboration is one of my favorite topics, I like Fauscette’s emphasis on people-based collaboration and Sondergaard’s emphasis on the limitations of process-based collaboration.  The key to success for most, if not all, organizational initiatives is the willingness of people all across the enterprise to embrace collaboration.

Successful organizations view collaboration not just as a guiding principle, but as a call to action in their daily business practices.

As Sondergaard points out, the technologies and principles behind social computing are the key to enabling what many analysts have begun referring to as the social enterprise.  Collaboration is the key to business success.  This essential collaboration has to be based on people, and not on rigid business processes, since business activities and business priorities are constantly changing.

 

Data Governance and the Social Enterprise

Often the root cause of poor data quality can be traced to a lack of a shared understanding of the roles and responsibilities involved in how the organization is using its data to support its business activities.  The primary focus of data governance is the strategic alignment of people throughout the organization through the definition, implementation, and enforcement of the policies that govern the interactions between people, business processes, data, and technology.

A data quality program within a data governance framework is a cross-functional, enterprise-wide initiative requiring people to be accountable for its data, business process, and technology aspects.  However, policy enforcement and accountability are often confused with traditional notions of command and control, which is the antithesis of the social enterprise that instead requires an emphasis on communication, cooperation, and people-based collaboration.

Data governance policies for data quality illustrate the intersection of business, data, and technical knowledge, which is spread throughout the enterprise, transcending any artificial boundaries imposed by an organizational chart or rigid business processes, where different departments or different business functions appear as if they were independent of the rest of the organization.

Data governance reveals how interconnected and interdependent the organization is, and why people-driven social enterprises are more likely to survive and thrive in today’s highly competitive and rapidly evolving marketplace.

Social enterprises rely on the strength of their people asset to successfully manage their data, which is a strategic corporate asset because high quality data serves as a solid foundation for an organization’s success, empowering people, enabled by technology, to optimize business processes for superior business performance.

 

Related Posts

Podcast: Data Governance is Mission Possible

Trust is not a checklist

The Business versus IT—Tear down this wall!

The Road of Collaboration

Shared Responsibility

Enterprise Ubuntu

Data Transcendentalism

Social Karma

Commendable Comments (Part 8)

This Thursday is Thanksgiving Day, which is a United States holiday with a long and varied history.  The most consistent themes remain family and friends gathering together to share a large meal and express their gratitude.

This is the eighth entry in my ongoing series for expressing my gratitude to my readers for their truly commendable comments on my blog posts.  Receiving comments is the most rewarding aspect of my blogging experience.  Although I am truly grateful to all of my readers, I am most grateful to my commenting readers.

 

Commendable Comments

On The Data-Decision Symphony, James Standen commented:

“Being a lover of both music and data, it struck all the right notes!

I think the analogy is a very good one—when I think about data as music, I think about a company’s business intelligence architecture as being a bit like a very good concert hall, stage, and instruments. All very lovely to listen to music—but without the score itself (the data), there is nothing to play.

And while certainly a real live concert hall is fantastic for enjoying Bach, I’m enjoying some Bach right now on my laptop—and the MUSIC is really the key.

Companies very often focus on building fantastic concert halls (made with all the best and biggest data warehouse appliances, ETL servers, web servers, visualization tools, portals, etc.) but forget that the point was to make that decision—and base it on data from the real world. Focusing on the quality of your data, and on the decision at hand, can often let you make wonderful music—and if your budget or schedule doesn't allow for a concert hall, you might be able to get there regardless.”

On “Some is not a number and soon is not a time”, Dylan Jones commented:

“I used to get incredibly frustrated with the data denial aspect of our profession.  Having delivered countless data quality assessments, I’ve never found an organization that did not have pockets of extremely poor data quality, but as you say, at the outset, no-one wants to believe this.

Like you, I’ve seen the natural defense mechanisms.  Some managers do fear the fallout and I’ve even had quite senior directors bury our research and quickly cut any further activity when issues have been discovered, fortunately that was an isolated case.

In the majority of cases though I think that many senior figures are genuinely shocked when they see their data quality assessments for the first time.  I think the big problem is that because they institutionalize so many scrap and rework processes and people that are common to every organization, the majority of issues are actually hidden.

This is one of the issues I have with the big shock announcements we often see in conference presentations (I’m as guilty as hell for these so call me a hypocrite) where one single error wipes millions off a share price or sends a spacecraft hurtling into Mars.

Most managers don’t experience this cataclysm, so it’s hard for them to relate to because it implies their data needs to be perfect, they believe that’s unattainable and lose interest.

Far better to use anecdotes like the one cited in this blog to demonstrate how simple improvements can change lives and the bottom line in a limited time span.”

On The Real Data Value is Business Insight, Winston Chen commented:

“Yes, quality is in the eye of the beholder.  Data quality metrics must be calculated within the context of a data consumer.  This context is missing in most software tools on the market.

Another important metric is what I call the Materiality Metric.

In your example, 50% of customer data is inaccurate.  It’d be helpful if we knew which 50%.  Are they the customers that generate the most revenue and profits, or are they dormant customers?  Are they test records that were never purged from the system?  We can calculate the materiality metric by aggregating a relevant business metric for those bad records.

For example, 85% of the year-to-date revenue is associated with those 50% bad customer records.

Now we know this is serious!”
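
As a quick aside, here is a rough sketch of how Chen’s materiality metric might be computed (hypothetical Python over invented sample records, since the comment did not specify an implementation):

```python
# Invented sample customer records: (customer_id, ytd_revenue, has_defect)
customers = [
    ("C001", 500_000.0, True),
    ("C002", 350_000.0, True),
    ("C003",  90_000.0, False),
    ("C004",  60_000.0, False),
]

bad = [c for c in customers if c[2]]

# Basic accuracy metric: what fraction of the records are bad?
inaccurate_share = len(bad) / len(customers)

# Materiality metric: aggregate a relevant business metric (here,
# year-to-date revenue) over just the bad records.
materiality = sum(r for _, r, _ in bad) / sum(r for _, r, _ in customers)

print(f"Inaccurate records: {inaccurate_share:.0%}")      # 50%
print(f"Revenue tied to bad records: {materiality:.0%}")  # 85%
```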

On The Real Data Value is Business Insight, James Taylor commented:

“I am constantly amazed at the number of folks I meet who are paralyzed about advanced analytics, saying that ‘we have to fix/clean/integrate all our data before we can do that.’

They don’t know if the data would even be relevant, haven’t considered getting the data from an external source and haven't checked to see if the analytic techniques being considered could handle the bad or incomplete data automatically!  Lots of techniques used in data mining were invented when data was hard to come by and very ‘dirty’ so they are actually pretty good at coping.  Unless someone thinks about the decision you want to improve, and the analytics they will need to do so, I don’t see how they can say their data is too dirty, too inconsistent to be used.”

On The Business versus IT—Tear down this wall!, Scott Andrews commented:

“Early in my career, I answered a typical job interview question ‘What are your strengths?’ with:

‘I can bring Business and IT together to deliver results.’

My interviewer wryly poo-poo’d my answer with ‘Business and IT work together well already,’ insinuating that such barriers may have existed in the past, but were now long gone.  I didn’t get that particular job, but in the years since I have seen this barrier in action (I can attest that my interviewer was wrong).

What is required for Business Intelligence success is to have smart business people and smart IT people working together collaboratively.  Too many times one side or the other says ‘that’s not my job’ and enormous potential is left unrealized.”

On The Business versus IT—Tear down this wall!, Jill Wanless commented:

“It amazes me (ok, not really...it makes me cynical and want to rant...) how often Business and IT SAY they are collaborating, but it’s obvious they have varying views and perspectives on what collaboration is and what the expected outcomes should be.  Business may think collaboration means working together for a solution, IT may think it means IT does the dirty work so Business doesn’t have to.

Either way, why don’t they just start the whole process by having an (honest and open) chat about expectations and that INCLUDES what collaboration means and how they will work together.

And hopefully, (here’s where I start to rant because OMG it’s Collaboration 101) that includes agreement not to use language such as BUSINESS and IT, but rather start to use language like WE.”

On Delivering Data Happiness, Teresa Cottam commented:

“Just a couple of days ago I had this conversation about the curse of IT in general:

When it works no-one notices or gives credit; it’s only when it’s broken we hear about it.

A typical example is government IT over here in the UK.  Some projects have worked well; others have been spectacular failures.  Guess which we hear about?  We review failure mercilessly but sometimes forget to do the same with success so we can document and repeat the good stuff too!

I find the best case studies are the balanced ones that say: this is what we wanted to do, this is how we did it, these are the benefits.  Plus this is what I’d do differently next time (lessons learned).

Maybe in those lessons learned we should also make a big effort to document the positive learnings and not just take these for granted.  Yes these do come out in ‘best practices’ but again, best practices never get the profile of disaster stories...

I wonder if much of the gloom is self-fulfilling almost, and therefore quite unhealthy.  So we say it’s difficult, the failure rate is high, etc. – commonly known as covering your butt.  Then when something goes wrong you can point back to the low expectations you created in the first place.

But maybe, the fact we have low expectations means we don’t go in with the right attitude?

The self-defeating outcome is that many large organizations are fearful of getting to grips with their data problems.  So lots of projects we should be doing to improve things are put on hold because of the perceived risk, disruption, cost – things then just get worse making the problem harder to resolve.

Data quality professionals surely don’t want to be seen as effectively undertakers to the doomed project, necessary yes, but not surrounded by the unmistakable smell of death that makes others uncomfortable.

Sure the nature of your work is often to focus on the broken, but quite apart from anything else, isn’t it always better to be cheerful?”

On Why isn’t our data quality worse?, Gordon Hamilton commented:

“They say that sport coaches never teach the negative, or to double the double negative, they never say ‘don’t do that.’  I read somewhere, maybe Daniel Siegel’s stuff, that when the human brain processes the statement ‘don’t do that’ it drops the ‘don’t,’ which leaves it thinking ‘do that.’

Data quality is a complex and multi-splendiforous area with many variables intermingled, but our task as Data Quality Evangelists would be more pleasant if we were helping people rise to the level of the positive expectations, rather than our being codependent in their sinking to the level of the negative expectation.”

DQ-Tip: “There is no such thing as data accuracy...” sparked an excellent debate between Graham Rhind and Peter Benson, who is the Project Leader of ISO 8000, the international standard for data quality.  Their debate included the differences and interdependencies that exist between data and information, as well as between data quality and information quality.

 

Thanks for giving your comments

Thank you very much for giving your comments and sharing your perspectives with our collablogaunity.

This entry in the series highlighted commendable comments on OCDQ Blog posts published in August and September of 2010.

Since there have been so many commendable comments, please don’t be offended if one of your comments wasn’t featured.

Please keep on commenting and stay tuned for future entries in the series.

 

Related Posts

Commendable Comments (Part 7)

Commendable Comments (Part 6)

Commendable Comments (Part 5)

Commendable Comments (Part 4)

Commendable Comments (Part 3)

Commendable Comments (Part 2)

Commendable Comments (Part 1)

The Data Outhouse

This is a screen capture of the results of last week’s unscientific data quality poll, which noted that in many organizations a data warehouse is the only system where data from numerous and disparate operational sources has been integrated into a single system of record containing fully integrated and historical data.  Although the rallying cry and promise of the data warehouse has long been that it will serve as the source for most of the enterprise’s reporting and decision support needs, many data warehouses simply get ignored by the organization, which continues to rely on its data silos and spreadsheets for reporting and decision making.

Based on my personal experience, the most common reason is that these big boxes of data are often built with little focus on the quality of the data being delivered.  However, since that’s just my opinion, I launched the poll and invited your comments.

 

Commendable Comments

Stephen Putman commented that data warehousing “projects are usually so large that if you approach them in a big-bang, OLTP management fashion, the foundational requirements of the thing change between inception and delivery.”

“I’ve seen very few data warehouses live up to the dream,” Dylan Jones commented.  “I’ve always found that silos still persisted after a warehouse introduction because the turnaround on adding new dimensions and reports to the warehouse/mart meant that the business users simply had no option.  I think data quality obviously plays a part.  The business side only need to be burnt once or twice before they lose faith.  That said, a data warehouse is one of the best enablers of data quality motivation, so without them a lot of projects simply wouldn’t get off the ground.”

“I just voted Outhouse too,” commented Paul Drenth, “because I agree with Dylan that the business side keeps using other systems out of disappointment in the trustworthiness of the data warehouse.  I agree that bad data quality plays a role in that, but more often it’s also a lack of discipline in the organization which causes a downward spiral of missing information, and thus deciding to keep other information in a separate or local system.  So I think usability of data warehouse systems still needs to be improved significantly; also, by adding invisible or automatic data quality assurance, the business might gain more trust.”

“Great point Paul, useful addition,” Dylan responded.  “I think discipline is a really important aspect, this ties in with change management.  A lot of business people simply don’t see the sense of urgency for moving their reports to a warehouse so lack the discipline to follow the procedures.  Or we make the procedures too inflexible.  On one site I noticed that whenever the business wanted to add a new dimension or category it would take a 2-3 week turnaround to sign off.  For a financial services company this was a killer because they had simply been used to dragging another column into their Excel spreadsheets, instantly getting the data they needed.  If we’re getting into information quality for a second, then the dimension of presentation quality and accessibility become far more important than things like accuracy and completeness.  Sure a warehouse may be able to show you data going back 15 years and cross validates results with surrogate sources to confirm accuracy, but if the business can’t get it in a format they need, then it’s all irrelevant.”

“I voted Data Warehouse,” commented Jarrett Goldfedder, “but this is marked with an asterisk.  I would say that 99% of the time, a data warehouse becomes an outhouse, crammed with data that serves no purpose.  I think terminology is important here, though.  In my previous organization, we called the Data Warehouse the graveyard and the people who did the analytics were the morticians.  And actually, that’s not too much of a stretch considering our job was to do CSI-type investigations and autopsies on records that didn’t fit with the upstream information.  This did not happen often, but when it did, we were quite grateful for having historical records maintained.  IMHO, if the records can trace back to the existing data and will save the organization money in the long-run, then the warehouse has served its purpose.”

“I’m having a difficult time deciding,” Corinna Martinez commented, “since most of the ones I have seen are high quality data, but not enough of it and therefore are considered Data Outhouses.  You may want to include some variation in your survey that covers good data but not enough; and bad data but lots to sift through in order to find something.”

“I too have voted Outhouse,” Simon Daniels commented, “and have also seen beautifully designed, PhD-worthy data warehouse implementations that are fundamentally of no practical use.  Part of the reason for this I think, particularly from a marketing point-of-view, which is my angle, is that how the data will be used is not sufficiently thought through.  In seeking to create marketing selections, segmentation and analytics, how will the insight locked-up in the warehouse be accessed within the context of campaign execution and subsequent response analysis?  Often sitting in splendid isolation, the data warehouse doesn’t offer the accessibility needed in day-to-day activities.”

Thanks to everyone who voted and special thanks to everyone who commented.  As always, your feedback is greatly appreciated.

 

Can MDM and Data Governance save the Data Warehouse?

During last week’s Informatica MDM Tweet Jam, Dan Power explained that master data management (MDM) can deliver to the business “a golden copy of the data that they can trust” and I remarked how companies expected that from their data warehouse.

“Most companies had unrealistic expectations from data warehouses,” Power responded, “which ended up being expensive, read-only, and updated infrequently.  MDM gives them the capability to modify the data, publish to a data warehouse, and manage complex hierarchies.  I think MDM offers more flexibility than the typical data warehouse.  That’s why business intelligence (BI) on top of MDM (or more likely, BI on top of a data warehouse that draws data from MDM) is so popular.”

As a follow-up question, I asked if MDM should be viewed as a complement or a replacement for the data warehouse.  “Definitely a complement,” Power responded. “MDM fills a void in the middle between transactional systems and the data warehouse, and does things that neither can do to data.”

In his recent blog post How to Keep the Enterprise Data Warehouse Relevant, Winston Chen explains that the data quality deficiencies of most data warehouses could be aided by MDM and data governance, which “can define and enforce data policies for quality across the data landscape.”  Chen believes that the data warehouse “is in a great position to be the poster child for data governance, and in doing so, it can keep its status as the center of gravity for all things data in an enterprise.”

I agree with Power that MDM can complement the data warehouse, and I agree with Chen that data governance can make the data warehouse (as well as many other things) better.  So perhaps MDM and data governance can save the data warehouse.

However, I must admit that I remain somewhat skeptical.  The same challenges that have caused most data warehouses to become data outhouses are also fundamental threats to the success of MDM and data governance.

 

Thinking outside the house

Just like real outhouses were eventually rendered obsolete by indoor plumbing, I wonder if data outhouses will eventually be rendered obsolete as well, perhaps ironically by the emerging trends of outdoor plumbing, i.e., open source, cloud computing, and software as a service (SaaS).

Many industry analysts are also advocating the evolution of data as a service (DaaS), where data is taken out of all of its houses, meaning that the answer to my poll question might be neither data warehouse nor data outhouse.

Although none of these trends obviates the need for data quality or alleviates the other significant challenges mentioned above, perhaps when it comes to data, we need to start thinking outside the house.

 

Related Posts

DQ-Poll: Data Warehouse or Data Outhouse?

Podcast: Data Governance is Mission Possible

Once Upon a Time in the Data

The Idea of Order in Data

Fantasy League Data Quality

Which came first, the Data Quality Tool or the Business Need?

Finding Data Quality

The Circle of Quality

TDWI World Conference Orlando 2010

Last week I attended the TDWI World Conference held November 7-12 in Orlando, Florida at the Loews Royal Pacific Resort.

As always, TDWI conferences offer a variety of full-day and half-day courses taught in an objective, vendor-neutral manner, designed for professionals and taught by in-the-trenches practitioners who are well known in the industry.

In this blog post, I summarize a few key points from two of the courses I attended.  I used Twitter to help me collect my notes, and you can access the complete archive of my conference tweets on Twapper Keeper.

 

A Practical Guide to Analytics

Wayne Eckerson, author of the book Performance Dashboards: Measuring, Monitoring, and Managing Your Business, described the four waves of business intelligence:

  1. Reporting – What happened?
  2. Analysis – Why did it happen?
  3. Monitoring – What’s happening?
  4. Prediction – What will happen?

“Reporting is the jumping off point for analytics,” explained Eckerson, “but many executives don’t realize this.  The most powerful aspect of analytics is testing our assumptions.”  He went on to differentiate the two strains of analytics:

  1. Exploration and Analysis – Top-down and deductive, primarily uses query tools
  2. Prediction and Optimization – Bottom-up and inductive, primarily uses data mining tools

“A huge issue for predictive analytics is getting people to trust the predictions,” remarked Eckerson.  “Technology is the easy part, the hard part is selling the business benefits and overcoming cultural resistance within the organization.”

“The key is not getting the right answers, but asking the right questions,” he explained, quoting Ken Rudin of Zynga.

“Deriving insight from its unique information will always be a competitive advantage for every organization.”  He recommended the book Competing on Analytics: The New Science of Winning as a great resource for selling the business benefits of analytics.

 

Data Governance for BI Professionals

Jill Dyché, a partner and co-founder of Baseline Consulting, explained that data governance transcends business intelligence and other enterprise information initiatives such as data warehousing, master data management, and data quality.

“Data governance is the organizing framework,” explained Dyché, “for establishing strategy, objectives, and policies for corporate data.  Data governance is the business-driven policy making and oversight of corporate information.”

“Data governance is necessary,” remarked Dyché, “whenever multiple business units are sharing common, reusable data.”

“Data governance aligns data quality with business measures and acceptance, positions enterprise data issues as cross-functional, and ensures data is managed separately from its applications, thereby evolving data as a service (DaaS).”

In her excellent 2007 article Serving the Greater Good: Why Data Hoarding Impedes Corporate Growth, Dyché explained the need for “systemizing the notion that data – corporate asset that it is – belongs to everyone.”

“Data governance provides the decision rights around the corporate data asset.”

 

Related Posts

DQ-View: From Data to Decision

Podcast: Data Governance is Mission Possible

The Business versus IT—Tear down this wall!

MacGyver: Data Governance and Duct Tape

Live-Tweeting: Data Governance

Enterprise Data World 2010

Enterprise Data World 2009

TDWI World Conference Chicago 2009

Light Bulb Moments at DataFlux IDEAS 2010

DataFlux IDEAS 2009

DQ-Poll: Data Warehouse or Data Outhouse?

In many organizations, a data warehouse is the only system where data from numerous and disparate operational sources has been integrated into a single repository of enterprise data.

The rapid delivery of a single system of record containing fully integrated and historical data to be used as the source for most of the enterprise’s reporting and decision support needs has long been the rallying cry and promise of the data warehouse.

However, I have witnessed beautifully architected, elegantly implemented, and diligently maintained data warehouses simply get ignored by the organization, which continues to rely on its data silos and spreadsheets for reporting and decision making.

The most common reason is that these big boxes of data are often built with little focus on the quality of the data being delivered.

But that’s just my opinion based on my personal experience.  So let’s conduct an unscientific poll.

 

Additionally, please feel free to post a comment below and explain your vote or simply share your opinions and experiences.

DQ-View: From Data to Decision

Data Quality (DQ) View is an OCDQ regular segment.  Each DQ-View is a brief video discussion of a data quality key concept.

As I posited in The Circle of Quality, an organization’s success is measured by its business results, which are dependent on the quality of its business decisions, which rely on the quality of its data.  In this new DQ-View segment, I want to briefly discuss the relationship between data quality and decision quality and examine a few crucial aspects of the journey from data to decision.

 

DQ-View: From Data to Decision

 

If you are having trouble viewing this video, then you can watch it on Vimeo by clicking on this link: DQ-View on Vimeo

 

Related Posts

The Business versus IT—Tear down this wall!

The Data-Decision Symphony

The Real Data Value is Business Insight

Scrum Screwed Up

Is your data complete and accurate, but useless to your business?

Finding Data Quality

Fantasy League Data Quality

TDWI World Conference Chicago 2009

 

Additional OCDQ Video Posts

DQ View: Achieving Data Quality Happiness

Video: Oh, the Data You’ll Show!

Data Quality is not a Magic Trick

DQ-View: The Cassandra Effect

DQ-View: Is Data Quality the Sun?

DQ-View: Designated Asker of Stupid Questions

Social Karma (Part 8)

Will people still read in the future?

Podcast: Data Governance is Mission Possible

The recent Information Management article Data – Who Cares! by Martin ABC Hansen of Platon has the provocative subtitle:

“If the need to care for data and manage it as an asset is so obvious, then why isn’t it happening?”

Hansen goes on to explain some of the possible reasons under an equally provocative section titled “Mission Impossible.”  It is a really good article that I recommend reading, and it also prompted me to record my thoughts on the subject in a new podcast:

You can also download this podcast (MP3 file) by clicking on this link: Data Governance is Mission Possible

Some of the key points covered in this approximately 15 minute OCDQ Podcast include:

  • Data is a strategic corporate asset because high quality data serves as a solid foundation for an organization’s success, empowering people, enabled by technology, to make better business decisions and optimize business performance
  • Data is an asset owned by the entire enterprise, and not owned by individual business units nor individual people
  • Data governance is the strategic alignment of people throughout the organization through the definition and enforcement of the declared policies that govern the complex ways in which people, business processes, data, and technology interact
  • Five steps for enforcing data governance policies:
    1. Documentation – Use straightforward, natural language to document your policies in a way everyone can understand
    2. Communication – Effective communication requires that you encourage open discussion and debate of all viewpoints
    3. Metrics – Meaningful metrics can be effectively measured, and represent the business impact of data governance
    4. Remediation – Correct any combination of business process, technology, data, and people—and sometimes all four
    5. Refinement – Dynamically evolve and adapt your data governance policies—as well as their associated metrics
  • Data governance requires everyone within the organization to accept a shared responsibility for both failure and success
  • This blog post will self-destruct in 10 seconds . . . Just kidding, I didn’t have the budget for special effects

 

Related Posts

Shared Responsibility

Quality and Governance are Beyond the Data

Video: Declaration of Data Governance

Don’t Do Less Bad; Do Better Good

Delivering Data Happiness

The Circle of Quality

The Diffusion of Data Governance

Jack Bauer and Enforcing Data Governance Policies

The Prince of Data Governance

MacGyver: Data Governance and Duct Tape

 

Quality and Governance are Beyond the Data

Last week’s episode of DM Radio on Information Management, co-hosted as always by Eric Kavanagh and Jim Ericson, was a panel discussion about how and why data governance can improve the quality of an organization’s data, and the featured guests were Dan Soceanu of DataFlux, Jim Orr of Trillium Software, Steve Sarsfield of Talend, and Brian Parish of iData.

The relationship between data quality and data governance is a common question, and perhaps mostly because data governance is still an evolving discipline.  However, another contributing factor is the prevalence of the word “data” in the names given to most industry disciplines and enterprise information initiatives.

“Data governance goes well beyond just the data,” explained Orr.  “Administration, business process, and technology are also important aspects, and therefore the term data governance can be misleading.”

“So perhaps a best practice of data governance is not calling it data governance,” remarked Ericson.

From my perspective, data governance involves policies, people, business processes, data, and technology.  However, the last four of those concepts (people, business processes, data, and technology) are critical to every enterprise initiative.

So I agree with Orr because I think that the key concept differentiating data governance is its definition and enforcement of the policies that govern the complex ways that people, business processes, data, and technology interact.

As it relates to data quality, I believe that data governance provides the framework for evolving data quality from a project to an enterprise-wide program by facilitating the collaboration of business and technical stakeholders.  Data governance aligns data usage with business processes through business relevant metrics, and enables people to be responsible for, among other things, data ownership and data quality.

“A basic form of data governance is tying the data quality metrics to their associated business processes and business impacts,” explained Sarsfield, the author of the great book The Data Governance Imperative, which explains that “the mantra of data governance is that technologists and business users must work together to define what good data is by constantly leveraging both business users, who know the value of the data, and technologists, who can apply what the business users know to the data.”

Data is used as the basis to make critical business decisions, and therefore “the key for data quality metrics is the confidence level that the organization has in the data,” explained Soceanu.  Data-driven decisions are better than intuition-driven decisions, but a lack of confidence in the quality of their data can lead organizations to rely more on intuition for their business decisions.

The Data Asset: How Smart Companies Govern Their Data for Business Success, written by Tony Fisher, the CEO of DataFlux, is another great book about data governance, which explains that “data quality is about more than just improving your data.  Ultimately, the goal is improving your organization.  Better data leads to better decisions, which leads to better business.  Therefore, the very success of your organization is highly dependent on the quality of your data.”

Data is a strategic corporate asset and, by extension, data quality and data governance are both strategic corporate disciplines, because high quality data serves as a solid foundation for an organization’s success, empowering people, enabled by technology, to make better business decisions and optimize business performance.

Therefore, data quality and data governance both go well beyond just improving the quality of an organization’s data, because Quality and Governance are Beyond the Data.

 

Related Posts

Video: Declaration of Data Governance

Don’t Do Less Bad; Do Better Good

The Real Data Value is Business Insight

Is your data complete and accurate, but useless to your business?

Finding Data Quality

The Diffusion of Data Governance

MacGyver: Data Governance and Duct Tape

The Prince of Data Governance

Jack Bauer and Enforcing Data Governance Policies

Data Governance and Data Quality

The Data Quality of Dorian Gray

The Picture of Dorian Gray is a 19th-century novel by Oscar Wilde that tells the story of a young man who sold his soul to remain forever young and beautiful by having his recently painted portrait age rather than himself.  One of the allegories that can be drawn from the novel is our desire to cling, like Dorian Gray, to an idealized image of ourselves and of our lives.

I have previously blogged that when an organization’s data quality is discussed, it is very common to encounter data denial.

This is an understandable self-defense mechanism from the people responsible for business processes, technology, and data because of the simple fact that nobody likes to be blamed (or feel blamed) for causing or failing to fix data quality problems.

But data denial can also doom a data quality improvement initiative from the very beginning.

Of course, everyone will agree that ensuring high quality data is being used to make critical daily business decisions is vitally important to corporate success.  However, for an organization to improve its data quality, it has to admit that some of its business decisions are mistakes being made based on poor quality data.

But the organization has a desire to cling to an idealized image of its data and its data-driven business decisions, to treat its poor data quality the same way as Dorian Gray treated his portrait—by refusing to look at it.

However, The Data Quality of Dorian Gray is also a story that can only end in tragedy.

 

Related Posts

Once Upon a Time in the Data

The Data-Decision Symphony

The Idea of Order in Data

Hell is other people’s data

The Circle of Quality