Big Data and Big Analytics

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

Jill Dyché is the Vice President of Thought Leadership and Education at DataFlux.  Jill’s role at DataFlux is a combination of best-practice expert, key client advisor and all-around thought leader.  She is responsible for industry education, key client strategies and market analysis in the areas of data governance, business intelligence, master data management and customer relationship management.  Jill is a regularly featured speaker and the author of several books.

Jill’s latest book, Customer Data Integration: Reaching a Single Version of the Truth (Wiley & Sons, 2006), was co-authored with Evan Levy and shows the business breakthroughs achieved with integrated customer data.

Dan Soceanu is the Director of Product Marketing and Sales Enablement at DataFlux.  Dan manages global field sales enablement and product marketing, including product messaging and marketing analysis.  Prior to joining DataFlux in 2008, Dan held marketing, partnership and market research positions with Teradata, General Electric and FormScape, as well as data management positions in the Financial Services sector.

Dan received his Bachelor of Science in Business Administration from Kutztown University of Pennsylvania and his Master of Business Administration from Bloomsburg University of Pennsylvania.

On this episode of OCDQ Radio, Jill Dyché, Dan Soceanu, and I discuss the recent Pacific Northwest BI Summit, where the three core conference topics were Cloud, Collaboration, and Big Data, the last of which led to a discussion about Big Analytics.




Related Posts

Listen to Jill Dyché on the Knights of the Data Roundtable

A Brave New Data World

Thaler’s Apples and Data Quality Oranges

Data Confabulation in Business Intelligence

Data In, Decision Out

The Data-Decision Symphony

The Real Data Value is Business Insight

Is your data complete and accurate, but useless to your business?

Data, data everywhere, but where is data quality?

Finding Data Quality


You Can’t Always Get the Data You Want

To Our Data Perfectionists

Organizing for Data Quality

OCDQ Radio is a vendor-neutral podcast about data quality and its related disciplines, produced and hosted by Jim Harris.

Dr. Thomas C. Redman (the “Data Doc”) is an innovator, advisor and teacher.  He was the first to extend quality principles to data and information in the late 80s.  Since then he has crystallized a body of tools, techniques, roadmaps and organizational insights that help organizations make order-of-magnitude improvements.

More recently Tom has developed keen insights into the nature of data and formulated the first comprehensive approach to “putting data to work.”  Taken together, these enable organizations to treat data as assets of virtually unlimited potential.

Tom has personally helped dozens of leaders and organizations better understand data and data quality and start their data programs.  He is a sought-after lecturer and the author of dozens of papers and four books.  The most recent, Data Driven: Profiting from Your Most Important Business Asset (Harvard Business Press, 2008) was a Library Journal best buy of 2008.

Prior to forming Navesink Consulting Group in 1996, Tom conceived the Data Quality Lab at AT&T Bell Laboratories in 1987 and led it until 1995.  Tom holds a Ph.D. in statistics from Florida State University.  He holds two patents.

On this episode of OCDQ Radio, Tom Redman and I discuss concepts from his Data Governance and Information Quality 2011 post-conference tutorial about organizing for data quality, which includes his call to action for your role in the data revolution.




Related Posts

Beyond a “Single Version of the Truth”

Poor Data Quality is a Virus

DQ-Tip: “Don't pass bad data on to the next person...”

Hyperactive Data Quality (Second Edition)

Data Profiling Early and Often

The Art of Data Matching

Data Quality Pro

Data Governance Star Wars

Master Data Management in Practice

A Brave New Data World

Data Governance and Information Quality 2011

Last week, I attended the Data Governance and Information Quality 2011 Conference, which was held June 27-30 in San Diego, California at the Catamaran Resort Hotel and Spa.

In this blog post, I summarize a few of the key points from some of the sessions I attended.  I used Twitter to help me collect my notes, and you can access the complete archive of my conference tweets on Twapper Keeper.


Assessing Data Quality Maturity

In his pre-conference tutorial, David Loshin, author of the book The Practitioner’s Guide to Data Quality Improvement, described five stages comprising a continuous cycle of data quality improvement:

  1. Identify and measure how poor data quality impedes business objectives
  2. Define business-related data quality rules and performance targets
  3. Design data quality improvement processes that remediate business process flaws
  4. Implement data quality improvement methods
  5. Monitor data quality against targets
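
As a rough illustration of steps 2 and 5, a business-related data quality rule and its performance target could be expressed in code along the following lines.  This is only a sketch in Python, not something presented in the tutorial: the QualityRule class, the rule name, the sample records, and the 95% target are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class QualityRule:
    """A hypothetical business-related data quality rule with a target."""
    name: str
    check: Callable[[dict], bool]  # returns True when a record passes the rule
    target: float                  # minimum acceptable pass rate (0.0 to 1.0)


def pass_rate(rule, records):
    """Fraction of records that satisfy the rule (1.0 for an empty set)."""
    if not records:
        return 1.0
    passed = sum(1 for record in records if rule.check(record))
    return passed / len(records)


def monitor(rules, records):
    """Map each rule name to (measured pass rate, whether the target is met)."""
    report = {}
    for rule in rules:
        rate = pass_rate(rule, records)
        report[rule.name] = (rate, rate >= rule.target)
    return report


# Hypothetical sample data: one of four customers has no postal code.
customers = [
    {"id": 1, "postal_code": "92109"},
    {"id": 2, "postal_code": ""},
    {"id": 3, "postal_code": "50023"},
    {"id": 4, "postal_code": "02134"},
]

rules = [
    QualityRule(
        name="postal_code_present",
        check=lambda record: bool(record.get("postal_code", "").strip()),
        target=0.95,  # assumed target, chosen only for illustration
    ),
]

report = monitor(rules, customers)
for name, (rate, met) in report.items():
    print(f"{name}: {rate:.0%} pass rate, target met: {met}")
```

Running the monitor on a schedule and comparing each pass rate against its target is one simple way to close the loop back to step 1.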


Getting Started with Data Governance

Oliver Claude from Informatica provided some tips for making data governance a reality:

  • Data Governance requires acknowledging People, Process, and Technology are interlinked
  • You need to embed your data governance policies into your operational business processes
  • Data Governance must be Business-Centric, Technology-Enabled, and Business/IT Aligned


Data Profiling: An Information Quality Fundamental

Danette McGilvray, author of the book Executing Data Quality Projects, shared some of her data quality insights:

  • Although the right technology is essential, data quality is more than just technology
  • Believing tools cause good data quality is like believing X-Ray machines cause good health
  • Data Profiling is like CSI — Investigating the Poor Data Quality Crime Scene


Building Data Governance and Instilling Data Quality

In the opening keynote address, Dan Hartley of ConAgra Foods shared his data governance and data quality experiences:

  • It is important to realize that data governance is a journey, not a destination
  • One of the commonly overlooked costs of data governance is the cost of inaction
  • Data governance must follow a business-aligned and business-value-driven approach
  • Data governance is as much about change management as it is anything else
  • Data governance controls must be carefully balanced so they don’t disrupt business processes
  • Common Data Governance Challenge: Balancing Data Quality and Speed (i.e., Business Agility)
  • Common Data Governance Challenge: Picking up Fumbles — Balls dropped between vertical organizational silos
  • Bad business processes cause poor data quality
  • Better Data Quality = A Better Bottom Line
  • One of the most important aspects of Data Governance and Data Quality — Wave the Flag of Success


Practical Data Governance

Winston Chen from Kalido discussed some aspects of delivering tangible value with data governance:

  • Data governance is the business process of defining, implementing, and enforcing data policies
  • Every business process can be improved by feeding it better data
  • Data Governance is the Horse, not the Cart, i.e., Data Governance drives MDM and Data Quality
  • Data Governance needs to balance Data Silos (Local Authority) and Data Cathedrals (Central Control)


The Future of Data Governance and Data Quality

The closing keynote panel, moderated by Danette McGilvray, included the following insights:

  • David Plotkin: “It is not about Data, Process, or Technology — It is about People”
  • John Talburt: “For every byte of Data, we need 1,000 bytes of Metadata to go along with it”
  • C. Lwanga Yonke: “One of the most essential skills is the ability to lead change”
  • John Talburt: “We need to be focused on business-value-based data governance and data quality”
  • C. Lwanga Yonke: “We must be multilingual: Speak Data/Information, Business, and Technology”


Organizing for Data Quality

In his post-conference tutorial, Tom Redman, author of the book Data Driven, described ten habits of those with the best data:

  1. Focus on the most important needs of the most important customers
  2. Apply relentless attention to process
  3. Manage all critical sources of data, including external suppliers
  4. Measure data quality at the source and in business terms
  5. Employ controls at all levels to halt simple errors and establish a basis for moving forward
  6. Develop a knack for continuous improvement
  7. Set and achieve aggressive targets for improvement
  8. Formalize management accountabilities for data
  9. Lead the effort using a broad, senior group
  10. Recognize that the hard data quality issues are soft and actively manage the needed cultural changes


Tweeps Out at the Ball Game

As I mentioned earlier, I used Twitter to help me collect my notes, and you can access the complete archive of my conference tweets on Twapper Keeper.

But I wasn’t the only data governance and data quality tweep at the conference.  Steve Sarsfield, April Reeve, and Joe Dos Santos were also attending and tweeting.

However, on Tuesday night, we decided to take a timeout from tweeting, and instead became Tweeps out at the Ball Game by attending the San Diego Padres and Kansas City Royals baseball game at PETCO Park.

We sang Take Me Out to the Ball Game, bought some peanuts and Cracker Jack, and root, root, rooted for the home team, which apparently worked since Padres closer Heath Bell got one, two, three strikes, you’re out on Royals third baseman Wilson Betemit, and the San Diego Padres won the game by a final score of 4-2.

So just like at the Data Governance and Information Quality 2011 Conference, a good time was had by all.  See you next year!


Related Posts

Stuck in the Middle with Data Governance

DQ-BE: Invitation to Duplication

TDWI World Conference Orlando 2010

Light Bulb Moments at DataFlux IDEAS 2010

Enterprise Data World 2010

Enterprise Data World 2009

TDWI World Conference Chicago 2009

DataFlux IDEAS 2009

Stuck in the Middle with Data Governance

Perhaps the most common debate about data governance is whether it should be started from the top down or the bottom up.

Data governance requires the coordination of a complex combination of a myriad of factors, including executive sponsorship, funding, decision rights, arbitration of conflicting priorities, policy definition, policy implementation, data quality remediation, data stewardship, business process optimization, technology, policy enforcement—and obviously many other factors as well.

This common debate is understandable since some of these data governance success factors are mostly top-down (e.g., funding), and some of these data governance success factors are mostly bottom-up (e.g., data quality remediation and data stewardship).

However, the complexity that stymies many organizations is that most data governance success factors are somewhere in the middle.


Stuck in the Middle with Data Governance

At certain times during the evolution of a data governance program, top-down aspects will be emphasized, and at other times, bottom-up aspects will be emphasized.  So whether you start from the top down or the bottom up, eventually you are going to need to blend together top-down and bottom-up aspects in order to sustain an ongoing and pervasive data governance program.

To paraphrase The Beatles, when you get to the bottom, you go back to the top, where you stop and turn, and you go for a ride until you get to the bottom—and then you do it again.  (But hopefully your program doesn’t get code-named: “Helter Skelter”)

But after some initial progress has been made, to paraphrase Stealers Wheel, people within the organization may start to feel like we have top-down to the left of us, bottom-up to the right of us, and here we are, stuck in the middle with data governance.

In other words, data governance is never a direct current flowing only in one top-down or bottom-up direction; it continually flows in an alternating current between the two.  When this dynamic is not communicated to everyone throughout the organization, progress is disrupted by people waiting around for someone else to complete the circuit.

But when, paraphrasing Pearl Jam, data governance is taken up by the middle—then there ain’t gonna be any middle any more.

In other words, when data governance pervades every level of the organization, everyone stops thinking in terms of top-down and bottom-up, and acts like an enterprise in the midst of sustaining the momentum of a successful data governance program.


Data Governance Conference

Next week, I will be attending the Data Governance and Information Quality Conference, which will be held June 27-30 in San Diego, California at the Catamaran Resort Hotel and Spa.

If you will also be attending, and you want to schedule a meeting with me: Contact me via email

If you will not be attending, you can follow the conference tweets using the hashtag: #DGIQ2011


Related Posts

Data Governance Star Wars: Balancing Bureaucracy And Agility

Council Data Governance

DQ-View: Roman Ruts on the Road to Data Governance

The Data Governance Oratorio

Zig-Zag-Diagonal Data Governance

Data Governance and the Buttered Cat Paradox

Beware the Data Governance Ides of March

A Tale of Two G’s

The People Platform

Rise of the Datechnibus

The Collaborative Culture of Data Governance

Connect Four and Data Governance

The Role Of Data Quality Monitoring In Data Governance

Quality and Governance are Beyond the Data

Data Transcendentalism

Podcast: Data Governance is Mission Possible

Video: Declaration of Data Governance

Don’t Do Less Bad; Do Better Good

Jack Bauer and Enforcing Data Governance Policies

The Prince of Data Governance

MacGyver: Data Governance and Duct Tape

The Diffusion of Data Governance

DQ-BE: Invitation to Duplication

Data Quality By Example (DQ-BE) is an OCDQ regular segment that provides examples of data quality key concepts.

I recently received my invitation to the Data Governance and Information Quality Conference, which will be held June 27-30 in San Diego, California at the Catamaran Resort Hotel and Spa.  Well, as shown above, I actually received both of my invitations.

Although my postal address is complete, accurate, and exactly the same on both of the invitations, my name is slightly different (“James” vs. “Jim”), and my title (“Data Quality Journalist” vs. “Blogger-in-Chief”) and company (“IAIDQ” vs. “OCDQ Blog”) are both completely different.  I wonder how many of the data quality software vendors sponsoring this conference would consider my invitations to be duplicates.  (Maybe I’ll use the invitations to perform a vendor evaluation on the exhibit floor.)
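
To make the question concrete, here is a minimal sketch of how a matching routine might decide that the two invitations refer to the same person.  It is written in Python under loud assumptions: the nickname table is a tiny hypothetical lookup, the postal address is made up, and the matching logic (normalized name plus normalized address, ignoring title and company) is just one simple approach, not how any particular vendor's product works.

```python
# Hypothetical nickname lookup; real matching tools use far larger tables.
NICKNAMES = {"jim": "james", "jimmy": "james", "bill": "william", "bob": "robert"}


def normalize_name(name):
    """Lowercase the name and replace known nicknames with a formal form."""
    parts = name.lower().split()
    return " ".join(NICKNAMES.get(part, part) for part in parts)


def normalize_address(address):
    """Lowercase the address, drop commas and periods, collapse whitespace."""
    cleaned = address.lower().replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split())


def likely_duplicates(a, b):
    """Treat two records as duplicates when name and address both match
    after normalization; title and company are deliberately ignored."""
    return (
        normalize_name(a["name"]) == normalize_name(b["name"])
        and normalize_address(a["address"]) == normalize_address(b["address"])
    )


# The names, titles, and companies come from the invitations described above;
# the address is a made-up placeholder.
invite_1 = {
    "name": "James Harris",
    "title": "Data Quality Journalist",
    "company": "IAIDQ",
    "address": "123 Example St., Anytown",
}
invite_2 = {
    "name": "Jim Harris",
    "title": "Blogger-in-Chief",
    "company": "OCDQ Blog",
    "address": "123 Example St., Anytown",
}

print(likely_duplicates(invite_1, invite_2))
```

A rule that also required title and company to match would treat the two invitations as different people, which is presumably what happened here.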

So it would seem that even “The Premier Event in Data Governance and Data Quality” can experience data quality problems.

No worries, I doubt the invitation system will be one of the “Practical Approaches and Success Stories” presented—unless it’s used as a practical approach to a success story about demonstrating how embarrassing it might be to send duplicate invitations to a data quality journalist and blogger-in-chief.  (I wonder if this blog post will affect the approval of my Press Pass for the event.)


Okay, on a far more serious note, you should really consider attending this event.  As the conference agenda shows, there will be great keynote presentations, case studies, tutorials, and other sessions conducted by experts in data governance and data quality, including (among many others) Larry English, Danette McGilvray, Mike Ferguson, David Loshin, and Thomas Redman.


Related Posts

DQ-BE: Dear Valued Customer

Customer Incognita

Identifying Duplicate Customers

Adventures in Data Profiling (Part 7) – Customer Name

The Quest for the Golden Copy (Part 3) – Defining “Customer”

‘Tis the Season for Data Quality

The Seven Year Glitch

DQ-IRL (Data Quality in Real Life)

Data Quality, 50023

Once Upon a Time in the Data

The Semantic Future of MDM

TDWI World Conference Orlando 2010

Last week I attended the TDWI World Conference held November 7-12 in Orlando, Florida at the Loews Royal Pacific Resort.

As always, TDWI conferences offer a variety of full-day and half-day courses taught in an objective, vendor-neutral manner, designed for professionals and taught by in-the-trenches practitioners who are well known in the industry.

In this blog post, I summarize a few key points from two of the courses I attended.  I used Twitter to help me collect my notes, and you can access the complete archive of my conference tweets on Twapper Keeper.


A Practical Guide to Analytics

Wayne Eckerson, author of the book Performance Dashboards: Measuring, Monitoring, and Managing Your Business, described the four waves of business intelligence:

  1. Reporting – What happened?
  2. Analysis – Why did it happen?
  3. Monitoring – What’s happening?
  4. Prediction – What will happen?

“Reporting is the jumping off point for analytics,” explained Eckerson, “but many executives don’t realize this.  The most powerful aspect of analytics is testing our assumptions.”  He went on to differentiate the two strains of analytics:

  1. Exploration and Analysis – Top-down and deductive, primarily uses query tools
  2. Prediction and Optimization – Bottom-up and inductive, primarily uses data mining tools

“A huge issue for predictive analytics is getting people to trust the predictions,” remarked Eckerson.  “Technology is the easy part, the hard part is selling the business benefits and overcoming cultural resistance within the organization.”

“The key is not getting the right answers, but asking the right questions,” he explained, quoting Ken Rudin of Zynga.

“Deriving insight from its unique information will always be a competitive advantage for every organization.”  He recommended the book Competing on Analytics: The New Science of Winning as a great resource for selling the business benefits of analytics.


Data Governance for BI Professionals

Jill Dyché, a partner and co-founder of Baseline Consulting, explained that data governance transcends business intelligence and other enterprise information initiatives such as data warehousing, master data management, and data quality.

“Data governance is the organizing framework,” explained Dyché, “for establishing strategy, objectives, and policies for corporate data.  Data governance is the business-driven policy making and oversight of corporate information.”

“Data governance is necessary,” remarked Dyché, “whenever multiple business units are sharing common, reusable data.”

“Data governance aligns data quality with business measures and acceptance, positions enterprise data issues as cross-functional, and ensures data is managed separately from its applications, thereby evolving data as a service (DaaS).”

In her excellent 2007 article Serving the Greater Good: Why Data Hoarding Impedes Corporate Growth, Dyché explained the need for “systemizing the notion that data – corporate asset that it is – belongs to everyone.”

“Data governance provides the decision rights around the corporate data asset.”


Related Posts

DQ-View: From Data to Decision

Podcast: Data Governance is Mission Possible

The Business versus IT—Tear down this wall!

MacGyver: Data Governance and Duct Tape

Live-Tweeting: Data Governance

Enterprise Data World 2010

Enterprise Data World 2009

TDWI World Conference Chicago 2009

Light Bulb Moments at DataFlux IDEAS 2010

DataFlux IDEAS 2009

DQ-Tip: “Data quality tools do not solve data quality problems...”

Data Quality (DQ) Tips is an OCDQ regular segment.  Each DQ-Tip is a clear and concise data quality pearl of wisdom.

“Data quality tools do not solve data quality problems—People solve data quality problems.”

This DQ-Tip came from the DataFlux IDEAS 2010 Assessing Data Quality Maturity workshop conducted by David Loshin, whose new book The Practitioner's Guide to Data Quality Improvement will be released next month.

Just like all technology, data quality tools are enablers.  Data quality tools provide people with the capability for solving data quality problems, for which there are no fast and easy solutions.  Although incredible advancements in technology continue, there are no Magic Beans for data quality.

And there never will be.

An organization’s data quality initiative can only be successful when people take on the challenge united by collaboration, guided by an effective methodology, and of course, enabled by powerful technology.

By far the most important variable in implementing successful and sustainable data quality improvements is acknowledging David’s sage advice:  people—not tools—solve data quality problems.


Related Posts

DQ-Tip: “There is no such thing as data accuracy...”

DQ-Tip: “Data quality is primarily about context not accuracy...”

DQ-Tip: “There is no point in monitoring data quality...”

DQ-Tip: “Don't pass bad data on to the next person...”

DQ-Tip: “...Go talk with the people using the data”

DQ-Tip: “Data quality is about more than just improving your data...”

DQ-Tip: “Start where you are...”

Scrum Screwed Up

This was the inaugural cartoon on Implementing Scrum by Michael Vizdos and Tony Clark.  It does a great job of illustrating the fable of The Chicken and the Pig, which is used to describe the two types of roles involved in Scrum.  Quite rare for our industry, Scrum is not an acronym; it is one common approach among many iterative, incremental frameworks for agile software development.

Scrum is also sometimes used as a generic synonym for any agile framework.  Although I’m not an expert, I’ve worked on more than a few agile programs.  And since I am fond of metaphors, I will use the Chicken and the Pig to describe two common ways that scrums of all kinds can easily get screwed up:

  1. All Chicken and No Pig
  2. All Pig and No Chicken

However, let’s first establish a more specific context for agile development using one provided by a recent blog post on the topic.


A Contrarian’s View of Agile BI

In her excellent blog post A Contrarian’s View of Agile BI, Jill Dyché took a somewhat unpopular view of a popular view, which is something that Jill excels at, not simply for the sake of doing it, but because she’s always been well-known for telling it like it is.

In preparation for the upcoming TDWI World Conference in San Diego, Jill was pondering the utilization of agile methodologies in business intelligence (aka BI—ah, there’s one of those oh so common industry acronyms straight out of The Acronymicon).

The provocative TDWI conference theme is: “Creating an Agile BI Environment—Delivering Data at the Speed of Thought.”

Now, please don’t misunderstand.  Jill is an advocate for doing agile BI the right way.  And it’s certainly understandable why so many organizations love the idea of agile BI, especially when you consider the slower time to value of most other approaches.  By comparison, following Jill’s rule of thumb, agile BI would have “either new BI functionality or new data deployed (at least) every 60-90 days.  This approach establishes BI as a program, greater than the sum of its parts.”

“But in my experience,” Jill explained, “if the organization embracing agile BI never had established BI development processes in the first place, agile BI can be a road to nowhere.  In fact, the dirty little secret of agile BI is this: It’s companies that don’t have the discipline to enforce BI development rigor in the first place that hurl themselves toward agile BI.”

“Peek under the covers of an agile BI shop,” Jill continued, “and you’ll often find dozens or even hundreds of repeatable canned BI reports, but nary an advanced analytics capability. You’ll probably discover an IT organization that failed to cultivate solid relationships with business users and is now hiding behind an agile vocabulary to justify its own organizational ADD. It’s lack of accountability, failure to manage a deliberate pipeline, and shifting work priorities packaged up as so much scrum.”

I really love the term Organizational Attention Deficit Disorder, and in spite of myself, I can’t help but render it acronymically as OADD—which should be pronounced as “odd” because the “a” is silent, as in: “Our organization is really quite OADD, isn’t it?”


Scrum Screwed Up: All Chicken and No Pig

Returning to the metaphor of the Scrum roles, the pigs are the people with their bacon in the game performing the actual work, and the chickens are the people to whom the results are being delivered.  Most commonly, the pigs are IT or the technical team, and the chickens are the users or the business team.  But these scrum lines are drawn in the sand, and therefore easily crossed.

Many organizations love the idea of agile BI because they are thinking like chickens and not like pigs.  And the agile life is always easier for the chicken because they are only involved, whereas the pig is committed.

OADD organizations often “hurl themselves toward agile BI” because they’re enamored with the theory, but unrealistic about what the practice truly requires.  They’re all-in when it comes to the planning, but bacon-less when it comes to the execution.

This is one common way that OADD organizations can get Scrum Screwed Up—they are All Chicken and No Pig.


Scrum Screwed Up: All Pig and No Chicken

Closer to the point being made in Jill’s blog post, IT can pretend to be pigs making seemingly impressive progress, but although they’re bringing home the bacon, it lacks any real sizzle because it’s not delivering any real advanced analytics to business users. 

Although they appear to be scrumming, IT is really just screwing around with technology, albeit in an agile manner.  However, what good is “delivering data at the speed of thought” when that data is neither what the business is thinking about nor what it truly needs?

This is another common way that OADD organizations can get Scrum Screwed Up—they are All Pig and No Chicken.


Scrum is NOT a Silver Bullet

Scrum—and any other agile framework—is not a silver bullet.  However, agile methodologies can work—and not just for BI.

But whether you want to call it Chicken-Pig Collaboration, or Business-IT Collaboration, or Shiny Happy People Holding Hands, a true enterprise-wide collaboration facilitated by a cross-disciplinary team is necessary for any success—agile or otherwise.

Agile frameworks, when implemented properly, help organizations realistically embrace complexity and avoid oversimplification, by leveraging recurring iterations of relatively short duration that always deliver data-driven solutions to business problems. 

Agile frameworks are successful when people take on the challenge united by collaboration, guided by effective methodology, and supported by enabling technology.  Agile frameworks allow the enterprise to follow what works, for as long as it works, and without being afraid to adjust as necessary when circumstances inevitably change.

For more information about Agile BI, follow Jill Dyché and TDWI World Conference in San Diego, August 15-20 via Twitter.

Enterprise Data World 2010

Enterprise Data World 2010 was held March 14-18 in San Francisco, California at the Hilton San Francisco Union Square.

Congratulations and thanks to Tony Shaw, Maya Stosskopf, the entire Wilshire Conferences staff, as well as Cathy Nolan and everyone with DAMA International, for their outstanding efforts on delivering yet another wonderful conference experience.

I wish I could have attended every session on the agenda, but this blog post provides some quotes from a few of my favorites.


Applying Agile Software Engineering Principles to Data Governance

Conference session by Marty Moseley, CTO of Initiate Systems, an IBM company.

Quotes from the session:

  • “Data governance is 80% people and only 20% technology”
  • “Data governance is an ongoing, evolutionary practice”
  • “There are some organizational problems that are directly caused by poor data quality”
  • “Build iterative 'good enough' solutions – not 'solve world hunger' efforts”
  • “Traditional approaches to data governance try to 'boil the ocean' and solve every data problem”
  • “Agile approaches to data governance laser focus on iteratively solving one problem at a time”
  • “Quality is everything, don't sacrifice accuracy for performance, you can definitely have both”

Seven iterative steps of Agile Data Governance:

  1. “Form the Data Governance Board – Small guidance team of executives who can think cross-organizationally”
  2. “Define the Problem and the Team – Root cause analysis, build the business case, appoint necessary resources”
  3. “Nail Down Size and Scope – Prioritize the scope in order to implement the current iteration in less than 9 months”
  4. “Validate Your Assumptions – Challenge all estimates, perform data profiling, list data quality issues to resolve”
  5. “Establishing Data Policies – Measurable statements of 'what must be achieved' for which kinds of data”
  6. “Implement the data quality solution for the current iteration”
  7. “Evaluate the overall progress and plan for the next iteration”


Monitor the Quality of your Master Data

Conference session by Thomas Ravn, MDM Practice Director at Platon.

Quotes from the session:

  • “Ensure master data is taken into account each and every time a business process or IT system is changed”
  • “Web forms requiring master data attributes can NOT be based on a single country's specific standards”
  • “There is no point in monitoring data quality if no one within the business feels responsible for it”
  • “The greater the business impact of a data quality dimension, the more difficult it is to measure”
  • “Data quality key performance indicators (KPI) should be tied directly to business processes”
  • “Implement a data input validation rule rather than allow bad data to be entered”
  • “Sometimes the business logic is too ambiguous to be enforced by a single data input validation rule”
  • “Data is not always clean or dirty in itself – it depends on the viewpoint or defined standard”
  • “Data quality is in the eye of the beholder”
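
Two of the points above, validating data at input and not basing web forms on a single country's standards, can be sketched together.  The following Python example is only an illustration and was not part of the session: the function name and messages are hypothetical, and the two postal code patterns (five digits with an optional four-digit extension for the US, four digits for Denmark) are simplified.

```python
import re

# Simplified, illustrative patterns; a real form would cover more countries.
POSTAL_PATTERNS = {
    "US": re.compile(r"\d{5}(-\d{4})?"),
    "DK": re.compile(r"\d{4}"),
}


def validate_postal_code(country, postal_code):
    """Return (is_valid, message) for a postal code entered on a form.

    Countries without a specific pattern only need a non-empty value, so
    one country's standard is never imposed on the others.
    """
    value = postal_code.strip()
    if not value:
        return False, "postal code is required"
    pattern = POSTAL_PATTERNS.get(country.upper())
    if pattern is None:
        return True, "no country-specific rule; accepted as entered"
    if pattern.fullmatch(value):
        return True, "ok"
    return False, f"not a valid postal code for {country.upper()}"


for country, code in [("US", "92109"), ("US", "9210"), ("DK", "2100"), ("FR", "75001")]:
    print(country, code, validate_postal_code(country, code))
```

Rejecting the bad value at entry is cheaper than cleansing it later, but as the quotes note, some business logic is too ambiguous for a single rule like this one.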


Measuring the Business Impact of Data Governance

Conference session by Tony Fisher, CEO of DataFlux, and Dr. Walid el Abed, CEO of Global Data Excellence.

Quotes from the session:

  • “The goal of data governance is to position the business to improve”
  • “Revenue optimization, cost control, and risk mitigation are the business drivers of data management”
  • “You don't manage data to manage data, you manage data to improve your business”
  • “Business rules are rules that data should comply with in order to have the process execute properly”
  • “For every business rule, define the main impact (cost of failure) and the business value (result of success)”
  • “Power Shift – Before: Having information is power – Now: Sharing information is power”
  • “You must translate technical details into business language, such as cost, revenue, risk”
  • “Combine near-term fast to value with long-term alignment with business strategy”
  • “Data excellence must be a business value added driven program”
  • “Communication is key to data excellence, make it visible and understood by all levels of the organization”


The Effect of the Financial Meltdown on Data Management

Conference session by April Reeve, Consultant at EMC Consulting.

Quotes from the session:

  • “The recent financial crisis has greatly increased the interest in both data governance and data transparency”
  • “Data Governance is a symbiotic relationship of Business Governance and Technology Governance”
  • “Risk management is a data problem in the forefront of corporate concern – now viewing data as a corporate asset”
  • “Data transparency increases the criticality of data quality – especially regarding the accuracy of financial reporting”


What the Business Wants

Closing Keynote Address by Graeme Simsion, Principal at Simsion & Associates.

Quotes from the keynote:

  • “You can get a lot done if you don't care who gets the credit”
  • “People will work incredibly hard to implement their own ideas”
  • “What if we trust the business to know what's best for the business?”
  • “Let's tell the business what we (as data professionals) do – and then ask the business what they want”


Social Karma

My Badge for Enterprise Data World 2010

I presented this session about the art of effectively using social media in business.

An effective social media strategy is essential for organizations as well as individual professionals.  Using social media effectively can definitely help promote you, your expertise, your company, and its products and services.  However, too many businesses and professionals have a selfish social media strategy.  You should not use social media exclusively to promote yourself or your business.

You need to view social media as Social Karma.

For free related content with no registration required, click on this link: Social Karma


Live-Tweeting at Enterprise Data World 2010

Twitter at Enterprise Data World 2010

The term “live-tweeting” describes using Twitter to provide near real-time reporting from an event.  When a conference schedule has multiple simultaneous sessions, Twitter is great for sharing insights from the sessions you are in with other conference attendees at other sessions, as well as with the on-line community not attending the conference.

Enterprise Data World 2010 had a great group of tweeps (i.e., people using Twitter) and I want to thank all of them, especially the following Super-Tweeps:

Karen Lopez – @datachick

April Reeve – @Datagrrl

Corinna Martinez – @Futureratti

Eva Smith – @datadeva

Alec Sharp – @alecsharp

Ted Louie – @tedlouie

Rob Drysdale – @projmgr

Loretta Mahon Smith – @silverdata 


Additional Resources

Official Website for DAMA International

LinkedIn Group for DAMA International

Twitter Account for DAMA International

Facebook Group for DAMA International

Official Website for Enterprise Data World 2010

LinkedIn Group for Enterprise Data World

Twitter Account for Enterprise Data World

Facebook Group for Enterprise Data World 

Enterprise Data World 2011 will take place in Chicago, Illinois at the Chicago Sheraton and Towers on April 3-7, 2011.


Related Posts

Enterprise Data World 2009

TDWI World Conference Chicago 2009

DataFlux IDEAS 2009

Social Karma (Part 1)

An effective social media strategy is essential for organizations as well as individual professionals.

Using social media effectively, including blogging and social networking sites (e.g., Twitter, Facebook, LinkedIn), can definitely help promote you, your expertise, your company, and its products and services. 

However, it is sad—but true—that too many people and companies have a selfish social media strategy. 

You should not use social media exclusively to promote yourself or your business.

You need to view social media as Social Karma.

If you can focus your social media and social networking efforts on helping others, then you will get much more back than just a blog reader, a LinkedIn connection, a Facebook friend, a Twitter follower, or even a potential customer.


I am not a Social Media Expert—but I play one on the Internet

I am not a social media “expert.”  In fact, until late 2008, I wasn't even interested enough to ask people what they meant when I heard them talking about “social media.”  I started blogging, tweeting, and using other social media in early 2009. 

Please let me do the complex math for you—I still have less than one year of actual experience with social media.

I don't know how you define expertise—and I do acknowledge the inherent difficulty in vetting expertise in such a new and rapidly evolving field—but less than one year of experience with anything does not an expert make, in my humble opinion.

However, I have spent over 15 years in computer science and information technology related disciplines, as a software engineer, consultant, and instructor.  I have considerable experience and expertise applying technology in a business context in order to implement solutions for Global 500 companies in a wide variety of industries. 

Therefore, I am not a complete moron—but I will leave it to you to determine the actual percentage.

I am currently a full-time writer making all of my income from social media—mainly from blogging and mostly from ghostwriting for corporate blogs.

I am not trying to sell you anything. 

I am going to freely share what I have learned so far, including what I have learned from people with far more experience using social media.  As I stated previously, I hesitate to call anyone an expert in such a rapidly evolving discipline, but I will mention several resources I have found helpful. 

I have absolutely no affiliation or any paid relationship with any person, website, event, product, or book that I recommend.


About This Series

The primary reason that I am organizing my thoughts about social media involves my preparation for an upcoming conference presentation about using social media effectively for business purposes (more details in the next section).

I am publishing this content as a series on my blog, not only to provide supporting material for the small group of people who actually attend my conference session, but also because I have learned firsthand how the two-way conversation that blogging provides, via comments from my readers, greatly improves the quality of my material.

Throughout this series, I will combine traditional blog posts with presentation slides, podcasts, and videos, in order to build a multimedia library of supporting material—all freely available, no registration required.


Enterprise Data World 2010

EDW10 Speaker Badge

Enterprise Data World is the business world’s most comprehensive vendor-neutral educational event about data and information management.  This year’s program will be bigger than ever before, with more sessions, more case studies, and more can’t-miss content, providing over 200 hours of in-depth tutorials, hands-on workshops, practical sessions and insightful keynotes to take you to the forefront of your industry.   

Enterprise Data World 2010 will be held March 14-18 in San Francisco, California at the Hilton San Francisco Union Square.

The full conference agenda can be viewed by clicking on this link: Enterprise Data World 2010 Conference Agenda.

The registration options can be viewed by clicking on this link: Enterprise Data World 2010 Conference Registration

Use the discount code of EDW10SPKR for a $100 discount off your registration fees. (Discount code expires on February 26.)

On Monday, March 15 from 5:00 PM – 6:00 PM, I will be presenting (30 minutes of material and 30 minutes of Q&A):

Social Karma: The Art of Effectively Using Social Media in Business

In Part 2 of this series:  We will discuss leveraging social media for “listening purposes only” as a passive (and safe) way to determine what (if any) type of active involvement with social media makes sense for you and/or your company.


Related Posts

Social Karma (Part 2) – Social Media Preparation

Social Karma (Part 3) – Listening Stations, Home Base, and Outposts

Social Karma (Part 4) – Blogging Best Practices

Social Karma (Part 5) – Connection, Engagement, and ROI Basics

Social Karma (Part 6) – Social Media Books

Social Karma (Part 7) – Twitter

Live-Tweeting: Data Governance

The term “live-tweeting” describes using Twitter to provide near real-time reporting from an event.  I live-tweet from the sessions I attend at industry conferences as well as interesting webinars.

Recently, I live-tweeted Successful Data Stewardship Through Data Governance, which was a data governance webinar featuring Marty Moseley of Initiate Systems and Jill Dyché of Baseline Consulting.

Instead of writing a blog post summarizing the webinar, I thought I would list my tweets with brief commentary.  My goal is to provide an example of this particular use of Twitter so you can decide its value for yourself.


As the webinar begins, Marty Moseley and Jill Dyché provide some initial thoughts on data governance:

Live-Tweets 1


Jill Dyché provides a great list of data governance myths and facts:

Live-Tweets 2


Jill Dyché provides some data stewardship insights:

Live-Tweets 3


As the webinar ends, Marty Moseley and Jill Dyché provide some closing thoughts about data governance and data quality:

Live-Tweets 4


Please Share Your Thoughts

If you attended the webinar, then you know additional material was presented.  Did my tweets do the webinar justice?  Did you follow along on Twitter during the webinar?  If you did not attend the webinar, then are these tweets helpful?

What are your thoughts in general regarding the pros and cons of live-tweeting? 


Related Posts

The following three blog posts are conference reports based largely on my live-tweets from the events:

Enterprise Data World 2009

TDWI World Conference Chicago 2009

DataFlux IDEAS 2009

DQ-Tip: “...Go talk with the people using the data”

Data Quality (DQ) Tips is an OCDQ regular segment.  Each DQ-Tip is a clear and concise data quality pearl of wisdom.

“In order for your data quality initiative to be successful, you must:

Walk away from the computer and go talk with the people using the data.”

This DQ-Tip came from the TDWI World Conference Chicago 2009 presentation Modern Data Quality Techniques in Action by Gian Di Loreto from Loreto Services and Technologies.

As I blogged about in Data Gazers (borrowing that excellent phrase from Arkady Maydanchik), within cubicles randomly dispersed throughout the sprawling office space of companies large and small, there exist countless unsung heroes of data quality initiatives.  Although their job titles might be labeling them as a Business Analyst, Programmer Analyst, Account Specialist or Application Developer, their true vocation is a far more noble calling.  They are Data Gazers.

A most bizarre phenomenon (that I have witnessed too many times) is that as a data quality initiative “progresses” it tends to get further and further away from the people who use the data on a daily basis.

Please follow the excellent advice of Gian and Arkady — go talk with your users. 

Trust me — everyone on your data quality initiative will be very happy that you did.


Related Posts

DQ-Tip: “Data quality is primarily about context not accuracy...”

DQ-Tip: “Don't pass bad data on to the next person...”

Worthy Data Quality Whitepapers (Part 1)

In my April blog post Data Quality Whitepapers are Worthless, I called for data quality whitepapers that are worth reading.

This post will be the first in an ongoing series about data quality whitepapers that I have read and can endorse as worthy.


It is about the data – the quality of the data

This is the subtitle of two brief but informative data quality whitepapers freely available (no registration required) from the Electronic Commerce Code Management Association (ECCMA): Transparency and Data Portability.



ECCMA is an international association of industry and government master data managers working together to increase the quality and lower the cost of descriptions of individuals, organizations, goods and services through developing and promoting International Standards for Master Data Quality. 

Formed in April 1999, ECCMA has brought together thousands of experts from around the world and provides them a means of working together in the fair, open and extremely fast environment of the Internet to build and maintain the global, open standard dictionaries that are used to unambiguously label information.  The existence of these dictionaries of labels allows information to be passed from one computer system to another without losing meaning.


Peter Benson

The author of the whitepapers is Peter Benson, the Executive Director and Chief Technical Officer of the ECCMA.  Peter is an expert in distributed information systems, content encoding and master data management.  He designed one of the very first commercial electronic mail software applications, WordStar Messenger, and was granted a landmark British patent in 1992 covering the use of electronic mail systems to maintain distributed databases.

Peter designed and oversaw the development of a number of strategic distributed database management systems used extensively in the UK and US by the Public Relations and Media Industries.  From 1994 to 1998, Peter served as the elected chairman of the American National Standards Institute Accredited Committee ANSI ASCX 12E, the Standards Committee responsible for the development and maintenance of the EDI standard for product data.

Peter is known for the design, development and global promotion of the UNSPSC as an internationally recognized commodity classification and more recently for the design of the eOTD, an internationally recognized open technical dictionary based on the NATO codification system.

Peter is an expert in the development and maintenance of Master Data Quality as well as an internationally recognized proponent of Open Standards that he believes are critical to protect data assets from the applications used to create and manipulate them. 

Peter is the Project Leader for ISO 8000, which is a new international standard for data quality.

ISO 8000 is the international standard for data quality.  You can get more information by clicking on this link: ISO 8000


Whitepaper Excerpts

Excerpts from Transparency:

  • “Today, more than ever before, our access to data, the ability of our computer applications to use it and the ultimate accuracy of the data determines how we see and interact with the world we live and work in.”
  • “Data is intrinsically simple and can be divided into data that identifies and describes things, master data, and data that describes events, transaction data.”
  • “Transparency requires that transaction data accurately identifies who, what, where and when and master data accurately describes who, what and where.”


Excerpts from Data Portability:

  • “In an environment where the life cycle of software applications used to capture and manage data is but a fraction of the life cycle of the data itself, the issues of data portability and long-term data preservation are critical.”
  • “Claims that an application exports data in XML does address the syntax part of the problem, but that is the easy part.  What is required is to be able to export all of the data in a form that can be easily uploaded into another application.”
  • “In a world rapidly moving towards SaaS and cloud computing, it really pays to pause and consider not just the physical security of your data but its portability.”


TDWI World Conference Chicago 2009

Founded in 1995, TDWI (The Data Warehousing Institute™) is the premier educational institute for business intelligence and data warehousing that provides education, training, certification, news, and research for executives and information technology professionals worldwide.  TDWI conferences always offer a variety of full-day and half-day courses taught in an objective, vendor-neutral manner.  The courses taught are designed for professionals and taught by in-the-trenches practitioners who are well known in the industry.


TDWI World Conference Chicago 2009 was held May 3-8 in Chicago, Illinois at the Hyatt Regency Hotel and was a tremendous success.  I attended as a Data Quality Journalist for the International Association for Information and Data Quality (IAIDQ).

I used Twitter to provide live reporting from the conference.  Here are my notes from the courses I attended: 


BI from Both Sides: Aligning Business and IT

Jill Dyché, CBIP, is a partner and co-founder of Baseline Consulting, a management and technology consulting firm that provides data integration and business analytics services.  Jill is responsible for delivering industry and client advisory services, is a frequent lecturer and writer on the business value of IT, and writes the excellent Inside the Biz blog.  She is the author of acclaimed books on the business value of information: e-Data: Turning Data Into Information With Data Warehousing and The CRM Handbook: A Business Guide to Customer Relationship Management.  Her latest book, written with Evan Levy, is Customer Data Integration: Reaching a Single Version of the Truth.

Course Quotes from Jill Dyché:

  • Five Critical Success Factors for Business Intelligence (BI):
    1. Organization - Build organizational structures and skills to foster a sustainable program
    2. Processes - Align both business and IT development processes that facilitate delivery of ongoing business value
    3. Technology - Select and build technologies that deploy information cost-effectively
    4. Strategy - Align information solutions to the company's strategic goals and objectives
    5. Information - Treat data as an asset by separating data management from technology implementation
  • Three Different Requirement Categories:
    1. What is the business need, pain, or problem?  What business questions do we need to answer?
    2. What data is necessary to answer those business questions?
    3. How do we need to use the resulting information to answer those business questions?
  • “Data warehouses are used to make business decisions based on data – so data quality is critical”
  • “Even companies with mature enterprise data warehouses still have data silos - each business area has its own data mart”
  • “Instead of pushing a business intelligence tool, just try to get people to start using data”
  • “Deliver a usable system that is valuable to the business and not just a big box full of data”


TDWI Data Governance Summit

Philip Russom is the Senior Manager of Research and Services at TDWI, where he oversees many of TDWI’s research-oriented publications, services, and events.  Prior to joining TDWI in 2005, he was an industry analyst covering BI at Forrester Research, as well as a contributing editor with Intelligent Enterprise and Information Management (formerly DM Review) magazines.

Summit Quotes from Philip Russom:

  • “Data Governance usually boils down to some form of control for data and its usage”
  • “Four Ps of Data Governance: People, Policies, Procedures, Process”
  • “Three Pillars of Data Governance: Compliance, Business Transformation, Business Integration”
  • “Two Foundations of Data Governance: Business Initiatives and Data Management Practices”
  • “Cross-functional collaboration is a requirement for successful Data Governance”


Becky Briggs, CBIP, CMQ/OE, is a Senior Manager and Data Steward for Airlines Reporting Corporation (ARC) and has 25 years of experience in data processing and IT - the last 9 in data warehousing and BI.  She leads the program team responsible for product, project, and quality management, business line performance management, and data governance/stewardship.

Summit Quotes from Becky Briggs:

  • “Data Governance is the act of managing the organization's data assets in a way that promotes business value, integrity, usability, security and consistency across the company”
  • Five Steps of Data Governance:
    1. Determine what data is required
    2. Evaluate potential data sources (internal and external)
    3. Perform data profiling and analysis on data sources
    4. Data Services - Definition, modeling, mapping, quality, integration, monitoring
    5. Data Stewardship - Classification, access requirements, archiving guidelines
  • “You must realize and accept that Data Governance is a program and not just a project”


Barbara Shelby is a Senior Software Engineer for IBM with over 25 years of experience holding positions of technical specialist, consultant, and line management.  Her global management and leadership positions encompassed network authentication, authorization application development, corporate business systems data architecture, and database development.

Summit Quotes from Barbara Shelby:

  • Four Common Barriers to Data Governance:
    1. Information - Existence of information silos and inconsistent data meanings
    2. Organization - Lack of end-to-end data ownership and organization cultural challenges
    3. Skill - Difficulty shifting resources from operational to transformational initiatives
    4. Technology - Business data locked in large applications and slow deployment of new technology
  • Four Key Decision Making Bodies for Data Governance:
    1. Enterprise Integration Team - Oversees the execution of CIO funded cross enterprise initiatives
    2. Integrated Enterprise Assessment - Responsible for the success of transformational initiatives
    3. Integrated Portfolio Management Team - Responsible for making ongoing business investment decisions
    4. Unit Architecture Review - Responsible for the IT architecture compliance of business unit solutions


Lee Doss is a Senior IT Architect for IBM with over 25 years of information technology experience.  He holds a patent for a process of aligning strategic capability for business transformation, and he has held various positions including strategy, design, development, and customer support for IBM networking software products.

Summit Quotes from Lee Doss:

  • Five Data Governance Best Practices:
    1. Create a sense of urgency that the organization can rally around
    2. Start small, grow fast...pick a few visible areas to set an example
    3. Sunset legacy systems (application, data, tools) as new ones are deployed
    4. Recognize the importance of organization culture…this will make or break you
    5. Always, always, always – Listen to your customers


Kevin Kramer is a Senior Vice President and Director of Enterprise Sales for UMB Bank and is responsible for development of sales strategy, sales tool development, and implementation of enterprise-wide sales initiatives.

Summit Quotes from Kevin Kramer:

  • “Without Data Governance, multiple sources of customer information can produce multiple versions of the truth”
  • “Data Governance helps break down organizational silos and shares customer data as an enterprise asset”
  • “Data Governance provides a roadmap that translates into best practices throughout the entire enterprise”


Kanon Cozad is a Senior Vice President and Director of Application Development for UMB Bank and is responsible for overall technical architecture strategy and oversees information integration activities.

Summit Quotes from Kanon Cozad:

  • “Data Governance identifies business process priorities and then translates them into enabling technology”
  • “Data Governance provides direction and Data Stewardship puts direction into action”
  • “Data Stewardship identifies and prioritizes applications and data for consolidation and improvement”


Jill Dyché, CBIP, is a partner and co-founder of Baseline Consulting, a management and technology consulting firm that provides data integration and business analytics services.  (For Jill's complete bio, please see above).

Summit Quotes from Jill Dyché:

  • “The hard part of Data Governance is the data”
  • “No data will be formally sanctioned unless it meets a business need”
  • “Data Governance focuses on policies and strategic alignment”
  • “Data Management focuses on translating defined policies into executable actions”
  • “Entrench Data Governance in the development environment”
  • “Everything is customer data – even product and financial data”


Data Quality Assessment - Practical Skills

Arkady Maydanchik is a co-founder of Data Quality Group, a recognized practitioner, author, and educator in the field of data quality and information integration.  Arkady's data quality methodology and breakthrough ARKISTRA technology were used to provide services to numerous organizations.  Arkady is the author of the excellent book Data Quality Assessment, a frequent speaker at various conferences and seminars, and a contributor to many journals and online publications.  Data quality curriculum by Arkady Maydanchik can be found at eLearningCurve.

Course Quotes from Arkady Maydanchik:

  • “Nothing is worse for data quality than desperately trying to fix it during the last few weeks of an ETL project”
  • “Quality of data after conversion is in direct correlation with the amount of knowledge about actual data”
  • “Data profiling tools do not do data profiling - it is done by data analysts using data profiling tools”
  • “Data Profiling does not answer any questions - it helps us ask meaningful questions”
  • “Data quality is measured by its fitness to the purpose of use – it's essential to understand how data is used”
  • “When data has multiple uses, there must be data quality rules for each specific use”
  • “Effective root cause analysis requires not stopping after the answer to your first question - Keep asking: Why?”
  • “The central product of a Data Quality Assessment is the Data Quality Scorecard”
  • “Data quality scores must be both meaningful to a specific data use and be actionable”
  • “Data quality scores must estimate both the cost of bad data and the ROI of data quality initiatives”


Modern Data Quality Techniques in Action - A Demonstration Using Human Resources Data

Gian Di Loreto formed Loreto Services and Technologies in 2004 from the client services division of Arkidata Corporation.  Loreto Services provides data cleansing and integration consulting services to Fortune 500 companies.  Gian is a classically trained scientist - he received his PhD in elementary particle physics from Michigan State University.

Course Quotes from Gian Di Loreto:

  • “Data Quality is rich with theory and concepts – however it is not an academic exercise, it has real business impact”
  • “To do data quality well, you must walk away from the computer and go talk with the people using the data”
  • “Undertaking a data quality initiative demands developing a deeper knowledge of the data and the business”
  • “Some essential data quality rules are ‘hidden’ and can only be discovered by ‘clicking around’ in the data”
  • “Data quality projects are not about systems working together - they are about people working together”
  • “Sometimes, data quality can be ‘good enough’ for source systems but not when integrated with other systems”
  • “Unfortunately, no one seems to care about bad data until they have it”
  • “Data quality projects are only successful when you understand the problem before trying to solve it”


Mark Your Calendar

TDWI World Conference San Diego 2009 - August 2-7, 2009.

TDWI World Conference Orlando 2009 - November 1-6, 2009.

TDWI World Conference Las Vegas 2010 - February 21-26, 2010.