Customer Incognita

Many enterprise information initiatives are launched in order to unravel that riddle, wrapped in a mystery, inside an enigma, that great unknown, also known as...Customer.

Centuries ago, cartographers used the Latin phrase terra incognita (meaning “unknown land”) to mark regions on a map not yet fully explored.  In this century, companies simply cannot afford to use the phrase customer incognita to indicate what information about their existing (and prospective) customers they don't currently have or don't properly understand.


What is a Customer?

First things first, what exactly is a customer?  Those happy people who give you money?  Those angry people who yell at you on the phone or say really mean things about your company on Twitter and Facebook?  Why do they have to be so mean? 

Mean people suck.  However, companies that don't understand their customers also suck.  And surely you don't want to be one of those companies, do you?  I didn't think so.

Getting back to the question, here are some insights from the Data Quality Pro discussion forum topic What is a customer?:

  • Someone who purchases products or services from you.  The word “someone” is key because it’s not the role of a “customer” that forms the real problem, but the precision of the term “someone” that causes challenges when we try to link other and more specific roles to that “someone.”  These other roles could be contract partner, payer, receiver, user, owner, etc.
  • Customer is a role assigned to a legal entity in a complete and precise picture of the real world.  The role is established when the first purchase is accepted from this real-world entity.  Of course, the main challenge is whether or not the company can establish and maintain a complete and precise picture of the real world.

These working definitions were provided by fellow blogger and data quality expert Henrik Liliendahl Sørensen, who recently posted 360° Business Partner View, which further examines the many different ways a real-world entity can be represented, including when, instead of a customer, the real-world entity represents a citizen, patient, member, etc.

A critical first step for your company is to develop your definition of a customer.  Don't underestimate either the importance or the difficulty of this process.  And don't assume it is simply a matter of semantics.

Some of my consulting clients have indignantly told me: “We don't need to define it, everyone in our company knows exactly what a customer is.”  I usually respond: “I have no doubt that everyone in your company uses the word customer; however, I will work for free if everyone defines the word customer in exactly the same way.”  So far, I haven't had to work for free.


How Many Customers Do You Have?

You have done the due diligence and developed your definition of a customer.  Excellent!  Nice work.  Your next challenge is determining how many customers you have.  Hopefully, you are not going to try using any of these techniques:

  • SELECT COUNT(*) AS "We have this many customers" FROM Customers
  • SELECT COUNT(DISTINCT Name) AS "No wait, we really have this many customers" FROM Customers
  • Middle-Square or Blum Blum Shub methods (i.e. random number generation)
  • Magic 8-Ball says: “Ask again later”

One of the most common and challenging data quality problems is the identification of duplicate records, especially redundant representations of the same customer information within and across systems throughout the enterprise.  The need for a solution to this specific problem is one of the primary reasons that companies invest in data quality software and services.
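To see why the naive counts above mislead, here is a minimal Python sketch. The sample records, the field names, and the match_key helper are all hypothetical, invented purely for illustration, and the normalization is deliberately crude. Real matching rules are a business decision, not something this snippet settles.

```python
import re

# Hypothetical sample records; names and postal codes are invented for illustration.
customers = [
    {"name": "Robert Smith", "postal_code": "02134"},
    {"name": "ROBERT  SMITH", "postal_code": "02134"},
    {"name": "Smith, Robert", "postal_code": "02134"},
    {"name": "Robert Smith", "postal_code": "90210"},
]

def match_key(record):
    """Build a crude match key: lowercase the name, drop punctuation and
    extra spaces, sort the name tokens, and pair them with the postal code."""
    tokens = re.findall(r"[a-z]+", record["name"].lower())
    return (" ".join(sorted(tokens)), record["postal_code"])

row_count = len(customers)                              # like COUNT(*)
distinct_names = len({r["name"] for r in customers})    # like COUNT(DISTINCT Name)
distinct_keys = len({match_key(r) for r in customers})  # after crude normalization

print(row_count, distinct_names, distinct_keys)
```

The three counts disagree.  The row count treats every record as a customer, the distinct name count splits one customer into several spellings while merging two records that share a name but not a postal code, and the normalized count depends entirely on the assumed rule.  Whether the two Robert Smiths at different postal codes are the same customer is exactly the kind of question your business rules have to answer.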

Earlier this year on Data Quality Pro, I published a five-part series of articles on identifying duplicate customers, which focused on the methodology for defining your business rules and illustrated some of the common data matching challenges.

Topics covered in the series:

  • Why a symbiosis of technology and methodology is necessary when approaching this challenge
  • How performing a preliminary analysis on a representative sample of real data prepares effective examples for discussion
  • Why using a detailed, interrogative analysis of those examples is imperative for defining your business rules
  • How both false negatives and false positives illustrate the highly subjective nature of this problem
  • How to document your business rules for identifying duplicate customers
  • How to set realistic expectations about application development
  • How to foster a collaboration of the business and technical teams throughout the entire project
  • How to consolidate identified duplicates by creating a “best of breed” representative record
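As a rough illustration of the last topic, consolidating duplicates into a “best of breed” representative record, here is a minimal Python sketch.  The records, the field names, and the survivorship rule (keep the most recently updated non-empty value for each field) are assumptions made for this example and are not the rules from the series.  Your own rules may favor a trusted source system, the most complete value, or something else entirely.

```python
from datetime import date

# Hypothetical records already identified as duplicates of one customer.
duplicates = [
    {"name": "Robert Smith", "email": "", "phone": "555-0100",
     "updated": date(2008, 3, 1)},
    {"name": "Bob Smith", "email": "rsmith@example.com", "phone": "",
     "updated": date(2009, 6, 15)},
]

def consolidate(records, fields):
    """Assumed survivorship rule: for each field, keep the most recently
    updated non-empty value; fall back to an empty string if none exists."""
    newest_first = sorted(records, key=lambda r: r["updated"], reverse=True)
    best = {}
    for field in fields:
        best[field] = next((r[field] for r in newest_first if r[field]), "")
    return best

best_record = consolidate(duplicates, ["name", "email", "phone"])
print(best_record)
```

Note that the representative record combines values from both source records: the name and email come from the newer record, while the phone number survives from the older one because the newer record has none.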

To read the series, please follow these links:

To download the associated presentation (no registration required), please follow this link: OCDQ Downloads



“Knowing the characteristics of your customers,” stated Jill Dyché and Evan Levy in the opening chapter of their excellent book, Customer Data Integration: Reaching a Single Version of the Truth, “who they are, where they are, how they interact with your company, and how to support them, can shape every aspect of your company's strategy and operations.  In the information age, there are fewer excuses for ignorance.”

For companies of every size and within every industry, customer incognita is a crippling condition that must be replaced with customer cognizance if the company is to remain competitive in a rapidly changing marketplace.

Do you know your customers?  If not, then they likely aren't your customers anymore.