“It is a truth universally acknowledged, that a blogger in possession of a good domain must be in want of some worthwhile comments.”
“The most rewarding thing has been that comments,” explained Hanagarne, “led to me meeting some great people I possibly never would have known otherwise.” I wholeheartedly echo that sentiment.
This is the second entry in my ongoing series celebrating my heroes – my readers.
“This article is intriguing. I would add more still.
A most significant quote: 'Data could be considered a constant while Information is a variable that redefines data for each specific use.'
This tells us that Information draws from a snapshot of a Data store. I would state further that the very Information [specification] is – in itself – a snapshot.
The earlier quote continues: 'Data is not truly a constant since it is constantly changing.'
Similarly, it is a business reality that 'Information is not truly a constant since it is constantly changing.'
The article points out that 'The Data-Information Continuum' implies a many-to-many relationship between the two. This is a sensible CONCEPTUAL model.
Enterprise Architecture is concerned as well with its responsibility for application quality in service to each Business Unit/Initiative.
For example, in the interest of quality design in Application Architecture, an additional LOGICAL model must be maintained between a then-current Information requirement and the particular Data (snapshots) from which it draws. [Snapshot: generally understood as captured and frozen – and uneditable – at a particular point in time.] Simply put, Information Snapshots have a PARENT RELATIONSHIP to the Data Snapshots from which they draw.
Analyzing this further, refer to this further piece of quoted wisdom (from section 'Subjective Information Quality'): '...business units and initiatives must begin defining their Information...by using...Data...as a foundation...necessary for the day-to-day operation of each business unit and initiative.'
From logically-related snapshots of Information to the Data from which it draws, we can see from this quote that yet another PARENT/CHILD relationship exists...that from Business Unit/Initiative Snapshots to the Information Snapshots that implement whatever goals are the order of the day. But days change.
If it is true that 'Data is not truly a constant since it is constantly changing,' and if we can agree that Information is not truly a constant either, then we can agree to take a rational and profitable leap to the truth that neither is a Business Unit/Initiative...since these undergo change as well, though they represent more slowly-changing dimensions.
Enterprises have an increasing responsibility for regulatory/compliance/archival systems that will qualitatively reproduce the ENTIRE snapshot of a particular operational transaction at any given point in time.
Thus, the Enterprise Architecture function has before it a daunting task: to devise a holistic process that can SEAMLESSLY model the correct relationship of snapshots between Data (grandchild), Information (parent) and Business Unit/Initiative (grandparent).
There need be no conversion programs or redundant, throw-away data structures contrived to bridge the present gap. The ability to capture the activities resulting from the undeniable point-in-time hierarchy among these entities is where tremendous opportunities lie.”
“My favorite quote is 'Instead of focusing on the exceptions – focus on the improvements.'
I think that it is really important to define incremental goals for data quality projects and track the progress through percentage improvement over a period of time.
I think it is also important to manage the expectations that the goal is not necessarily to reach 100% (which will be extremely difficult if not impossible) clean data but the goal is to make progress to a point where the purpose for cleaning the data can be achieved in much better way than had the original data been used.
For example, if marketing wanted to use the contact data to create a campaign for those contacts which have a certain ERP system installed on-site. But if the ERP information on the contact database is not clean (it is free text, in some cases it is absent etc...) then any campaign run on this data will reach only X% contacts at best (assuming only X% of contacts have ERP which is clean)...if the data quality project is undertaken to clean this data, one needs to look at progress in terms of % improvement. How many contacts now have their ERP field cleaned and legible compared to when we started etc...and a reasonable goal needs to be set based on how much marketing and IT is willing to invest in these issues (which in turn could be based on ROI of the campaign based on increased outreach).”
Proving that my readers are way smarter than I am, on The General Theory of Data Quality, John O'Gorman commented:
“My theory of the data, information, knowledge continuum is more closely related to the element, compound, protein, structure arc.
In my world, there is no such thing as 'bad' data, just as there is no 'bad' elements. Data is either useful or not: the larger the audience that agrees that a string is representative of something they can use, the more that string will be of value to me.
By dint of its existence in the world of human communication and in keeping with my theory, I can assign every piece of data to one of a fixed number of classes, each with characteristics of their own, just like elements in the periodic table. And, just like the periodic table, those characteristics do not change. The same 109 usable elements in the periodic table are found and are consistent throughout the universe, and our ability to understand that universe is based on that stability.
Information is simply data in a given context, like a molecule of carbon in flour. The carbon retains all of its characteristics but the combination with other elements allows it to partake in a whole class of organic behavior. This is similar to the word 'practical' occurring in a sentence: Jim is a practical person or the letter 'p' in the last two words.
Where the analogue bends a bit is a cause of a lot of information management pain, but can be rectified with a slight change in perspective. Computers (and almost all indexes) have a hard time with homographs: strings that are identical but that mean different things. By creating fixed and persistent categories of data, my model suffers no such pain.
Take the word 'flies' in the following: 'Time flies like an arrow.' and 'Fruit flies like a pear.' The data 'flies' can be permanently assigned to two different places, and their use determines which instance is relevant in the context of the sentence. One instance is a verb, the other a plural noun.
Knowledge, in my opinion, is the ability to recognize, predict and synthesize patterns of information for past, present and future use, and more importantly to effectively communicate those patterns in one or more contexts to one or more audiences.
On one level, the model for information management that I use makes no apparent distinction between the data: we all use nouns, adjectives, verbs and sometimes scalar objects to communicate. We may compress those into extremely compact concepts but they can all be unraveled to get at elemental components. At another level every distinction is made to insure precision.
The difference between information and knowledge is experiential and since experience is an accumulative construct, knowledge can be layered to appeal to common knowledge, special knowledge and unique knowledge.
Common being the most easily taught and widely applied; Special being related to one or more disciplines and/or special functions; and, Unique to individuals who have their own elevated understanding of the world and so have a need for compact and purpose-built semantic structures.
Going back to the analogue, knowledge is equivalent to the creation by certain proteins of cartilage, the use to which that cartilage is put throughout a body, and the specific shape of the cartilage that forms my nose as unique from the one on my wife's face.
To me, the most important part of the model is at the element level. If I can convince a group of people to use a fixed set of elemental categories and to reference those categories when they create information, it's amazing how much tension disappears in the design, creation and deployment of knowledge.”
Tá mé buíoch díot
Daragh O Brien recently taught me the Irish Gaelic phrase Tá mé buíoch díot, which translates as I am grateful to you.
I am very grateful to all of my readers. Since there have been so many commendable comments, please don't be offended if your commendable comment hasn't been featured yet. Please keep on commenting and stay tuned for future entries in the series.