<?xml version="1.0" encoding="UTF-8"?>
<!--Generated by Squarespace V5 Site Server v5.13.156 (http://www.squarespace.com) on Sun, 19 May 2013 16:51:29 GMT--><?xml-stylesheet type="text/css" href="/universal/styles/feed.css"?><rss version="2.0"><channel><title>OCDQ Blog Feed - Comments</title><link>http://www.ocdqblog.com/home/</link><description>Obsessive-Compulsive Data Quality Blog</description><copyright>Copyright Jim Harris 2009-2011</copyright><language>en-US</language><generator>Squarespace V5 Site Server v5.13.156 (http://www.squarespace.com)</generator><item><title>Nathan Lowenthal comments on The Need for Data Philosophers</title><author>Nathan Lowenthal</author><pubDate>Fri, 17 May 2013 21:49:39 +0000</pubDate><link>http://www.ocdqblog.com/home/the-need-for-data-philosophers.html#comments</link><guid isPermaLink="false">327252:3438475:comment/20025696</guid><description><![CDATA[<p>Insightful article; however, I believe that you seem to miss (as many do) the deeper and more important role a Data Philosopher can play in the Data Sciences. In both the asking of questions [queries] and in the acquisition of both outside answers/information, and even in the analysis of pre-existing data, understanding and being able to manipulate the various dynamics is both aided and enhanced by a deep understanding of Philosophical Logic. Not the basics that computer programmers, engineers, lawyers, and the like study, but rather Philosophical Logic at a much more expansive level.</p><p>In devising entire data-centric systems such Data Philosophers would be tasked with devising the constructs on a logical basis with the type of depth and variability to both capture the &quot;real world&quot; and humanistic aspects, but also allow such systems to attain a malleable quality that can be lacking when dealing with (especially customizable) constructions. 
Simple surveys and questions could be cross-deployed, while the insights that are gleaned can become a cross-disciplinary affair -- a vexing problem for anyone who has dealt with older data, or poorly designed systems when trying to integrate them into larger fabrics.</p><p>The ability to both reach deeply to inquire about the most vexing questions and seek to answer the most abstruse problems would be greatly aided by this thinking. Without the care in logically evaluating, and constructing, our future data systems as such, we are destined -- at some relatively far-off point -- to have to come to grips with systematic inaccuracies in our analysis that will not be easily disembroiled.</p>]]></description></item><item><title>Karen A. Way comments on The Need for Data Philosophers</title><author>Karen A. Way</author><pubDate>Fri, 17 May 2013 12:21:27 +0000</pubDate><link>http://www.ocdqblog.com/home/the-need-for-data-philosophers.html#comments</link><guid isPermaLink="false">327252:3438475:comment/20024178</guid><description><![CDATA[<p>As always, Jim, another great post. It&#39;s interesting to note the new nomenclature that is emerging regarding job titles for those of us in the data space.  No matter whether you deal with the hardware/infrastructure, the visualization, the interpretation or the preparation of the data, using the term &quot;data management professional&quot; just doesn&#39;t seem to be enough any longer.  For those of us who have been around for a while, it may be difficult to define ourselves (or what we do) in the more specific niches of data artist, data philosopher, etc.  Yes, this old dog is still working to learn the new tricks that are the result of the ever-changing world of data management.  
:-)</p>]]></description></item><item><title>Lawrence Serewicz comments on The Need for Data Philosophers</title><author>Lawrence Serewicz</author><pubDate>Thu, 16 May 2013 23:18:33 +0000</pubDate><link>http://www.ocdqblog.com/home/the-need-for-data-philosophers.html#comments</link><guid isPermaLink="false">327252:3438475:comment/20023255</guid><description><![CDATA[<p>Thanks for an interesting post. I am not sure why you distinguish science from philosophy. The ancient view was that philosophy is the origin of science. I would suggest that the idea that science and philosophy are separated is a modern conceit and more indicative of an anti-philosophical position than a scientific position. In particular, philosophy strives to explain the whole while science, at least modern natural science, is reduced to empiricism or simply being a method for solving problems.  One could go so far as to say that modern natural science shorn of philosophy or political philosophy is empty of meaning and becomes a technique rather than a proper method of inquiry.</p><p>I suppose that Heidegger and Husserl have shown the vacuity of modern scientists who can destroy the world but have no scientific way of explaining why the world is to be saved. In this I am reminded of Leo Strauss&#39;s article on Philosophy as a rigorous science: <a href="http://archive.org/stream/LeoStraussphilosophyAsRigorousSciencePoliticalPhilosophy1971/Strauss-PhilosophyAsRigorousScience1971_djvu.txt" rel="nofollow">http://archive.org/stream/LeoStraussphilosophyAsRigorousSciencePoliticalPhilosophy1971/Strauss-PhilosophyAsRigorousScience1971_djvu.txt</a></p><p>The challenge of data or big data is that it pushes us deeper into the cave. We try to make sense of philosophical problems (e.g., &quot;what does it mean?&quot;, &quot;Is it good?&quot;) with technology.  
One would be better served by reading Heidegger&#39;s Question concerning technology to see that we need to return to the origins of technology so we can understand how to live with it. Yet, even that simple political philosophical question &quot;What is the best way to live&quot; is not a question that is open to a modern scientific answer unless one fundamentally reconsiders science as philosophy rather than science opposed to philosophy. Until we can begin to overcome that false divide there will be no data philosophers properly understood.</p><p>Thanks for a stimulating post.</p>]]></description></item><item><title>Mike Urbonas comments on The Laugh-In Effect of Big Data</title><author>Mike Urbonas</author><pubDate>Tue, 30 Apr 2013 12:49:25 +0000</pubDate><link>http://www.ocdqblog.com/home/the-laugh-in-effect-of-big-data.html#comments</link><guid isPermaLink="false">327252:3438475:comment/19982504</guid><description><![CDATA[<p>Nice post, Jim. In addition to Arte Johnson’s &quot;Very interesting . . . but stupid!&quot; feel free to throw in Jo Anne Worley’s &quot;BORRRRRRRRING!&quot; from time to time :)</p>]]></description></item><item><title>Jim Harris comments on The Laugh-In Effect of Big Data</title><author>Jim Harris</author><pubDate>Tue, 23 Apr 2013 23:26:57 +0000</pubDate><link>http://www.ocdqblog.com/home/the-laugh-in-effect-of-big-data.html#comments</link><guid isPermaLink="false">327252:3438475:comment/19966825</guid><description><![CDATA[<p>From the <a href="http://lnkd.in/XVz79a" title="http://lnkd.in/XVz79a" rel="nofollow">LinkedIn Group for Big Data / Analytics / FP&amp;A / S&amp;OP / Strategic Planning / Predictive &amp; Business Analytics</a>, <b>Tom Deutsch</b> commented: </p><p>“Hi Jim - going to provide an alternative POV to your assertion. The technologies offer significant new flexibility in handling data processing and availability issues that are recurring / stubborn / hard to address with conventional approaches. 
The issue isn&#39;t whether the examples provided have &quot;applicability to your specific business&quot;, it is the architecture patterns that matter. Similar to misguided approaches to selecting technologies that only focus on counting the number of TBs, being focused on the application examples rather than what allowed the apps to work is a mistake.”</p><p><br/><b>And I responded:</b> </p><p>Thanks for your comment, Tom.</p><p>I appreciate, and agree with, your counterpoint to my assertion. The architecture patterns that allow the applications of data science and big data to work are essential to focus on.</p><p>However, one example of my concern is the use of big data and data science in the last two U.S. presidential elections. (By the way, I highly recommend to anyone interested in this topic <a href="http://www.amazon.com/Victory-Lab-Science-Winning-Campaigns/dp/030795479X" title="amazon.com/Victory-Lab-Science-Winning-Campaigns/dp/030795479X" rel="nofollow"><em>The Victory Lab: The Secret Science of Winning Campaigns</em></a>, an excellent book written by Sasha Issenberg.)</p><p>A U.S. presidential election is a specialized case with limited applicability to businesses, since it is like selling a product that can only be purchased on one day every four years, and not only that, but only has essentially one competing product to consider (i.e., vote for the Democrat or the Republican).</p><p>Furthermore, using data science and big data to target voters is not the same as targeting consumers. Voters are <b><i>slightly</i></b> more open to what are essentially aggressive marketing messages because voting is often seen as a civic duty and a way to add your voice to democracy. 
Consumers are far less open to aggressive marketing messages because no matter how popular a consumer product is (e.g., iPhones and iPads), it is not a civic duty or expression of democracy to purchase one.</p><p>Therefore, the use of data science and big data in elections is important to understand as a voter and a citizen (as well as a data privacy advocate), but it doesn&#39;t really help the average organization understand how to apply big data to their business activities, so from my perspective it is not a compelling business case.</p><p>Best Regards,</p><p>Jim</p><p><br/>And <b>Tom Deutsch</b> responded: </p><p>“Hi Jim - thoughtful discussion, which is great. In the spirit of cooperative disagreement (and it is intended to be cooperative) the election use case above is just a segmentation use case. And that is totally applicable for any business that deals with customers (or suppliers), which covers a lot of ground. The mindset of using all the available data to predict behavior and target those inclined to cooperate (purchase or vote) with you is really what the use case is focused on. It just happens to be expressed in a political use case.”</p><p><br/><b>And I responded:</b> <br/> <br/>Hi Tom, thanks for continuing the discussion. I am always open to disagreement, cooperative or otherwise :-)</p><p>Yes, the election case is a segmentation use case. But not all segmentation (or other types of) use cases are created equal, nor can they all be extrapolated to other applications.</p><p>Predicting behavior is foundational to understanding both voters and consumers, but what works for one doesn&#39;t necessarily work for the other. Perhaps I am quibbling over trifling distinctions, but political behavior (at least in the U.S.) is very polarized (as well as often oversimplified to Democrat vs. Republican or Conservative vs. 
Liberal) whereas consumer behavior is more diversified, which, in a certain sense, makes political behavior more predictable than consumer behavior.</p><p><br/>And <b>Tom Deutsch</b> responded: </p><p>“Hi Jim - I prefer cooperative, especially given the spirit of our discussion here! </p><p>So while understanding your point (and not losing sight of the article being a well-placed reminder not to get carried away in the big data hype) the techniques utilized aren&#39;t limited to sizing up voters. The data may be specific but the techniques are not, and that makes them applicable to a whole range of organizations.” </p><p><br/><b>And I responded:</b> <br/> <br/>Hi Tom, yes cooperative disagreement is always best. In fact, in academic settings, there is a similar concept known as “adversarial collaboration” where two researchers with polarized viewpoints co-write a paper to provide more comprehensive coverage of a topic. </p><p>I definitely agree with your point about similar techniques, but different data. The techniques -- the methodology of true data science -- are applicable, as you said, to a whole range of organizations, as well as applications. 
</p><p>My only real fear is that distinction seems too often to get lost in the hype of big data.</p>]]></description></item><item><title>David Jaques-Watson comments on When Poor Data Quality Kills</title><author>David Jaques-Watson</author><pubDate>Mon, 22 Apr 2013 03:24:19 +0000</pubDate><link>http://www.ocdqblog.com/home/when-poor-data-quality-kills.html#comments</link><guid isPermaLink="false">327252:3438475:comment/19948286</guid><description><![CDATA[<p>I remember an article from a few years ago on the CMM model, and a NASA programmer saying words to the effect of: &quot;Code reviews take on a different emphasis when there&#39;s an astronaut sitting across the table from you, saying &#39;Prove to me that this change won&#39;t <strong><i>kill</i></strong> me when I&#39;m up there in orbit&#39;. It puts a whole new emphasis on quality.&quot;</p><p>It&#39;s always the (literal) life-and-death situations where you want 100% quality - data quality, program quality, procedural quality. This is why initiatives such as a national health record - although a worthy ambition - are looked on with suspicion by medicos. Even in hospital, the good staffers will always ask you your name, date of birth, and why you&#39;re in there before they do anything. They don&#39;t even trust the chart hanging on the end of your bed! 
- not until they&#39;ve cross-checked with you, the patient, to ensure the wrong chart hasn&#39;t been put there.</p>]]></description></item><item><title>Jim Harris comments on Expectation and Data Quality</title><author>Jim Harris</author><pubDate>Sat, 13 Apr 2013 14:51:34 +0000</pubDate><link>http://www.ocdqblog.com/home/expectation-and-data-quality.html#comments</link><guid isPermaLink="false">327252:3438475:comment/19928384</guid><description><![CDATA[<p>Via <a href="http://www.information-management.com/blogs/can-expectations-alter-data-quality-10024101-1.html" title="information-management.com/blogs/can-expectations-alter-data-quality-10024101-1.html" rel="nofollow">Information Management</a>, <b>Alan David Duncan</b> commented: </p><p>“I&#39;m not sure about this. It would be nice to think that we could &quot;spin&quot; things and promote a positive experience (&quot;Hey! Why not come and try our new reports, they taste great!&quot;). But I&#39;ve found that the question isn&#39;t so much about expectations, as about visibility - people don&#39;t know what they can trust (because there&#39;s no lineage, no reconciliation, no traceability etc etc), so they end up with a default position of not trusting anything.</p><p>We had an illustrative case here at the University last week, where one of the Heads of School asked for a report on how many students were due to commence in first year. Two different reports gave two different answers (materially different numbers), but there was no narrative to explain where the numbers had come from, what calculations had been applied, what filters were used to constrain the query, etc.</p><p>It turned out when we did the forensic analysis that there were NINE different underlying issues that were contributing to the difference in reported numbers! 
No visibility = no credibility = no trust = no value.”</p><p><br/><b>And I responded:</b></p><p>Thanks for your comment, Alan.</p><p>I agree with you that trying to create an a priori (i.e., not dependent on experience) positive expectation about data is improbable, but I would say that you provided another excellent example of how a negative expectation about data was created.</p><p>In your case, I assume that the Head of School who requested data, and then received two conflicting reports, probably now has a negative expectation about student data managed by the University.</p><p>So now, you have the even more difficult task of altering an a posteriori (i.e., dependent on experience) negative expectation about data -- in other words, how are you going to convince that Head of School that they should expect a positive experience with data next time?</p><p>Best Regards,</p><p>Jim</p>]]></description></item><item><title>Jim Harris comments on When Poor Data Quality Kills</title><author>Jim Harris</author><pubDate>Sat, 13 Apr 2013 14:38:30 +0000</pubDate><link>http://www.ocdqblog.com/home/when-poor-data-quality-kills.html#comments</link><guid isPermaLink="false">327252:3438475:comment/19928355</guid><description><![CDATA[<p>Thanks for your comments Dylan and Melinda. </p><p><b><i>@Dylan</i></b> — Great point about these types of risks increasing as we move further into a data-centric society. </p><p><b><i>@Melinda</i></b> — Excellent point about the quality of machine-generated data raising different kinds of issues. For sensor data, poor quality could be an indication of a poorly calibrated sensor. 
If that sensor is monitoring oxygen levels on a spacecraft carrying humans, for example, on the journey to Mars, then poor data quality could make the astronauts dead on arrival.</p>]]></description></item><item><title>Melinda Thielbar comments on When Poor Data Quality Kills</title><author>Melinda Thielbar</author><pubDate>Thu, 11 Apr 2013 15:07:11 +0000</pubDate><link>http://www.ocdqblog.com/home/when-poor-data-quality-kills.html#comments</link><guid isPermaLink="false">327252:3438475:comment/19921752</guid><description><![CDATA[<p>I think that you and Dylan are both showing examples where the consequences of a mistake determine the need for high data quality. Wartime, medical files—these are cases where lives depend on making the right choice. </p><p>For fraud detection, we had to worry about putting people who really couldn&#39;t afford a legal battle through a costly investigation—not to mention wasting precious resource. </p><p>For the military and spacecraft applications I&#39;m working on now, data quality is also very important, though because it&#39;s machines generating the data, you have to worry about different kinds of issues.</p>]]></description></item><item><title>Dylan Jones comments on When Poor Data Quality Kills</title><author>Dylan Jones</author><pubDate>Tue, 09 Apr 2013 19:24:36 +0000</pubDate><link>http://www.ocdqblog.com/home/when-poor-data-quality-kills.html#comments</link><guid isPermaLink="false">327252:3438475:comment/19916635</guid><description><![CDATA[<p>Great post Jim, </p><p>I&#39;ve experienced this personally when my son&#39;s medical data was mixed up with another child in a separate part of the country. We started receiving his medical appointments because the patient ID&#39;s had been duplicated. 
Doesn&#39;t take a data quality guru to imagine the implications if physicians are looking at the wrong medical history.</p><p>Another example where accurate data is essential is in wartime: I remember the US bombing of the Chinese embassy due to out-of-date information.</p><p>Of course there are thousands of other instances where safety can be compromised through bad data; as we move further into a data-centric society, the risks increase.</p><p>Great post as ever.</p><p>Dylan Jones</p><p>Community Manager<br/><a href="http://dataqualitypro.com/" rel="nofollow">Data Quality Pro.com</a></p>]]></description></item></channel></rss>