Jim Harris

My name is Jim Harris. I am the Blogger-in-Chief of OCDQ Blog and an independent consultant, speaker, and freelance writer for hire.

Thursday, April 5, 2012

Will Big Data be Blinded by Data Science?

All of the hype about Big Data is also causing quite the hullabaloo about hiring Data Scientists in order to help your organization derive business value from big data analytics.  But even though we are still in the hype and hullabaloo stages, these unrelenting trends are starting to rightfully draw the attention of businesses of all sizes.  After all, the key word in big data isn’t big, because, in our increasingly data-constructed world, big data is no longer just for big companies and high-tech firms.

And since the key word in data scientist isn’t data, in this post I want to focus on the second word in today’s hottest job title.

When I think of a scientist of any kind, I immediately think of the scientific method, which has been the standard operating procedure of scientific discovery since the 17th century.  First, you define a question, gather some initial data, and form a hypothesis, which is some idea about how to answer your question.  Next, you perform an experiment to test the hypothesis, during which more data is collected.  Then, you analyze the experimental data and evaluate your results.  Whether the experiment confirmed or contradicted your hypothesis, you do the same thing: repeat the experiment, because a hypothesis can only be promoted to a theory after repeated experimentation (including by others) consistently produces the same result.

During experimentation, failure happens just as often as, if not more often than, success.  However, both failure and success have long played an important role in scientific discovery because progress in either direction is still progress.

Therefore, experimentation is an essential component of scientific discovery — and data science is certainly no exception.

“Designed experiments,” Melinda Thielbar recently blogged, “is where we’ll make our next big leap for data science.”  I agree, but with the notable exception of A/B testing in marketing, most business activities generally don’t embrace data experimentation.

“The purpose of science,” Tom Redman recently explained, “is to discover fundamental truths about the universe.  But we don’t run our businesses to discover fundamental truths.  We run our businesses to serve a customer, gain marketplace advantage, or make money.”  In other words, the commercial application of science has more to do with commerce than it does with science.

One example of the challenges inherent in the commercial application of science is the misconception that predictive analytics can predict what is going to happen with certainty.  What it actually does is predict some of the possible things that could happen, each with a certain probability.  Although predictive analytics can be a valuable tool for many business activities, especially decision making, as Steve Miller recently blogged, most of us are not good at using probabilities to make decisions.
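To make the difference between a certain prediction and a probabilistic one concrete, here is a minimal sketch in Python of one common way to use probabilities in a decision: comparing the expected value of each option. Everything in it is hypothetical and chosen only for illustration, including the churn scenario, the dollar amounts, the probabilities, and the `expected_value` helper. It is not drawn from any of the posts cited here.

```python
def expected_value(outcomes):
    """Return the sum of probability * payoff over (probability, payoff) pairs."""
    total_p = sum(p for p, _ in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * payoff for p, payoff in outcomes)


# Hypothetical model output: a 30% chance that a customer leaves.
# The model does not say the customer WILL leave, only how likely it is.
p_churn = 0.30
customer_value = 1000.0      # assumed revenue if the customer stays
offer_cost = 50.0            # assumed cost of a retention offer
p_churn_with_offer = 0.10    # assumed churn probability after the offer

# Option 1: do nothing and accept the predicted risk.
do_nothing = expected_value([
    (p_churn, 0.0),
    (1 - p_churn, customer_value),
])

# Option 2: pay for the offer, which is assumed to lower the risk.
make_offer = expected_value([
    (p_churn_with_offer, -offer_cost),
    (1 - p_churn_with_offer, customer_value - offer_cost),
])

best_action = "make_offer" if make_offer > do_nothing else "do_nothing"
print(f"do nothing: {do_nothing:.2f}, make offer: {make_offer:.2f} -> {best_action}")
```

Under these assumed numbers the offer has the higher expected value, yet for any single customer either outcome can still happen. The prediction informs the decision without guaranteeing the result.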

So, with apologies to Thomas Dolby, I can’t help but wonder, will big data be blinded by data science?  Will the business leaders being told to hire data scientists to derive business value from big data analytics be blind to what data science tries to show them?

 

This post was written as part of the IBM for Midsize Business program, which provides midsize businesses with the tools, expertise and solutions they need to become engines of a smarter planet.

 


Reader Comments (4)

Jim,

Your concern is well-founded. Knowing how few businesses make really good use of the small data they’ve had around all along, it’s easy to imagine that they won’t do any better with bigger data sets.

I wrote some hints for those wallowing into the big data mire in my post, Better than Brute Force: Big Data Analytics Tips. But the truth is that many organizations won’t take advantage of the ideas that you are presenting, or my tips, especially as the datasets grow larger. That’s partly because they have no history in scientific methods, and partly because the data science movement is driving employers to search for individuals with heroically large skill sets. Since few, if any, people truly meet these expectations, those hired will have real human limitations, and most often they will be people who know much more about data storage and manipulation than data analysis and applications.

April 5, 2012 | Unregistered Commenter Meta Brown

From the TDWI Business Intelligence and Data Warehousing LinkedIn Group, Jeff Ridings commented:

“The business leaders will need to brush up on their understanding of statistics and must ask the data scientists specific business questions from which they derive their decision making processes and refine those to reach intelligent decisions.

In other words, the business needs to set the strategic direction of the data scientist. Keep them pointed in a direction that makes business sense.”

And Vickie Comrie responded:

“Jeff, when you say business leaders, would you rather say business intelligence analysts? Because the job of liaison between the business leader (C-Level Executives) and the data scientist is of the BI business analyst. Business leaders probably have neither the time, inclination or ability to learn statistics at that point in their career. Just a thought.”

And Rhonda Bradford responded:

“Vickie, I agree. Most business leaders look to their analysts for interpretation of the statistics stuff - whether those analysts are called BI analysts, or whether they happen to be staffers in the finance group or elsewhere.

Business leaders are more interested in how they and their organization are performing, rather than in statistical or quantitative techniques. There may well be a trend to hire more data scientists and these people need to understand how to present their output in terms of performance and achievement of business objectives. I have personally seen an analytics group who couldn't do this and they were effectively sidelined by other business analysts who had both the technical understanding and the ability to present the results in business speak.

Most business leaders don't have the time to re-learn things they left behind many years ago (even if they originally had the skills) - they hire people to do that for them.”

April 12, 2012 | Registered Commenter Jim Harris

Great article, Jim. The comparison between scientific inquiry and business decision making is a very interesting and important one. This quote from Tom Redman is noteworthy:

“The purpose of science is to discover fundamental truths about the universe. But we don’t run our businesses to discover fundamental truths. We run our businesses to serve a customer, gain marketplace advantage, or make money.”

Successfully serving a customer and boosting competitiveness and revenue does require some (hopefully unique) insights into customer needs. Where do those insights come from? Additionally, scientists also never stop questioning and improving upon “fundamental truths” which I also interpret as not accepting conventional wisdom - obviously an important trait of business managers.

I recently read commentary that gave high praise to the manager utilizing the scientific method in his or her decision-making process. The author was not a technologist, but rather none other than Peter Drucker, in writings from decades ago.

I blogged about Peter Drucker's commentary, data science, the scientific method vs. business decision making, and I'd value your and others' input: Business Managers Can Learn a Lot from Data Scientists.

Barry Devlin also kindly shared some thought-provoking comments regarding concerns over repurposing (big) data in ways that may not, on closer review, be applicable to that data.

April 12, 2012 | Unregistered Commenter Mike Urbonas

I have the same mindset as Jim. First of all, I like the Big Data and Data Scientist philosophy, as a probabilistic approach is always needed to identify, discover, or arrive at a model when there is a lot of uncertainty.

But one very important thing we all know is the fundamental definition of science: 'Man applied to nature'. A person discovers or concludes a certain theory based on a certain natural process, which is not probabilistic. To put it more simply, think about gravitational force: it's not market-based but nature-based, and from it we derive the magnitude of potential or kinetic energy.

Business data, however, is not natural data generated by nature. Its ecosystem is the business that produced it and the prevailing market conditions. Unlike the consistent gravitational force in nature, it is highly variable, which means we always need to dig into the data and find out what went wrong this time compared to the same business attribute in the previous period. We then do the analysis so that predictive analytics on the past and present data for the same context can find the answer, or tell us which rule we should obey as we move forward in the business.

But it does not stop there. Even after we consolidate our business rule for future application, the market conditions or the existing conditions may change. No core natural process is constant here. Again we need to dig into the data and draw conclusions for the future of the business. So ultimately we need to apply the process every year, or whenever time-sensitive data arrives, and keep on analyzing. This process will be asymptotic.

So in business we have to keep doing big data or data analysis for every period of interest. Even if we forecast, the forecast process has to be repeated every time. This is not typical science, which describes or establishes a property of an entity; rather, the scientific process can be applied in order to arrive at a business decision for that period. We cannot assign a consistent property to the business entities or metrics of interest that we analyze.

April 17, 2012 | Unregistered Commenter Debashis Ghosh
