The recent analysis by Andy Bitterer of Gartner Research (and ANALYSTerical) about the acquisition of open source data quality tool DataCleaner by the enterprise data quality vendor Human Inference, prompted the following Twitter conversation:
Since enterprise data quality tools can be cost-prohibitive, more prospective customers are exploring free and/or open source alternatives, such as the Talend Open Profiler, licensed under the open source General Public License, or non-open source, but entirely free alternatives, such as the Ataccama DQ Analyzer. And, as Andy noted in his analysis, both of these tools offer an easy transition to the vendors’ full-fledged commercial data quality tools, offering more than just data profiling functionality.
As Henrik Liliendahl Sørensen explained, in his blog post Data Quality Tools Revealed, data profiling is the technically easiest part of data quality, which explains the tool diversity, and early adoption of free and/or open source alternatives.
And there are also other non-open source alternatives that are more affordable than enterprise data quality tools, such as Datamartist, which combines data profiling and data migration capabilities into an easy-to-use desktop application.
My point is neither to discourage the purchase of enterprise data quality tools, nor promote their alternatives—and this blog post is certainly not an endorsement—paid or otherwise—of the alternative data quality tools I have mentioned simply as examples.
My point is that many new technology innovations originate from small entrepreneurial ventures, which tend to be specialists with a narrow focus that can provide a great source of rapid innovation. This is in contrast to the data management industry trend of innovation via acquisition and consolidation, embedding data quality technology within data management platforms, which also provide data integration and master data management (MDM) functionality as well, allowing the mega-vendors to offer end-to-end solutions and the convenience of one-vendor information technology shopping.
However, most software licenses for these enterprise data management platforms start in the six figures. On top of the licensing, you have to add the annual maintenance fees, which are usually in the five figures. Add to the total cost of the solution, the professional services that are needed for training and consulting for installation, configuration, application development, testing, and production implementation—and you have another six figure annual investment.
Debates about free and/or open source software usually focus on the robustness of functionality and the intellectual property of source code. However, from my perspective, I think that the real reason more prospective customers are exploring these alternatives to enterprise data quality tools is because of the free aspect—but not because of the open source aspect.
In other words—and once again I am only using it as an example—I might download Talend Open Profiler because I wanted data profiling functionality at an affordable price—but not because I wanted the opportunity to customize its source code.
I believe the “try it before you buy it” aspect of free and/or open source software is what’s important to prospective customers.
Therefore, enterprise data quality vendors, instead of acquiring an open source tool as Human Inference did with DataCleaner, how about offering a free (with limited functionality) or trial version of your enterprise data quality tool as an alternative option?