The role of data cleanliness in commercialization

With the commercialization of data quickly becoming a viable source of revenue for organisations, data quality has become a key consideration for savvy CDOs (Chief Data Officers). We know that data quality is essential for internal utilisation of data. According to IDG’s 2015 Big Data and Analytics survey, 39% of organisations believe data quality is necessary for gaining value from data, so it’s probably not a stretch to hypothesise that it’s also important for other organisations looking to purchase data and put it to use. To further explore the link between data quality and the commercial viability of data, I’d like to call on the metaphor of selling a car – would a dirty car fetch a lower price? Would it sell at all? And finally, if you want to clean a car, should you do it yourself or call in a professional?

No one wants to clean their new car

It’s not hard to imagine that a dirty car would sell for less than a clean car. If the person buying the car wants to use it straight away, they would probably hate to have to clean it first. The same goes for data: data analysts spend a huge amount of time cleaning up data to make it analytics-ready. In fact, data scientists have reported spending hours every day simply cleaning up and transforming data before they can do their analysis. So it would be reasonable to assume that data analysts would pay more for data that lets them get started on analysis sooner rather than later.

Dust can cover up scratches

Another key consideration when buying a car is understanding existing damage and roadworthiness. If the bonnet is covered in bird poop, how does the buyer know that there aren’t scratches underneath? If the underside of the car is covered in mud, how would the buyer spot rust? It wouldn’t be unreasonable for a customer to walk away from a dirty car simply because they couldn’t make an informed decision about its true condition. Similarly, a huge challenge for data scientists is understanding whether data is appropriate for the problem they’re trying to solve. When an analyst reviews sample data, they need to be able to make a purchasing decision confidently. Data too dirty to understand may be rejected outright.

Just get the professionals to do it!

There’s another reason not to sell a dirty car – car washes are everywhere! And since a clean car is likely to sell for much more than a dirty one, it’s not uncommon for people to have a car washed and detailed before a sale. Similarly, there are a multitude of services available to improve your data quality. Services range from Information Management consultancies, who tend to provide broader, strategically focused methods to increase organisational data quality (which may benefit both internal and external data users); to automated services from a variety of vendors; and finally to data analysts, who can provide a finer-grained clean on a dataset-by-dataset basis (although making this type of cleansing repeatable can be more of a challenge).

A final detour

Unlike a car, which is sold only once, a valuable, up-to-date and clean dataset can earn an organisation recurring revenue. You don’t just sell a dataset once, so each and every one of your customers stands to benefit from a sustained data quality programme.