Data Quality Reboot Series For Big Data: Part 3: Risky Data, Risky Business?

Michele Goetz, VP, Principal Analyst

Sep 19 2012

When you last pulled up a chair to this blog, we talked about data quality persistence and disposability for big data. The other side of the coin: should you do big data quality at all?

So, this blog is dedicated to stepping outside the comfort zone once again and into the world of chaos. Not only may you not want to persist your data quality transformations, but you may not want to cleanse the data at all.

Current thinking: Purge poor data from your environment. Put the word “risk” in the same sentence as data quality and watch the hackles go up on data quality professionals. It is like using salt in your coffee instead of sugar. However, the biggest challenge I see many data quality professionals face is getting lost in all the data because they need to remove the risk to the business caused by bad data. In the world of big data, you clearly are not going to be able to cleanse all that data. A best practice is to identify the critical data elements that have the most impact on the business and focus efforts there. Problem solved.

Not so fast. Even scoping the data quality effort may not be the right way to go. The time and effort it takes, as well as the accessibility of the data, may not fit the business's need to get information quickly. The business has decided to take the risk, focusing on direction rather than precision.

Reboot: Don’t worry about bad data. Precision is not always the end game, and the business is balancing risk with reward. Understand the decision process. Decisions are based as much on experience and anecdotal evidence as on what the data shows. This trifecta is a balance, and data may be a catalyst or validator, not the only guide. To determine whether data cleansing is required, consider the time available, how far analytic results deviate from the perceived or accepted hypothesis, and the risk within the context of data use. It may be that data quality really doesn’t matter and the data is good enough.

However, don’t throw away your data quality best practices yet. The data quality measures and indexes created for data governance give you guideposts for building a trust continuum for data, which helps determine when, and when not, to put data quality rules and efforts in place. Continuously profile data sources and the quality of the data feeding analysis, not just to correct it but to signal when action is necessary.
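As a rough illustration of profiling that informs rather than corrects, the sketch below scores one source on a single measure (completeness of required fields) and places it in a band on a trust continuum. Everything in it is an assumption made for the example: the function name `profile_source`, the band labels, and the 0.95 and 0.80 thresholds are hypothetical, and a real governance program would use its own measures and cutoffs.

```python
from dataclasses import dataclass


@dataclass
class ProfileResult:
    source: str
    completeness: float  # share of required field values that are populated
    trust_band: str      # hypothetical position on the trust continuum


def profile_source(source, records, required_fields, thresholds=(0.95, 0.80)):
    """Score a source on completeness and map the score to a trust band.

    The thresholds are illustrative assumptions, not recommended values.
    """
    total = len(records) * len(required_fields)
    # A value counts as populated unless it is missing, None, or an empty string.
    filled = sum(
        1
        for record in records
        for field in required_fields
        if record.get(field) not in (None, "")
    )
    completeness = filled / total if total else 0.0

    high, low = thresholds
    if completeness >= high:
        band = "use as-is"
    elif completeness >= low:
        band = "use with caution"
    else:
        band = "cleanse before use"
    return ProfileResult(source, completeness, band)


if __name__ == "__main__":
    sample = [
        {"id": 1, "email": "a@example.com"},
        {"id": 2, "email": ""},
        {"id": 3, "email": None},
        {"id": 4, "email": "d@example.com"},
    ]
    result = profile_source("web_signups", sample, ["id", "email"])
    print(result.source, round(result.completeness, 2), result.trust_band)
```

The profile does not change the data. It only reports where the source sits, so the business can decide whether the risk justifies a cleansing effort.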

Interested in more about the trust continuum? Read Alan Weintraub’s recent report on information governance.
