Over the last two decades, businesses have been creating and consuming data in exponential quantities due to advances in technology. The data management industry has emerged from this data explosion and given birth to a wide variety of jobs, ranging from the Data Scientist to the Chief Data Officer. Data, however, is generally of little practical use to the business unless it is of a high enough quality.
What is Data Quality and Why Does it Matter?
Data quality is about ensuring that data is fit for purpose: accurate, timely, and complete enough to be used in the way it is intended. Good data quality means that the information about your customers is as complete and as accurate as it can be; in short, good data is your most valuable asset. Conversely, poor data quality can undermine the success of projects and increase both their cost and duration. According to a Gartner survey, poor data quality costs organizations an average of $14.2 million annually. Bad data can also seriously damage your credibility and erode customer confidence.
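The completeness dimension mentioned above can be made concrete as a simple metric: the share of required fields that are actually populated across a record set. The sketch below is purely illustrative; the field names and records are hypothetical, not drawn from any particular system.

```python
# Illustrative completeness metric: the fraction of required fields
# that are populated across all records. Field names are hypothetical.
REQUIRED = ("name", "email", "phone")

def completeness(records):
    """Return the ratio of filled required fields to total required fields."""
    filled = sum(1 for r in records for f in REQUIRED if r.get(f))
    return filled / (len(records) * len(REQUIRED))

rows = [
    {"name": "Ada", "email": "ada@example.com", "phone": "555-0100"},
    {"name": "Grace", "email": "", "phone": None},
]
print(round(completeness(rows), 2))  # 4 of 6 fields filled -> 0.67
```

A real implementation would typically track this score per field and over time, so that a drop in completeness is caught as soon as it happens.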
Emerging Data Quality Trends
Given the data quality problems inherent in most organizations, the following are some key trends in how companies are effectively addressing them:
More Roles Participate in Improving Data Quality
Data quality initiatives have traditionally been driven by IT. Recently, however, the business has taken a more active role in managing the goals, rules, processes, and metrics associated with improving data quality. As business functions recognize the importance of data quality, they have begun to establish roles such as Data Stewards, Chief Data Officers, and Information Governance Teams. To deliver on these stewardship-oriented activities, organizations have identified the need for structured processes to detect, track, and correct data quality issues. Similarly, as the balance shifts toward data quality roles within the business, vendors are beginning to offer solutions with self-service capabilities. These solutions are tailored to business workers, who must be able to understand and manage data quality capabilities in order to have an impact.
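The detect-and-track process described above can be sketched as a small rule-based audit: each rule checks one quality dimension, and failure counts are collected so issues can be monitored over time. The schema, rules, and email pattern below are illustrative assumptions, not the method of any specific vendor or product.

```python
# Minimal sketch of a rule-based data quality audit for customer records.
# Rules, fields, and the email regex are illustrative assumptions only.
import re

RULES = {
    # Completeness: every required field must be non-empty.
    "completeness": lambda rec: all(rec.get(f) for f in ("id", "name", "email")),
    # Validity: the email field must look like an address.
    "validity": lambda rec: bool(
        re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", rec.get("email", ""))
    ),
}

def audit(records):
    """Return per-rule failure counts so issues can be tracked over time."""
    failures = {rule: 0 for rule in RULES}
    for rec in records:
        for rule, check in RULES.items():
            if not check(rec):
                failures[rule] += 1
    return failures

customers = [
    {"id": 1, "name": "Ada", "email": "ada@example.com"},
    {"id": 2, "name": "", "email": "not-an-email"},
]
print(audit(customers))  # {'completeness': 1, 'validity': 1}
```

In practice, a data steward would route each failure into a tracking workflow (a ticket or a review queue) rather than just counting it, but the same detect-then-track structure applies.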
Big Data and Information Trust
Information trust issues risk serious damage to an organization’s reputation, and with the proliferation of big data projects, data quality and information trust challenges are squarely in the public eye. According to Mark Smith of Ventana Research, most of the time spent on big data projects goes to data quality and data preparation. An alternative to the heavily promoted data lake is gaining ground: the “data reservoir” approach. According to Gartner analysts Merv Adrian and Nick Heudecker, a data lake gathers data in a big data environment without further preparation and cleansing, while a data reservoir focuses on making that data consumption-ready for a wider audience, not just a limited number of highly skilled data scientists. Under this vision, data quality becomes a building block of big data initiatives rather than a separate discipline.