Data Quality
What is Data Quality?
The field of data quality has long been overlooked in the IT business. Organizations are beginning to feel the pain of inconsistent and flawed systems, and interest is growing in setting more ambitious goals for the quality of our data. To illustrate the concept, let's imagine an information chain centered around the process of building a data warehouse. You can apply data quality principles in many other scenarios, but the data warehouse is the archetypal case, since it seeks to create a single version of the truth, and obviously this truth has to be of high quality!
Consider the information chain illustrated below.
![DataCleaner information chain](http://mowser.com/img?url=http%3A%2F%2Feobjects.org%2Ftrac%2Fraw-attachment%2Fwiki%2FDataQuality%2FDataCleanerInformationChain.png)
Data quality problems can exist at many levels. Let's take a couple of examples:
How does this relate to DataCleaner?
The DataCleaner project aims to work seriously and ambitiously toward a framework for data quality. In the example above, the goal of DataCleaner is to help the data warehouse professional better understand the source systems he is working with, and to apply that understanding to both the input and the output of his process. This way we ensure the quality of our data. Some might even say that we test the data warehouse, just as we test our products for flaws and our software for bugs.
So in short...
DataCleaner is a data quality component, application and monitor for profiling, validating and comparing data.
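
To make the three activities concrete, here is a minimal sketch in plain Python. It does not use DataCleaner or its API. The function names (`profile_column`, `validate`, `compare_keys`), the sample rows and the country-code rule are all hypothetical and chosen only to illustrate what profiling, validating and comparing data might look like on a tiny source table and its data warehouse copy.

```python
from collections import Counter


def profile_column(rows, column):
    """Profile one column: row count, empty values, distinct values, most common values."""
    values = [row.get(column) for row in rows]
    non_empty = [v for v in values if v not in (None, "")]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_empty),
        "distinct": len(set(non_empty)),
        "top": Counter(non_empty).most_common(3),
    }


def validate(rows, rule):
    """Validate: return the rows that break the given rule."""
    return [row for row in rows if not rule(row)]


def compare_keys(source_rows, target_rows, key):
    """Compare: return key values found in the source but missing from the target."""
    target_keys = {row[key] for row in target_rows}
    return sorted({row[key] for row in source_rows} - target_keys)


# Hypothetical source system rows and the rows that reached the warehouse.
source = [
    {"id": 1, "country": "DK"},
    {"id": 2, "country": "dk"},
    {"id": 3, "country": ""},
    {"id": 4, "country": "DK"},
]
warehouse = [
    {"id": 1, "country": "DK"},
    {"id": 4, "country": "DK"},
]

# Assumed rule: country must be one of a fixed set of upper-case codes.
allowed = {"DK", "SE", "NO"}

profile = profile_column(source, "country")
invalid = validate(source, lambda row: row["country"] in allowed)
missing = compare_keys(source, warehouse, "id")

print(profile)
print(invalid)
print(missing)
```

In this sketch the profile reveals the inconsistent casing and the empty value, the validation step isolates the offending rows, and the comparison shows which source records never arrived in the warehouse.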


