Data cleansing, data cleaning, or data scrubbing is the process of detecting and correcting (or removing) corrupt or inaccurate records from a record set, table, or database. Used mainly in databases, the term refers to identifying incomplete, incorrect, inaccurate, or irrelevant parts of the data and then replacing, modifying, or deleting this dirty or coarse data. Data cleansing may be performed interactively with data wrangling tools, or as batch processing through scripting.
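
As an illustration of the batch, scripted approach, the following is a minimal Python sketch that uses only the standard library. The record layout (`name`, `email`, `age`), the required fields, the plausible age range, and the function names `clean_record` and `clean_records` are all assumptions made for this example; they are not part of any standard tool. The sketch shows the three actions described above: modifying values (trimming whitespace, normalising case), replacing an implausible value with an explicit missing marker, and deleting incomplete or duplicate records.

```python
from typing import Optional

# Hypothetical schema: fields every usable record is assumed to need.
REQUIRED_FIELDS = ("name", "email")


def clean_record(record: dict) -> Optional[dict]:
    """Return a corrected copy of the record, or None if it should be removed."""
    cleaned = {}
    for key, value in record.items():
        # Modify: trim stray whitespace from string fields.
        cleaned[key] = value.strip() if isinstance(value, str) else value

    # Delete: incomplete records that lack a required field.
    for field in REQUIRED_FIELDS:
        if not cleaned.get(field):
            return None

    # Modify: normalise email case so duplicates can be detected.
    cleaned["email"] = cleaned["email"].lower()

    # Replace: an implausible age becomes an explicit missing value.
    # The 0-130 range is an assumption chosen for this example.
    age = cleaned.get("age")
    if not isinstance(age, int) or not 0 <= age <= 130:
        cleaned["age"] = None

    return cleaned


def clean_records(records: list) -> list:
    """Clean a batch of records, dropping unusable and duplicate entries."""
    seen_emails = set()
    result = []
    for record in records:
        cleaned = clean_record(record)
        if cleaned is None:
            continue  # incomplete record removed
        if cleaned["email"] in seen_emails:
            continue  # duplicate record removed
        seen_emails.add(cleaned["email"])
        result.append(cleaned)
    return result


raw = [
    {"name": " Ada ", "email": "ADA@example.com", "age": 36},
    {"name": "Ada", "email": "ada@example.com ", "age": 36},   # duplicate
    {"name": "", "email": "nobody@example.com", "age": 20},    # incomplete
    {"name": "Bob", "email": "bob@example.com", "age": -5},    # bad age
]
cleaned = clean_records(raw)
```

With this sample input, `cleaned` holds two records: the first Ada entry with trimmed and lower-cased fields, and the Bob entry with its age replaced by `None`. Whether to replace a bad value or drop the whole record is a policy decision that depends on the dataset; the choices here are only one possibility.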