Noisy Data

Noisy Data - definition(s)

noisy data - Noisy data is data that carries no usable meaning. The term has often been used as a synonym for corrupt data, but its meaning has expanded to include any data that machines cannot understand and interpret correctly, such as unstructured text. Any data that has been received, stored, or changed in such a way that the program that originally created it can no longer read or use it can be described as noisy.

Noisy data unnecessarily increases the amount of storage space required and can also adversely affect the results of any data mining analysis. Statistical analysis can use information gleaned from historical data to weed out noisy data and facilitate data mining.
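
As a minimal sketch of that idea, assuming the noise appears as numeric readings far outside the historical range, the following Python example flags values that lie more than k standard deviations from the historical mean. The function name filter_noisy, the threshold k, and the sample readings are hypothetical and chosen only for illustration; real data mining pipelines may rely on other statistics.

```python
from statistics import mean, stdev

def filter_noisy(history, incoming, k=3.0):
    """Split incoming values into (clean, noisy) using historical statistics."""
    mu = mean(history)
    sigma = stdev(history)
    clean, noisy = [], []
    for value in incoming:
        # Treat a value as noisy when it lies more than k standard
        # deviations from the mean of the historical data.
        if sigma > 0 and abs(value - mu) > k * sigma:
            noisy.append(value)
        else:
            clean.append(value)
    return clean, noisy

# Hypothetical sensor readings: the history is stable around 20.
history = [20.1, 19.8, 20.4, 20.0, 19.9, 20.2]
clean, noisy = filter_noisy(history, [20.3, 950.0, 19.7, -40.0])
print(clean)
print(noisy)
```

Only the values kept in clean would be passed on to the analysis; the values in noisy could be discarded or set aside for review, depending on the application.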

Noisy data can be caused by hardware failures, programming errors and gibberish input from speech or optical character recognition (OCR) programs. Spelling errors, industry abbreviations and slang can also impede machine reading.
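
One rough way to estimate how much of a text is unreadable to a machine is to expand known abbreviations and then measure the share of tokens that fall outside a known vocabulary. The sketch below assumes a tiny hand-made vocabulary and abbreviation table; KNOWN_WORDS, ABBREVIATIONS, normalize, and noise_ratio are hypothetical names used only for illustration, not part of any particular tool.

```python
import re

# Hypothetical vocabulary and abbreviation table, for illustration only.
KNOWN_WORDS = {"the", "patient", "was", "given", "medication",
               "twice", "daily", "as", "needed"}
ABBREVIATIONS = {"pt": "patient", "bid": "twice daily",
                 "prn": "as needed", "med": "medication"}

def normalize(text):
    """Lowercase the text, split it into tokens, and expand abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    expanded = []
    for token in tokens:
        # An abbreviation may expand to more than one word.
        expanded.extend(ABBREVIATIONS.get(token, token).split())
    return expanded

def noise_ratio(text):
    """Return the fraction of tokens that are not in the known vocabulary."""
    tokens = normalize(text)
    if not tokens:
        return 0.0
    unknown = [t for t in tokens if t not in KNOWN_WORDS]
    return len(unknown) / len(tokens)

print(noise_ratio("The pt was given med bid prn"))
print(noise_ratio("Teh pt wsa gvien xq7z"))
```

A text with a high ratio, such as the misspelled second example, is a candidate for correction or exclusion before machine reading.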

See also: predictive analytics, business analytics, GIGO, machine-to-machine

Related glossary terms: law of large numbers, correlation, big data analytics, data-driven decision management (DDDM), data science, in-memory analytics, association rules (in data mining), ad hoc analysis, business analytics, unstructured data

[Category=Data Management]

Source:, 27 August 2013 09:42:02, External

Data Quality Glossary. A free resource from GRC Data Intelligence.