Data Preprocessing

Data Preprocessing - definition(s)

data preprocessing - Data preprocessing is any processing performed on raw data to prepare it for a later processing step. Commonly used as a preliminary step in data mining, it transforms the data into a format that can be processed more easily and effectively for the user's purpose, for example as input to a neural network. Preprocessing tools and methods include: sampling, which selects a representative subset from a large population of data; transformation, which manipulates raw data to produce a single input; denoising, which removes noise from data; normalization, which rescales or reorganizes data into a consistent form so that it can be compared and accessed more efficiently; and feature extraction, which pulls out specified data that is significant in some particular context.
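As a rough illustration of three of the methods named above, the following Python sketch uses only the standard library. The function names, the fixed random seed, the window size, and the reading of normalization as min-max rescaling to the range 0 to 1 are assumptions made for this example; the glossary does not prescribe any particular technique.

```python
import random


def sample(records, k, seed=0):
    """Sampling: draw k records at random, without replacement.

    The fixed seed is an assumption that keeps the sketch repeatable.
    """
    rng = random.Random(seed)
    return rng.sample(records, k)


def min_max_normalize(values):
    """Normalization (assumed here to mean min-max rescaling into [0, 1])."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def moving_average(values, window=3):
    """Denoising: replace each point with the mean of a centered window.

    Windows are truncated at the ends of the list.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

For instance, min_max_normalize([2, 4, 6]) returns [0.0, 0.5, 1.0], and moving_average([1, 1, 10, 1, 1]) spreads the single spike across its neighbors, returning [1.0, 4.0, 4.0, 4.0, 1.0].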

In a customer relationship management (CRM) context, data preprocessing is a component of Web mining. Web usage logs may be preprocessed to extract meaningful sets of data called user transactions, which consist of groups of URL references. User sessions may be tracked to identify the user, the Web sites requested and their order, and the length of time spent on each one. Once these details have been pulled out of the raw data, they yield more useful information for purposes such as consumer research, marketing, or personalization.
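The session tracking described above can be sketched in Python as follows. The log layout (user ID, timestamp in seconds, URL), the 30-minute inactivity timeout, and the function names are all assumptions chosen for illustration; real Web usage logs vary, and identifying users reliably is usually harder than this.

```python
from collections import defaultdict

# Assumed rule: a gap of more than 30 minutes starts a new session.
SESSION_TIMEOUT = 30 * 60  # seconds


def sessionize(entries, timeout=SESSION_TIMEOUT):
    """Group (user, timestamp, url) log entries into per-user sessions.

    Returns a list of (user, hits) pairs, where hits is a list of
    (timestamp, url) tuples in the order they were requested.
    """
    by_user = defaultdict(list)
    for user, ts, url in entries:
        by_user[user].append((ts, url))

    sessions = []
    for user, hits in by_user.items():
        hits.sort()  # order each user's requests by timestamp
        current = [hits[0]]
        for prev, hit in zip(hits, hits[1:]):
            if hit[0] - prev[0] > timeout:
                # Long inactivity: close this session and start another.
                sessions.append((user, current))
                current = []
            current.append(hit)
        sessions.append((user, current))
    return sessions


def time_on_pages(hits):
    """Estimate seconds spent on each page as the gap to the next request.

    The last page of a session has no following request, so it is omitted.
    """
    return [(url, next_ts - ts)
            for (ts, url), (next_ts, _) in zip(hits, hits[1:])]


# Hypothetical log entries, invented for this example.
log = [
    ("a", 0, "/home"),
    ("a", 60, "/products"),
    ("a", 5000, "/home"),
    ("b", 10, "/about"),
]
sessions = sessionize(log)
```

With these entries, user "a" yields two sessions, because the third request arrives more than 30 minutes after the second, and time_on_pages on the first session gives [("/home", 60)].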

Related glossary terms: Universal Data Access (UDA), hybrid online analytical processing (HOLAP or Hybrid OLAP), geographic information system (GIS), DSTP (Data Space Transfer Protocol), multidimensional online analytical processing (MOLAP), FileMaker (FMP), meta, data mining, pivot table, data warehouse / database warehouse

[Category=Data Management]

Source:, 22 July 2013 09:19:53, External

Data Quality Glossary.  A free resource from GRC Data Intelligence. For comments, questions or feedback: