Classification is a classic
data mining task, with roots in
machine learning. A typical application is : "Given past records
of customers who switched to another supplier, predict which
current customers are likely to do the same."
This specific application is known as Churn Prediction, but
there are very many other applications such as predicting response to
a direct marketing campaign, seperating good products from faulty ones etc.
The "Classification Problem" involves data which is divided into
two or more groups, or classes. In our example above, the
two classes are "switched supplier" and "didn't switch".
The data mining software is asked to
tell us which of the groups a new example falls into.
So, we might train the software using customer records from the last year,
divided into our two groups. We then ask the software to predict
which of our customers we're likely to lose.
Of course, to ensure we can trust the predictions, there is generally
a testing or validation stage as well.
See also:
Data Mining Software