This overview provides a description of some of the most common data mining algorithms in use today.

We have broken the discussion into two sections, each with a specific theme:

1-Classical Techniques: Statistics, Neighborhoods and Clustering

2-Next Generation Techniques: Trees, Networks and Rules

Overall, six broad classes of data mining algorithms are covered. Although there are a number of other algorithms and many variations of the techniques described, one of the algorithms from this group of six is almost always used in real-world deployments of data mining systems.

These techniques can be used for either discovering new information within large databases or for building predictive models.

One reason for the interest in these techniques is that data mining techniques such as CART, neural networks and nearest neighbor tend to be more robust, both to messy real-world data and to use by less expert users.

The other reason is that the time is right. Because computers are now used for closed-loop business data storage and generation, large quantities of data are available to users. If there were no data, there would be no interest in mining it. Likewise, computer hardware has improved by several orders of magnitude in its capacity to store and process data, which makes some of the most powerful data mining techniques feasible today.

This section describes the classical techniques, which have been in use for decades.