Algorithms in Data Mining
In an effort to identify some of the most influential algorithms that
have been widely used in the data mining community, the IEEE International
Conference on Data Mining (ICDM) identified the top 10 algorithms
in data mining for presentation at ICDM '06 in Hong Kong.
- [April 22, 2009:]
A companion book,
The Top Ten Algorithms in Data Mining, was published in April 2009.
- [December 24, 2007:]
A companion article in PDF
for this top-10 algorithm initiative:
Xindong Wu, Vipin Kumar, J. Ross Quinlan, Joydeep Ghosh, Qiang
Yang, Hiroshi Motoda, Geoffrey J. McLachlan, Angus Ng, Bing Liu,
Philip S. Yu, Zhi-Hua Zhou, Michael Steinbach, David J. Hand and Dan
Steinberg, Top 10 Algorithms in Data Mining, Knowledge and
Information Systems, 14(1): 1-37, 2008.
As the first step in the identification process, in September 2006 we
invited the ACM KDD Innovation Award and IEEE ICDM Research
Contributions Award winners to each nominate up to 10 best-known
algorithms in data mining. All except one in this distinguished set of
award winners responded to our invitation. We asked that each nomination
include the following information: (a) the algorithm name, (b) a
brief justification, and (c) a representative publication reference.
We also advised that each nominated algorithm should have been widely
cited and used by other researchers in the field, and the nominations
from each nominator as a group should have a reasonable representation
of the different areas in data mining.
After the nominations in Step 1, we verified each nomination for its
citations on Google Scholar in late October 2006, and removed those
nominations that did not have at least 50 citations. All remaining
(18) nominations are given on the candidate list below, organized in
10 topics. Please note that for some of these algorithms, such as
K-means, the reference given is not the original paper that
introduced the algorithm, but a recent paper that highlights the
importance of the technique.
In the third step of the identification process, we had a wider
involvement of the research community. We invited the Program
Committee members of KDD-06, ICDM '06, and SDM '06 as well as the ACM
KDD Innovation Award and IEEE ICDM Research Contributions Award
winners to each vote for up to 10 well-known algorithms from the
candidate list. The voting results of this step were presented
at ICDM '06 and are given in the slides below.
We hope the identification of the top 10 algorithms can promote data
mining to wider real-world applications and inspire more researchers
in data mining to further explore these 10 algorithms, including their
impact and new research issues.
Xindong Wu and Vipin Kumar
December 25, 2006