Martin Ester, Hans-Peter Kriegel, Jörg Sander, and Xiaowei Xu. 1996. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the 2nd ACM International Conference on Knowledge Discovery and Data Mining (KDD). 226–231. A density-based algorithm for discovering clusters in large spatial databases with noise | Proceedings of the Second International Conference on Knowledge Discovery and Data Mining PDF This is the original paper on the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm presented at the 1996 KDD conference. The main points are:.
- features and benefits of DBSCAN: 1.
- Density-based clustering method allows discovery of clusters of arbitrary shape
- Noise (outliers) can be detected naturally
- Fewer input parameters (only two: Eps, MinPts) and relatively easy parameter selection
- Efficient for large spatial databases
- Algorithm Mechanism: 1.
- [Introduced the concepts of density-reachability and density-connectivity.
- Define three types of points: core points, border points, and noise points
- Efficiently search for neighboring points using R*-tree and other spatial indexes
- [Introduced the concepts of density-reachability and density-connectivity.
- performance evaluation
- Comparison with CLARANS, the leading clustering method at the time
- Excellent accuracy in finding clusters of arbitrary shape
- Execution time is more than 100 times faster than CLARANS
- Parameter setting: 1.
- MinPts is usually set to 4 (for 2D data)
- Eps are determined interactively using a graph called k-dist plot
This paper was an important foundational paper for DBSCAN and had a major impact on the field of density-based clustering; it received the SIGKDD Test of Time Award in 2014.
This page is auto-translated from /nishio/DBSCAN using DeepL. If you looks something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I’m very happy to spread my thought to non-Japanese readers.