Research Article
PENS: an algorithm for density-based clustering in peer-to-peer systems
@INPROCEEDINGS{10.1145/1146847.1146886,
  author={Mei Li and Wang-Chien Lee and Anand Sivasubramaniam and Guanling Lee},
  title={PENS: an algorithm for density-based clustering in peer-to-peer systems},
  proceedings={1st International ICST Conference on Scalable Information Systems},
  publisher={ACM},
  proceedings_a={INFOSCALE},
  year={2006},
  month={6},
  keywords={},
  doi={10.1145/1146847.1146886}
}
Mei Li
Wang-Chien Lee
Anand Sivasubramaniam
Guanling Lee
Year: 2006
INFOSCALE
ACM
DOI: 10.1145/1146847.1146886
Abstract
Huge amounts of data are available in large-scale networks of autonomous data sources dispersed over a wide area. Data mining is an essential technology for extracting hidden and valuable knowledge from these networked data sources. In this paper, we investigate clustering, one of the most important data mining tasks, in one such networked computing environment: peer-to-peer (P2P) systems. The lack of central control and the sheer size of P2P systems make existing clustering techniques inapplicable. We propose a fully distributed clustering algorithm, called Peer dENsity-based cluStering (PENS), which overcomes the challenge that arises when clustering in peer-to-peer environments, namely cluster assembly. The main idea of PENS is hierarchical cluster assembly, which enables peers to collaborate in forming a global clustering model without requiring central control or message flooding. Our complexity analysis shows that PENS can discover clusters and noise efficiently in P2P systems.
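
As a rough illustration of hierarchical cluster assembly, the Python sketch below lets each peer cluster its own points with a plain DBSCAN-style pass and then combines the per-peer results pairwise, level by level, so that no single node has to collect everything in one step. This is not the PENS algorithm as published: the abstract does not give its details, so the function names (`local_dbscan`, `merge`, `assemble`), the `eps` and `min_pts` parameters, and the merge rule (two clusters are joined when any two of their points lie within `eps`) are all assumptions made for illustration. The sketch also ships whole clusters between peers and ignores local noise points that might join a cluster once data from other peers is seen.

```python
from itertools import combinations
from math import dist


def local_dbscan(points, eps, min_pts):
    """DBSCAN-style clustering of one peer's points; returns (clusters, noise)."""
    points = list(points)
    neighbors = {p: [q for q in points if dist(p, q) <= eps] for p in points}
    core = {p for p in points if len(neighbors[p]) >= min_pts}
    clusters, seen = [], set()
    for p in points:
        if p in seen or p not in core:
            continue
        cluster, frontier = set(), [p]
        while frontier:
            q = frontier.pop()
            if q in cluster or q in seen:
                continue
            cluster.add(q)
            if q in core:  # only core points expand the cluster
                frontier.extend(neighbors[q])
        seen |= cluster
        clusters.append(cluster)
    return clusters, set(points) - seen


def merge(a, b, eps):
    """Assumed merge rule: join clusters that come within eps of each other."""
    merged = [set(c) for c in a + b]
    changed = True
    while changed:
        changed = False
        for i, j in combinations(range(len(merged)), 2):
            if any(dist(p, q) <= eps for p in merged[i] for q in merged[j]):
                merged[i] |= merged[j]
                del merged[j]
                changed = True
                break
    return merged


def assemble(peer_clusters, eps):
    """Combine per-peer cluster lists pairwise, one tree level at a time."""
    level = list(peer_clusters)
    while len(level) > 1:
        level = [
            merge(level[i], level[i + 1], eps) if i + 1 < len(level) else level[i]
            for i in range(0, len(level), 2)
        ]
    return level[0] if level else []


if __name__ == "__main__":
    # Hypothetical data: four peers, each holding a few 2-D points.
    peers = [
        [(0, 0), (0, 1), (1, 0)],
        [(1, 1), (2, 1), (1, 2)],
        [(10, 10), (10, 11), (11, 10)],
        [(50, 50)],
    ]
    local = [local_dbscan(p, eps=1.5, min_pts=3) for p in peers]
    print(assemble([clusters for clusters, _ in local], eps=1.5))
```

With four peers the assembly takes two levels of pairwise merges rather than one broadcast to all peers, which mirrors the abstract's stated aim of avoiding message flooding. A practical scheme would presumably exchange compact cluster summaries instead of raw points.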