Adaptive K-Means Clustering to Handle Heterogeneous Data Using Basic Rough Set Theory

B. Tripathy; Adhir Ghosh; G. Panda

Advances in Computer Science and Information Technology. Networks and Communications. Second International Conference, CCSIT 2012, Bangalore, India, January 2-4, 2012. Proceedings, Part I

Research Article

@INPROCEEDINGS{10.1007/978-3-642-27299-8_21,
    author={B. Tripathy and Adhir Ghosh and G. Panda},
    title={Adaptive K-Means Clustering to Handle Heterogeneous Data Using Basic Rough Set Theory},
    proceedings={Advances in Computer Science and Information Technology. Networks and Communications. Second International Conference, CCSIT 2012, Bangalore, India, January 2-4, 2012. Proceedings, Part I},
    proceedings_a={CCSIT PART I},
    year={2012},
    month={11},
    keywords={Classification, Cluster, Crisp boundaries, Heterogeneous data, Uncertainty},
    doi={10.1007/978-3-642-27299-8_21}
}

B. Tripathy
Adhir Ghosh
G. Panda
Year: 2012
Adaptive K-Means Clustering to Handle Heterogeneous Data Using Basic Rough Set Theory
CCSIT PART I
Springer
DOI: 10.1007/978-3-642-27299-8_21

B. Tripathy¹,*, Adhir Ghosh¹,*, G. Panda²,*

1: VIT University
2: MITS

*Contact email: tripathybk@rediffmail.com, adhir39@rediffmail.com, gkpmail@sify.com

Abstract

Several cluster analysis techniques have been developed to date to group objects having similar properties or characteristics. K-means clustering, proposed by MacQueen [12] in 1967, is one of the most popular statistical clustering techniques. However, this algorithm can handle neither categorical data nor uncertainty. Rough set theory, proposed by Pawlak [15], offers an alternative way of representing sets whose exact boundary cannot be described due to incomplete information. Because rough sets have been widely used for knowledge representation, they can also be applied to classification and are very helpful in clustering. In real-life data mining applications, clusters do not have crisp boundaries. Accordingly, Parmar et al. [14] in 2007 and Tripathy et al. [16] in 2009 proposed two rough-set-based algorithms, MMR and MMeR respectively, but both suffer from stability problems due to multiple runs and from higher time complexity. In this paper we propose a new k-means approach using rough sets that can handle heterogeneous data as well as uncertainty.

Keywords: Classification, Cluster, Crisp boundaries, Heterogeneous data, Uncertainty
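
The rough set idea mentioned in the abstract, representing a set whose exact boundary cannot be described, is usually expressed through a lower and an upper approximation built from classes of indiscernible objects. The following is a minimal sketch of that standard construction only, not the algorithm proposed in the paper. The toy objects, attribute names, and function names are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def equivalence_classes(objects, attributes):
    """Group object ids that have identical values on the chosen attributes."""
    classes = defaultdict(set)
    for obj_id, row in objects.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj_id)
    return list(classes.values())


def approximations(objects, attributes, target):
    """Return the (lower, upper) approximations of the target set."""
    lower, upper = set(), set()
    for cls in equivalence_classes(objects, attributes):
        if cls <= target:   # class lies entirely inside the target
            lower |= cls
        if cls & target:    # class overlaps the target
            upper |= cls
    return lower, upper


# Hypothetical categorical data (assumption, not taken from the paper).
objects = {
    "o1": {"color": "red", "size": "small"},
    "o2": {"color": "red", "size": "small"},
    "o3": {"color": "blue", "size": "large"},
    "o4": {"color": "blue", "size": "small"},
}
target = {"o1", "o3"}

lower, upper = approximations(objects, ["color", "size"], target)
boundary = upper - lower  # objects that cannot be classified with certainty
```

Here o1 and o2 are indiscernible on the chosen attributes, so o1 cannot be placed in the target set with certainty and both objects fall into the boundary region. A non-empty boundary is what makes the set "rough" rather than crisp.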

Published: 2012-11-09

http://dx.doi.org/10.1007/978-3-642-27299-8_21
