Research Article
Spam Source Clustering by Constructing Spammer Network with Correlation Measure
@INPROCEEDINGS{10.1007/978-3-642-02466-5_88,
  author={Jeongkyu Shin and Seunghwan Kim},
  title={Spam Source Clustering by Constructing Spammer Network with Correlation Measure},
  proceedings={Complex Sciences. First International Conference, Complex 2009, Shanghai, China, February 23-25, 2009. Revised Papers, Part 1},
  proceedings_a={COMPLEX PART 1},
  year={2012},
  month={5},
  keywords={Electronic spam, complex network, clustering method},
  doi={10.1007/978-3-642-02466-5_88}
}
- Jeongkyu Shin
- Seunghwan Kim
Year: 2012
COMPLEX PART 1
Springer
DOI: 10.1007/978-3-642-02466-5_88
Abstract
Spam filtering is one of the most challenging problems in electronic messaging systems. Because spammers usually falsify their origin, recent studies on identifying the real source of spam have generally relied on content filtering. We propose a method that identifies spam sources through structural analysis of a complex network. We assume that spam sources either share the same victim list or use the same spam-hosting program. We treat the spam source-target relationship as a bipartite network and construct a weighted spam source network by projecting it with a correlation measure. We find that community clustering methods are inappropriate for the spammer network. Instead, we group spammers with gradient-based grouping, which uses the correlations between nodes as gradients between them. We convert them into local minima, which helps to cluster spammers into a few spam source groups. We apply the proposed method to weblog spam data and validate it. The proposed method can also be applied to diverse categorization problems, such as multiple text categorization and network subunit clustering.
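The projection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the correlation measure, so the sketch assumes the Pearson correlation between binary source-target incidence vectors, and the function names `pearson` and `project_sources` and the toy data are hypothetical. The gradient-based grouping step is not sketched, since the abstract does not specify it in enough detail.

```python
from itertools import combinations
from math import sqrt


def pearson(u, v):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (norm_u * norm_v) if norm_u and norm_v else 0.0


def project_sources(edges):
    """Project a bipartite {source: set of targets} mapping onto a weighted
    source-source network. Assumption: the edge weight is the Pearson
    correlation of the two sources' target vectors, and only positively
    correlated pairs are kept."""
    targets = sorted({t for ts in edges.values() for t in ts})
    vectors = {s: [1 if t in ts else 0 for t in targets] for s, ts in edges.items()}
    weights = {}
    for a, b in combinations(sorted(edges), 2):
        w = pearson(vectors[a], vectors[b])
        if w > 0:
            weights[(a, b)] = w
    return weights


# Toy data (hypothetical): four spam sources and the targets each one hit.
edges = {
    "s1": {"a", "b", "c"},
    "s2": {"a", "b", "c"},
    "s3": {"d", "e"},
    "s4": {"d", "e", "f"},
}
weights = project_sources(edges)
for pair, w in sorted(weights.items()):
    print(pair, round(w, 3))
```

In this toy example, sources with identical or heavily overlapping target lists end up strongly linked, while sources with disjoint targets get no edge. A grouping step would then operate on these weights.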