An Ensemble-based Approach to Fast Classification of Multi-label Data Streams

Xiangnan Kong; Philip Yu

7th International Conference on Collaborative Computing: Networking, Applications and Worksharing

Research Article

@INPROCEEDINGS{10.4108/icst.collaboratecom.2011.247086,
    author={Xiangnan Kong and Philip Yu},
    title={An Ensemble-based Approach to Fast Classification of Multi-label Data Streams},
    proceedings={7th International Conference on Collaborative Computing: Networking, Applications and Worksharing},
    publisher={IEEE},
    proceedings_a={COLLABORATECOM},
    year={2012},
    month={4},
    keywords={data stream, data mining, multi-label classification, random tree},
    doi={10.4108/icst.collaboratecom.2011.247086}
}

Xiangnan Kong, Philip Yu. An Ensemble-based Approach to Fast Classification of Multi-label Data Streams. COLLABORATECOM, ICST, 2012. DOI: 10.4108/icst.collaboratecom.2011.247086

Xiangnan Kong¹, Philip Yu¹,*

1: University of Illinois at Chicago

*Contact email: psyu@cs.uic.edu

Abstract

Network operators are continuously confronted with online events, such as online messages and blog updates. Because these events arrive in huge volumes and their topics change quickly, it is critical to manage them promptly and effectively. Much software and many algorithms have been developed to classify such stream data automatically. Conventional approaches focus on single-label scenarios, where each event can be tagged with only one label. However, in many data streams, each event can be tagged with more than one label. Effective stream classification systems should account for the distinctive properties of multi-label stream data, such as large data volumes, label correlations, and concept drift. To address these challenges, we propose an efficient and effective method for multi-label stream classification based on an ensemble of fading random trees. The proposed model can efficiently process high-speed multi-label stream data with concept drift. Empirical studies on real-world tasks demonstrate that our method maintains high accuracy in multi-label stream classification while remaining very efficient.
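
As a rough illustration of the idea named in the abstract, the sketch below shows one way an ensemble of random trees with fading statistics could be organized. It is not the authors' algorithm, whose details are not given on this page. Everything here is an assumption made for illustration: the class names, the fixed-depth trees with random splits, features scaled to [0, 1), integer label indices, the per-update exponential decay, and the 0.5 voting threshold.

```python
import random


class FadingLeaf:
    """Per-label counts that decay exponentially (illustrative assumption)."""

    def __init__(self, num_labels, decay):
        self.decay = decay
        self.weight = 0.0                 # faded number of examples seen
        self.counts = [0.0] * num_labels  # faded number of times each label was seen

    def update(self, labels):
        # Fade old evidence first, then add the new example with weight 1.
        # Simplification: decay is applied per update to this leaf, not per time step.
        self.weight = self.weight * self.decay + 1.0
        self.counts = [c * self.decay for c in self.counts]
        for label in labels:
            self.counts[label] += 1.0


class RandomTree:
    """Complete binary tree whose splits are drawn at random, independent of the data."""

    def __init__(self, num_features, num_labels, depth, decay, rng):
        self.depth = depth
        # Internal nodes in heap order: (feature index, threshold in [0, 1)).
        self.splits = [(rng.randrange(num_features), rng.random())
                       for _ in range(2 ** depth - 1)]
        self.leaves = [FadingLeaf(num_labels, decay) for _ in range(2 ** depth)]

    def leaf(self, x):
        node = 0
        for _ in range(self.depth):
            feature, threshold = self.splits[node]
            node = 2 * node + 1 + int(x[feature] >= threshold)
        return self.leaves[node - (2 ** self.depth - 1)]


class FadingRandomTreeEnsemble:
    """Hypothetical ensemble: averages faded label frequencies over all trees."""

    def __init__(self, num_features, num_labels, num_trees=10, depth=3,
                 decay=0.9, seed=0):
        rng = random.Random(seed)
        self.num_labels = num_labels
        self.trees = [RandomTree(num_features, num_labels, depth, decay, rng)
                      for _ in range(num_trees)]

    def update(self, x, labels):
        # One pass per example: only leaf statistics change, tree structure is fixed.
        for tree in self.trees:
            tree.leaf(x).update(labels)

    def predict(self, x, threshold=0.5):
        scores = [0.0] * self.num_labels
        for tree in self.trees:
            leaf = tree.leaf(x)
            if leaf.weight > 0:
                for i, count in enumerate(leaf.counts):
                    scores[i] += count / leaf.weight
        # A label is predicted when its average faded frequency reaches the threshold.
        return {i for i, s in enumerate(scores) if s / len(self.trees) >= threshold}
```

In this sketch, each example is processed once by calling `update(x, labels)`, and `predict(x)` returns a set of label indices. Because older counts shrink by the `decay` factor on every update, labels that stop appearing in a region of the feature space gradually lose their votes, which is one simple way to react to concept drift.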

Keywords: data stream, data mining, multi-label classification, random tree

Published: 2012-04-06
Publisher: IEEE

DOI: http://dx.doi.org/10.4108/icst.collaboratecom.2011.247086
