Research Article
An Ensemble-based Approach to Fast Classification of Multi-label Data Streams
@INPROCEEDINGS{10.4108/icst.collaboratecom.2011.247086,
  author={Xiangnan Kong and Philip Yu},
  title={An Ensemble-based Approach to Fast Classification of Multi-label Data Streams},
  proceedings={7th International Conference on Collaborative Computing: Networking, Applications and Worksharing},
  publisher={IEEE},
  proceedings_a={COLLABORATECOM},
  year={2012},
  month={4},
  keywords={data stream data mining multi-label classification random tree},
  doi={10.4108/icst.collaboratecom.2011.247086}
}
- Xiangnan Kong
- Philip Yu
Year: 2012
COLLABORATECOM
ICST
DOI: 10.4108/icst.collaboratecom.2011.247086
Abstract
Network operators are continuously confronted with online events, such as online messages and blog updates. Because of the huge volume of these events and the rapid changes in their topics, it is critical to manage them promptly and effectively. Many software systems and algorithms have been developed to classify such stream data automatically. Conventional approaches focus on single-label scenarios, where each event can be tagged with only one label. However, in many data streams, each event can be tagged with more than one label. Effective stream classification systems should account for the distinctive properties of multi-label stream data, such as large data volumes, label correlations, and concept drift. To address these challenges, we propose an efficient and effective method for multi-label stream classification based on an ensemble of fading random trees. The proposed model can efficiently process high-speed multi-label stream data with concept drift. Empirical studies on real-world tasks demonstrate that our method maintains high accuracy in multi-label stream classification while remaining very efficient.