1st International ICST Workshop on Knowledge Discovery and Data Mining

Research Article

Mining High Utility Itemsets in Large High Dimensional Data

  • @INPROCEEDINGS{10.4108/wkdd.2008.2645,
        author={Guangzhu Yu and Keqing Li and Shihuang Shao},
        title={Mining High Utility Itemsets in Large High Dimensional Data},
        proceedings={1st International ICST Workshop on Knowledge Discovery and Data Mining},
        publisher={ACM},
        proceedings_a={WKDD},
        year={2010},
        month={5},
        keywords={},
        doi={10.4108/wkdd.2008.2645}
    }
    
  • Guangzhu Yu
    Keqing Li
    Shihuang Shao
    Year: 2010
    Mining High Utility Itemsets in Large High Dimensional Data
    WKDD
    ACM
    DOI: 10.4108/wkdd.2008.2645
Guangzhu Yu1,*, Keqing Li2,*, Shihuang Shao1,*
  • 1: Information and Technology College, Donghua University, Shanghai, China.
  • 2: Computer Technology College, Yangtze University, Jingzhou, China.
*Contact email: ygz@mail.dhu.edu.cn, likq03@126.com, guang216@126.com

Abstract

Existing algorithms for utility mining are inadequate for datasets with high dimensionality or long patterns. This paper proposes a hybrid method that combines a row enumeration algorithm (Inter-transaction) with a column enumeration algorithm (Two-phase) to discover high utility itemsets from two directions: Two-phase searches for short high utility itemsets from the bottom up, while Inter-transaction searches for long high utility itemsets from the top down. In addition, an optimization technique is adopted to speed up the computation of transaction intersections. Experiments on synthetic data show that the hybrid method achieves high performance on large, high-dimensional datasets.
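
The two basic operations the abstract refers to can be illustrated with a minimal Python sketch. It is not the paper's Two-phase or Inter-transaction implementation, and it does not reproduce the paper's intersection optimization. The toy transactions, the `unit_profit` table, the function names `utility` and `row_intersections`, and the definition of utility as quantity times unit profit are all assumptions made for illustration. The first function evaluates the utility of a candidate itemset, which is what a column enumeration search does for the itemsets it generates. The second intersects pairs of transactions, the elementary step a row enumeration search builds on.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 4},
]
# Hypothetical per-unit profit of each item (assumed utility weights).
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 3}


def utility(itemset, transactions, unit_profit):
    """Sum the utility of `itemset` over every transaction that contains it."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def row_intersections(transactions):
    """Intersect the item sets of every pair of transactions.

    Each non-empty intersection is an itemset shared by at least two rows,
    so long shared patterns surface directly without growing them item by item.
    """
    result = set()
    for t1, t2 in combinations(transactions, 2):
        common = frozenset(t1) & frozenset(t2)
        if common:
            result.add(common)
    return result


if __name__ == "__main__":
    print(utility({"a", "c"}, transactions, unit_profit))
    for itemset in row_intersections(transactions):
        print(sorted(itemset), utility(itemset, transactions, unit_profit))
```

An itemset would be reported as high utility when its utility meets a user-chosen threshold. The threshold is left out here because the abstract does not specify one.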