Signal Processing and Information Technology. First International Joint Conference, SPIT 2011 and IPC 2011, Amsterdam, The Netherlands, December 1-2, 2011, Revised Selected Papers

Research Article

Using Suffix-Tree to Identify Patterns and Cluster Traces from Event Log

  • @INPROCEEDINGS{10.1007/978-3-642-32573-1_20,
        author={Xiaodong Wang and Li Zhang and Hongming Cai},
        title={Using Suffix-Tree to Identify Patterns and Cluster Traces from Event Log},
        proceedings={Signal Processing and Information Technology. First International Joint Conference, SPIT 2011 and IPC 2011, Amsterdam, The Netherlands, December 1-2, 2011, Revised Selected Papers},
        proceedings_a={SPIT \& IPC},
        year={2012},
        month={10},
        keywords={Trace clustering Suffix tree Process mining},
        doi={10.1007/978-3-642-32573-1_20}
    }
    
  • Xiaodong Wang
    Li Zhang
    Hongming Cai
    Year: 2012
    Using Suffix-Tree to Identify Patterns and Cluster Traces from Event Log
    SPIT & IPC
    Springer
    DOI: 10.1007/978-3-642-32573-1_20
Xiaodong Wang1,*, Li Zhang2,*, Hongming Cai3,*
  • 1: University of Mannheim
  • 2: University Karlsruhe
  • 3: Shanghai JiaoTong University
*Contact email: wangxd.sjtu@googlemail.com, Li.Zhang@kit.edu, hmcai@sjtu.edu.cn

Abstract

Process mining refers to the extraction of process models from event logs. Traditional process mining algorithms have problems dealing with event logs produced by unstructured real-life processes, and they generate spaghetti-like, incomprehensible process models. One means of making traces more structured is to extract commonly used process model constructs (common patterns) from the event log and transform the traces based on these constructs. Another way of pre-processing traces is to group the traces in an event log into clusters such that the traces in each cluster can be adequately represented by a process model. However, current approaches to trace clustering have many problems, such as ignoring process context and incurring a huge computational overhead. In this paper, a suffix tree is first used to discover common patterns, and the traces in the event log are transformed using these patterns. Suffix trees are then applied to cluster the transformed traces. The trace clustering algorithm has linear time complexity. The process models mined from the clustered traces show a high degree of fitness and comprehensibility.
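
To make the idea of common patterns concrete, the following is a minimal illustrative sketch and not the authors' algorithm. It assumes each trace is a list of activity names, and it uses a naive suffix trie (quadratic in trace length) in place of a true linear-time suffix tree. The names `build_suffix_trie`, `common_patterns`, `min_support`, and `min_length` are hypothetical and do not come from the paper.

```python
def build_suffix_trie(traces):
    """Insert every suffix of every trace into a trie keyed by activity name.

    Each node records the ids of the traces whose suffixes pass through it.
    This is a naive stand-in for a generalized suffix tree (assumption).
    """
    root = {"children": {}, "traces": set()}
    for trace_id, trace in enumerate(traces):
        for start in range(len(trace)):
            node = root
            for activity in trace[start:]:
                node = node["children"].setdefault(
                    activity, {"children": {}, "traces": set()}
                )
                node["traces"].add(trace_id)
    return root


def common_patterns(traces, min_support=2, min_length=2):
    """Return {pattern: support} for activity sequences shared by enough traces.

    Support is the number of distinct traces containing the sequence.
    """
    root = build_suffix_trie(traces)
    found = {}
    stack = [((), root)]
    while stack:
        prefix, node = stack.pop()
        for activity, child in node["children"].items():
            if len(child["traces"]) < min_support:
                continue  # extending a pattern can only lower its support
            pattern = prefix + (activity,)
            if len(pattern) >= min_length:
                found[pattern] = len(child["traces"])
            stack.append((pattern, child))
    return found


# Hypothetical toy event log: three traces over single-letter activities.
traces = [
    ["a", "b", "c", "d"],
    ["a", "b", "c", "e"],
    ["x", "b", "c", "d"],
]
patterns = common_patterns(traces)
for pattern, support in sorted(patterns.items()):
    print(" ".join(pattern), support)
```

In this toy log the sequence `b c` occurs in all three traces, while `a b c` and `b c d` each occur in two. Such shared sequences are the kind of candidate that could be abstracted into a single step before clustering; how the paper actually selects and applies patterns is not specified in the abstract.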