
Research Article
An Effective Approach for Mining k-item High Utility Itemsets from Incremental Databases
@INPROCEEDINGS{10.1007/978-3-030-93179-7_8,
  author={Nong Thi Hoa and Nguyen Van Tao},
  title={An Effective Approach for Mining k-item High Utility Itemsets from Incremental Databases},
  proceedings={Context-Aware Systems and Applications. 10th EAI International Conference, ICCASA 2021, Virtual Event, October 28--29, 2021, Proceedings},
  proceedings_a={ICCASA},
  year={2022},
  month={1},
  keywords={High utility itemset HUI Mining high utility itemset Mining HUI k-item HUI Data mining},
  doi={10.1007/978-3-030-93179-7_8}
}
Nong Thi Hoa
Nguyen Van Tao
Year: 2022
ICCASA
Springer
DOI: 10.1007/978-3-030-93179-7_8
Abstract
Mining High Utility Itemsets (HUIs) from an incremental database discovers the itemsets that generate high profit in the newest transactions. Mining HUIs from incremental databases is therefore important for business planning. Previous approaches to mining exact HUIs are costly in both time and memory, so fast algorithms for mining compact HUIs have been proposed. However, approaches that mine compact HUIs still take a long time and consume much memory because they consider every itemset that can be formed from the items in a transaction. Moreover, business decision making is more effective when it is based on HUIs that contain several items. In this paper, we propose a novel, effective algorithm for mining k-item HUIs that meets the needs of decision makers and overcomes the limits of mining compact HUIs. We introduce a simple list that stores the k-itemsets encountered while scanning the database; it records the items and the utility of each itemset. Our approach applies two ways of segmenting the database in order to mine all k-itemsets. For each way of segmenting the database, we run the following algorithm, which consists of two main steps: segmenting the current database into sub-partitions and mining k-itemsets from each sub-partition. k-item HUIs are then extracted from the list based on their utility. The proposed algorithm has two advantages: it requires no candidate generation, and it does not need to re-scan the database when the utility threshold changes. Experiments are conducted on dense benchmark databases. The experimental results show that our algorithm outperforms state-of-the-art methods.
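
The idea of keeping one list of k-itemsets with their accumulated utility can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the two segmentation schemes, so each batch of transactions is simply treated here as one partition, and k-itemsets are enumerated directly from each transaction. The function names (`update_utility_list`, `extract_huis`), the transaction format (a dict mapping each item to its utility in that transaction), and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def update_utility_list(utility_list, transactions, k):
    """Add the utility of every k-itemset found in `transactions` to `utility_list`.

    Assumed format: each transaction is a dict mapping an item to the utility
    of that item in the transaction (for example quantity * unit profit).
    """
    for transaction in transactions:
        # Sort the items so the same itemset always maps to the same key.
        for itemset in combinations(sorted(transaction), k):
            utility_list[itemset] += sum(transaction[item] for item in itemset)
    return utility_list


def extract_huis(utility_list, min_utility):
    """Return the k-itemsets whose accumulated utility reaches `min_utility`."""
    return {s: u for s, u in utility_list.items() if u >= min_utility}


# Hypothetical toy data: an initial batch and a later increment.
initial_batch = [
    {"a": 5, "b": 10, "c": 1},
    {"a": 10, "c": 6},
]
new_batch = [
    {"a": 5, "b": 4, "c": 2},
]

utility_list = defaultdict(int)
update_utility_list(utility_list, initial_batch, k=2)
# Incremental update: only the new transactions are scanned.
update_utility_list(utility_list, new_batch, k=2)

# Changing the threshold only re-filters the list; no transaction is re-read.
huis_20 = extract_huis(utility_list, 20)
huis_25 = extract_huis(utility_list, 25)
```

In this sketch, handling new transactions or a different utility threshold never touches earlier transactions again, which mirrors the "no re-scanning" property claimed in the abstract. The memory cost of the list grows with the number of distinct k-itemsets, so it is only practical for small k.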