Research Article
Prefix Tree Based MapReduce Approach for Mining Frequent Subgraphs
@INPROCEEDINGS{10.1007/978-3-030-20615-4_17, author={Supriya Movva and Saketh Prata and Sai Sampath and R. Gayathri}, title={Prefix Tree Based MapReduce Approach for Mining Frequent Subgraphs}, proceedings={Ubiquitous Communications and Network Computing. Second EAI International Conference, Bangalore, India, February 8--10, 2019, Proceedings}, proceedings_a={UBICNET}, year={2019}, month={5}, keywords={Frequent subgraph mining Subgraph Support threshold Hadoop framework Map-reduce}, doi={10.1007/978-3-030-20615-4_17} }
- Supriya Movva
Saketh Prata
Sai Sampath
R. Gayathri
Year: 2019
Prefix Tree Based MapReduce Approach for Mining Frequent Subgraphs
UBICNET
Springer
DOI: 10.1007/978-3-030-20615-4_17
Abstract
The frequent subgraphs are the subgraphs which appear in a number, more than or equal to a user-defined threshold. Many algorithms assume that the apriori based approach yields an efficient result for finding frequent subgraphs, but in our research, we found out that Apriori algorithm lacks scalability with the main memory. Frequent subgraph mining using Apriori algorithm with FS tree uses adjacency list representation. FS tree is a prefix tree data structure. It implements the algorithm in two phases. In the first phase, it uses the Apriori algorithm to find frequent two edge subgraphs. In the second phase, it uses FS-tree algorithm to search all the frequent subgraphs from frequent two edge subgraphs. Scanning the dataset for every candidate is the drawback of the Apriori algorithm, so the Apriori algorithm with FS-tree is used to overcome the multiple scanning. This algorithm is also implemented in an assumption that the data set fits well in memory. In this paper, we propose parallel map-reduce based frequent subgraph mining technique performed in a distributed environment on the Hadoop framework. The experiments validate the efficiency of the algorithm for generating frequent subgraphs in large graph datasets.