Prefix Tree Based MapReduce Approach for Mining Frequent Subgraphs

Supriya Movva; Saketh Prata; Sai Sampath; R. Gayathri

Ubiquitous Communications and Network Computing. Second EAI International Conference, Bangalore, India, February 8–10, 2019, Proceedings

Research Article

Prefix Tree Based MapReduce Approach for Mining Frequent Subgraphs

Download

249 downloads

Cite: BibTeX Plain Text

@INPROCEEDINGS{10.1007/978-3-030-20615-4_17,
    author={Supriya Movva and Saketh Prata and Sai Sampath and R. Gayathri},
    title={Prefix Tree Based MapReduce Approach for Mining Frequent Subgraphs},
    proceedings={Ubiquitous Communications and Network Computing. Second EAI International Conference, Bangalore, India, February 8--10, 2019, Proceedings},
    proceedings_a={UBICNET},
    year={2019},
    month={5},
    keywords={Frequent subgraph mining Subgraph Support threshold Hadoop framework Map-reduce},
    doi={10.1007/978-3-030-20615-4_17}
}

Supriya Movva
Saketh Prata
Sai Sampath
R. Gayathri
Year: 2019
Prefix Tree Based MapReduce Approach for Mining Frequent Subgraphs
UBICNET
Springer
DOI: 10.1007/978-3-030-20615-4_17

Supriya Movva¹^,*, Saketh Prata¹^,*, Sai Sampath¹, R. Gayathri¹

1: Amrita School of Engineering, Amritapuri, Amrita Vishwa Vidyapeetham

*Contact email: supriya.movva.212@gmail.com, pratasaketh@gmail.com

Abstract

The frequent subgraphs are the subgraphs which appear in a number, more than or equal to a user-defined threshold. Many algorithms assume that the apriori based approach yields an efficient result for finding frequent subgraphs, but in our research, we found out that Apriori algorithm lacks scalability with the main memory. Frequent subgraph mining using Apriori algorithm with FS tree uses adjacency list representation. FS tree is a prefix tree data structure. It implements the algorithm in two phases. In the first phase, it uses the Apriori algorithm to find frequent two edge subgraphs. In the second phase, it uses FS-tree algorithm to search all the frequent subgraphs from frequent two edge subgraphs. Scanning the dataset for every candidate is the drawback of the Apriori algorithm, so the Apriori algorithm with FS-tree is used to overcome the multiple scanning. This algorithm is also implemented in an assumption that the data set fits well in memory. In this paper, we propose parallel map-reduce based frequent subgraph mining technique performed in a distributed environment on the Hadoop framework. The experiments validate the efficiency of the algorithm for generating frequent subgraphs in large graph datasets.

Keywords: Frequent subgraph mining, Subgraph, Support threshold, Hadoop framework, Map-reduce

Published: 2019-05-16
Appears in: SpringerLink

: http://dx.doi.org/10.1007/978-3-030-20615-4_17

Prefix Tree Based MapReduce Approach for Mining Frequent Subgraphs

Abstract

About EAI

Community

Publish with EAI