TextRank – Based Keyword Extraction for Constructing a Domain-Specific Dictionary

Sridevi Bonthu; Hema Sankar Sai Ganesh Babu Muddam; Koushik Varma Mudunuri; Abhinav Dayal; V. V. R. Maheswara Rao; Bharat Kumar Bolla

Cognitive Computing and Cyber Physical Systems. 4th EAI International Conference, IC4S 2023, Bhimavaram, Andhra Pradesh, India, August 4-6, 2023, Proceedings, Part I

Research Article


@INPROCEEDINGS{10.1007/978-3-031-48888-7_29,
    author={Sridevi Bonthu and Hema Sankar Sai Ganesh Babu Muddam and Koushik Varma Mudunuri and Abhinav Dayal and V. V. R. Maheswara Rao and Bharat Kumar Bolla},
    title={TextRank -- Based Keyword Extraction for Constructing a Domain-Specific Dictionary},
    booktitle={Cognitive Computing and Cyber Physical Systems. 4th EAI International Conference, IC4S 2023, Bhimavaram, Andhra Pradesh, India, August 4-6, 2023, Proceedings, Part I},
    proceedings_a={IC4S},
    year={2024},
    month={1},
    keywords={Extraction, TextRank, POS tagging, Text mining, domain-specific dictionary, Natural Language Processing},
    doi={10.1007/978-3-031-48888-7_29}
}

Year: 2024
Conference: IC4S
Publisher: Springer
DOI: 10.1007/978-3-031-48888-7_29

Sridevi Bonthu¹,*, Hema Sankar Sai Ganesh Babu Muddam², Koushik Varma Mudunuri¹, Abhinav Dayal¹, V. V. R. Maheswara Rao³, Bharat Kumar Bolla⁴

1: Computer Science and Engineering Department, Vishnu Institute of Technology
2: Tata Consultancy Services
3: Computer Science and Engineering Department, Shri Vishnu Engineering College for Women
4: University of Arizona, Tucson

*Contact email: sridevi.b@vishnu.edu.in

Abstract

Extracting domain-related keywords from text documents is a crucial task in both Information Retrieval and Natural Language Processing (NLP). This paper presents an approach that combines the TextRank algorithm with various NLP techniques to identify domain-specific keywords. The method uses unsupervised graph-based ranking together with the semantic understanding of NLP models to extract key terms that are highly relevant to a specific domain. The work is carried out on an arXiv research abstract dataset. The pipeline preprocesses the input text to capture linguistic features, extracts keywords using TextRank and POS filtering, extracts the definitions, and finally evaluates performance. The extracted keywords are evaluated against manually annotated labels, and the proposed method obtains 83% accuracy. The approach is flexible and adaptable to different domains, as it can be trained on domain-specific data to further improve its performance.
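As a rough illustration of the graph-based ranking step named in the abstract, the following is a minimal TextRank-style sketch in plain Python. It is not the authors' implementation: the function names, the window size, the damping factor, and the iteration count are all assumptions, and a small hypothetical stopword list stands in for the POS filtering the paper describes (a real pipeline would keep, for example, nouns and adjectives using a POS tagger).

```python
import re
from collections import defaultdict

# Assumption: a tiny stopword list used as a stand-in for POS filtering.
STOPWORDS = {"the", "of", "and", "a", "an", "in", "to", "is", "for", "on", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens of two or more characters."""
    return re.findall(r"[a-z][a-z\-]+", text.lower())


def build_graph(tokens, window=2):
    """Link each token to the distinct tokens that follow it within `window` positions."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Run a fixed number of PageRank-style updates on an undirected graph."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        updated = {}
        for node in graph:
            # Each neighbour shares its score equally among its own neighbours.
            rank = sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            updated[node] = (1 - damping) + damping * rank
        scores = updated
    return scores


def extract_keywords(text, top_k=5, window=2):
    """Return the top_k highest-scoring candidate words (ties broken alphabetically)."""
    tokens = [t for t in tokenize(text) if t not in STOPWORDS]
    scores = textrank(build_graph(tokens, window))
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


if __name__ == "__main__":
    sample = "textrank graph textrank keywords textrank ranking textrank domain"
    print(extract_keywords(sample, top_k=3))
```

In this sketch the filter is applied before the co-occurrence graph is built, which is one of several possible simplifications; multi-word keyphrase merging and definition extraction are left out.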

Keywords: Extraction, TextRank, POS tagging, Text mining, domain-specific dictionary, Natural Language Processing

Published: 2024-01-05
Appears in: SpringerLink

http://dx.doi.org/10.1007/978-3-031-48888-7_29
