Proceedings of the 1st International Conference on Islam, Science and Technology, ICONISTECH 2019, 11-12 July 2019, Bandung, Indonesia.

Research Article

File Training Generator For Indonesian Language In Named Entity Recognition Using Anago Library

@INPROCEEDINGS{10.4108/eai.11-7-2019.2297618,
    author={Irfan Fadil and Dwi Yuniarto and Esa Firmansyah and Dody Herdiana and Fidi Supriadi and Ali Rahman},
    title={File Training Generator For Indonesian Language In Named Entity Recognition Using Anago Library},
    proceedings={Proceedings of the 1st International Conference on Islam, Science and Technology, ICONISTECH 2019, 11-12 July 2019, Bandung, Indonesia.},
    publisher={EAI},
    proceedings_a={ICONISTECH},
    year={2021},
    month={1},
    keywords={machine learning named entity recognition anago library data train},
    doi={10.4108/eai.11-7-2019.2297618}
}
    
Irfan Fadil¹,*, Dwi Yuniarto¹, Esa Firmansyah¹, Dody Herdiana¹, Fidi Supriadi¹, Ali Rahman¹
  • ¹ STMIK Sumedang, UIN Sunan Gunung Djati Bandung
*Contact email: fadilirfan@stmik-sumedang.ac.id

Abstract

Named Entity Recognition (NER), also called Named Entity Recognition and Classification (NERC), is one of the main components of an information extraction task; it aims to detect and categorize named entities in a text. NER is generally used to detect the names of people, places, and organizations in a document, but it can also be extended to identify genes, proteins, and other entities as needed. NER is useful in many Natural Language Processing (NLP) applications, such as question answering, summarization, and dialog systems, because it can reduce ambiguity. NER is also related to other information extraction tasks such as relation detection, event detection, and temporal analysis. Building an NER model requires training data. The training data can be taken from various news sources and articles crawled from the internet, which users then annotate with various labels. These sources number in the thousands, yet the training file is currently produced manually, and this manual process sometimes introduces errors when the NER model is built. This research develops an application that assists in generating training files, so that the error rate is reduced or errors are eliminated entirely.
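
As a rough illustration of what such a generator might emit, the sketch below converts token-level annotations into a tab-separated, BIO-tagged file with one token per line and a blank line between sentences. This CoNLL-style layout is an assumption about the training format (the abstract does not specify it), and the function names `to_bio` and `build_training_file`, the labels `PER` and `LOC`, and the sample sentence are hypothetical.

```python
from typing import List, Tuple

# Hypothetical annotation type: (start_token, end_token_exclusive, label).
Span = Tuple[int, int, str]


def to_bio(tokens: List[str], spans: List[Span]) -> List[str]:
    """Turn token-level entity spans into one BIO tag per token."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        # Reject annotations that fall outside the sentence.
        if not (0 <= start < end <= len(tokens)):
            raise ValueError(f"span ({start}, {end}) is out of range")
        # Reject annotations that overlap an earlier one.
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError(f"span ({start}, {end}) overlaps another span")
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def build_training_file(sentences: List[Tuple[List[str], List[Span]]]) -> str:
    """Render annotated sentences as 'token<TAB>tag' lines.

    Sentences are separated by a blank line (assumed CoNLL-style layout).
    """
    blocks = []
    for tokens, spans in sentences:
        tags = to_bio(tokens, spans)
        blocks.append("\n".join(f"{tok}\t{tag}" for tok, tag in zip(tokens, tags)))
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Made-up Indonesian sample sentence with two annotated entities.
    sample = [
        (["Joko", "Widodo", "mengunjungi", "Bandung"],
         [(0, 2, "PER"), (3, 4, "LOC")]),
    ]
    print(build_training_file(sample), end="")
```

Deriving the tags from validated spans, rather than typing them by hand, is one way the manual errors described in the abstract could be caught early: out-of-range or overlapping annotations raise an error instead of silently producing a malformed file.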