Emerging Technologies in Computing. First International Conference, iCETiC 2018, London, UK, August 23–24, 2018, Proceedings

Research Article

Named Entity Recognition System for Sindhi Language

  • @INPROCEEDINGS{10.1007/978-3-319-95450-9_20,
        author={Awais Jumani and Mashooque Memon and Fida Khoso and Anwar Sanjrani and Safeeullah Soomro},
        title={Named Entity Recognition System for Sindhi Language},
        proceedings={Emerging Technologies in Computing. First International Conference, iCETiC 2018, London, UK, August 23--24, 2018, Proceedings},
        proceedings_a={ICETIC},
        year={2018},
        month={7},
        keywords={NER, Sindhi NER, Gazetteer based approach, Rule based model},
        doi={10.1007/978-3-319-95450-9_20}
    }
    
  • Awais Jumani
    Mashooque Memon
    Fida Khoso
    Anwar Sanjrani
    Safeeullah Soomro
    Year: 2018
    Named Entity Recognition System for Sindhi Language
    ICETIC
    Springer
    DOI: 10.1007/978-3-319-95450-9_20
Awais Jumani1,*, Mashooque Memon2,*, Fida Khoso3,*, Anwar Sanjrani4,*, Safeeullah Soomro5,*
  • 1: Shah Abdul Latif University
  • 2: Benazir Bhutto Shaheed University
  • 3: Dawood University of Engineering and Technology
  • 4: University of Baluchistan
  • 5: AMA International University
*Contact email: awaisjumani@yahoo.com, pashamorai786@gmail.com, fidahussain.khoso@duet.edu.pk, anwar.csd@gmail.com, s.soomro@amaiu.edu.pk

Abstract

A Named Entity Recognition (NER) system extracts information from text and assigns it to categories such as person name, organization, location, date and time, term, designation, and short form. NER is now considered an important component of many natural language processing (NLP) tasks, including information retrieval, machine translation, information extraction, and question answering. Even at a surface level, identifying the named entities in a document provides a richer analytical framework and supports cross-referencing. NER has been applied to several Arabic script-based languages, such as Arabic, Persian, and Urdu, but not yet to Sindhi. This paper describes the NER problem in the context of the Sindhi language and proposes a solution. The system is developed to tag ten different named entity types. We used a rule-based approach for the Sindhi NER system. For training and testing, 936 words were used, and the system achieved an accuracy of 98.71%.
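As a rough illustration of what a gazetteer-driven, rule-based tagger can look like, the following Python sketch looks up each token in small name lists and then falls back to a pattern rule. It is not the authors' implementation: the gazetteer entries, the `YEAR_RULE` pattern, the label names, and the functions `tag_tokens` and `accuracy` are all assumptions made for this example, and Latin-script placeholder tokens stand in for Sindhi text.

```python
import re

# Hypothetical gazetteers: illustrative entries only, not the paper's lists.
GAZETTEERS = {
    "PERSON": {"awais", "fida"},
    "LOCATION": {"karachi", "hyderabad"},
    "ORGANIZATION": {"springer"},
}

# Hypothetical pattern rule: a four-digit year (1000-2099) is tagged as DATE.
YEAR_RULE = re.compile(r"^(1[0-9]{3}|20[0-9]{2})$")


def tag_tokens(tokens):
    """Return (token, label) pairs; 'O' marks tokens outside any entity."""
    tagged = []
    for token in tokens:
        key = token.lower()
        label = "O"
        # Step 1: gazetteer lookup, first matching list wins.
        for entity_type, names in GAZETTEERS.items():
            if key in names:
                label = entity_type
                break
        # Step 2: fall back to the pattern rule if no gazetteer matched.
        if label == "O" and YEAR_RULE.match(token):
            label = "DATE"
        tagged.append((token, label))
    return tagged


def accuracy(predicted, gold):
    """Fraction of positions where the predicted label equals the gold label."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)
```

Calling `tag_tokens(["Fida", "visited", "Karachi", "in", "2018"])` labels the first token as a person, the third as a location, and the last as a date, leaving the rest as `O`. A word-level accuracy figure of the kind reported in the abstract would then be the share of tokens whose predicted label matches a hand-annotated one.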