Information and Communication Technology for Development for Africa. First International Conference, ICT4DA 2017, Bahir Dar, Ethiopia, September 25–27, 2017, Proceedings

Research Article

Experimenting Statistical Machine Translation for Ethiopic Semitic Languages: The Case of Amharic-Tigrigna

Cite
  • @INPROCEEDINGS{10.1007/978-3-319-95153-9_13,
        author={Michael Woldeyohannis and Million Meshesha},
        title={Experimenting Statistical Machine Translation for Ethiopic Semitic Languages: The Case of Amharic-Tigrigna},
        proceedings={Information and Communication Technology for Development for Africa. First International Conference, ICT4DA 2017, Bahir Dar, Ethiopia, September 25--27, 2017, Proceedings},
        proceedings_a={ICT4DA},
        year={2018},
        month={7},
        keywords={Under-resourced language, Amharic-Tigrigna, Semitic language, Machine translation},
        doi={10.1007/978-3-319-95153-9_13}
    }
    
  • Michael Woldeyohannis
    Million Meshesha
    Year: 2018
    Experimenting Statistical Machine Translation for Ethiopic Semitic Languages: The Case of Amharic-Tigrigna
    ICT4DA
    Springer
    DOI: 10.1007/978-3-319-95153-9_13
Michael Woldeyohannis¹,*, Million Meshesha¹,*
  • 1: Addis Ababa University
*Contact email: michael.melese@aau.edu.et, million.meshesha@aau.edu.et

Abstract

In this research, an attempt has been made to experiment with Amharic-Tigrigna machine translation to promote information sharing. Since no Amharic-Tigrigna parallel text corpus exists, we prepared one for an Amharic-Tigrigna machine translation system from the religious domain, specifically from the Bible. The data preparation involves sentence alignment, sentence splitting, tokenization, and normalization of the Amharic-Tigrigna parallel corpora, followed by splitting the dataset into training, tuning, and testing data. An Amharic-Tigrigna translation model was then constructed from the training data and further tuned for better translation. Finally, given a target language model, the Amharic-Tigrigna translation system generates target output with reference to the translation model, using the word and the morpheme as translation units. The experimental results are promising for designing an Amharic-Tigrigna machine translation system between resource-deficient languages. We are now working on post-editing to enhance the performance of the bilingual Amharic-Tigrigna translator.
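The data preparation steps named in the abstract (tokenization, then splitting the aligned corpus into training, tuning, and testing data) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `tokenize` and `split_corpus`, the choice to treat the Ethiopic punctuation range U+1361 to U+1368 as separate tokens, the random shuffle, and the split sizes are all assumptions made for illustration. The abstract does not state which tools, normalization rules, or split proportions were used.

```python
import random
import re

# Assumption: Ethiopic punctuation marks (U+1361..U+1368, e.g. the full
# stop and comma) are split off as separate tokens.
ETHIOPIC_PUNCT = re.compile(r"([\u1361-\u1368])")


def tokenize(sentence):
    """Separate Ethiopic punctuation from words, then split on whitespace."""
    return ETHIOPIC_PUNCT.sub(r" \1 ", sentence).split()


def split_corpus(pairs, tune_size, test_size, seed=0):
    """Shuffle aligned (source, target) pairs and split them three ways.

    Pairs are shuffled as units, so every source sentence stays aligned
    with its target sentence in whichever subset it lands.
    """
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    test = pairs[:test_size]
    tune = pairs[test_size:test_size + tune_size]
    train = pairs[test_size + tune_size:]
    return train, tune, test


if __name__ == "__main__":
    # Hypothetical placeholder pairs standing in for aligned verses.
    corpus = [(f"am{i}", f"ti{i}") for i in range(10)]
    train, tune, test = split_corpus(corpus, tune_size=2, test_size=2)
    print(len(train), len(tune), len(test))
    print(tokenize("ሰላም።"))
```

Shuffling whole pairs rather than each side independently is what preserves the sentence alignment that the translation model is trained on.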

Keywords
Under-resourced language, Amharic-Tigrigna, Semitic language, Machine translation
Published
2018-07-10
Appears in
SpringerLink
http://dx.doi.org/10.1007/978-3-319-95153-9_13