
Research Article
Evaluation of Corpora, Resources and Tools for Amharic Information Retrieval
@INPROCEEDINGS{10.1007/978-3-030-80621-7_34,
  author={Tilahun Yeshambel and Josiane Mothe and Yaregal Assabie},
  title={Evaluation of Corpora, Resources and Tools for Amharic Information Retrieval},
  proceedings={Advances of Science and Technology. 8th EAI International Conference, ICAST 2020, Bahir Dar, Ethiopia, October 2-4, 2020, Proceedings, Part I},
  proceedings_a={ICAST},
  year={2021},
  month={7},
  keywords={Amharic language, Amharic NLP tools, Amharic resources, Challenges of Amharic language processing, Morphological complexity},
  doi={10.1007/978-3-030-80621-7_34}
}
- Tilahun Yeshambel
- Josiane Mothe
- Yaregal Assabie
Year: 2021
ICAST
Springer
DOI: 10.1007/978-3-030-80621-7_34
Abstract
Amharic is the working language of Ethiopia. It is the second most widely spoken Semitic language in the world, after Arabic. Amharic is morphologically complex and under-resourced, which poses considerable challenges for natural language processing. Developing fully functional Amharic text processing applications is a non-trivial task for researchers and developers. Despite attempts to develop some applications, the lack of standards in corpus collection and resource development has resulted in interoperability problems. The aim of this paper is to present and evaluate the accessibility of Amharic corpora, resources and tools in order to highlight the status of Amharic language processing applications. We present the available resources and linguistic tools, assess their usability and effectiveness, investigate the implications of the language's morphological complexity, and propose a way forward for the development of Amharic text processing applications.