
Research Article
The Need for a Novel Approach to Design Derivation Lexicon for Semitic Languages
@INPROCEEDINGS{10.1007/978-3-030-93709-6_35,
  author={Enchalew Y. Ayalew and Laure Vieu and Million M. Beyene},
  title={The Need for a Novel Approach to Design Derivation Lexicon for Semitic Languages},
  proceedings={Advances of Science and Technology. 9th EAI International Conference, ICAST 2021, Hybrid Event, Bahir Dar, Ethiopia, August 27--29, 2021, Proceedings, Part I},
  proceedings_a={ICAST},
  year={2022},
  month={1},
  keywords={Semitic computational morphology, Lexicon design, Derivation lexicon},
  doi={10.1007/978-3-030-93709-6_35}
}
- Enchalew Y. Ayalew
- Laure Vieu
- Million M. Beyene
Year: 2022
ICAST
Springer
DOI: 10.1007/978-3-030-93709-6_35
Abstract
Morphological knowledge is relevant to language learning, information retrieval, and natural language processing. Derivation lexicons are comprehensive, organized collections of the morphological variants of a language’s vocabulary. They can be developed either through analysis-based synthesis of large text corpora or through synthesis of surface forms from roots, stems, lemmas, and morphological rules. Much of the research on developing derivation lexicons for Indo-European languages, which are concatenative, focuses on analysis-based synthesis, because these languages have well-developed preprocessing tools and organized text corpora. However, the methods used for these languages are not appropriate for non-concatenative languages such as the Semitic languages. Moreover, most Semitic languages, with the exception of Arabic and Hebrew, lack well-developed text corpora and language processing tools. Hence, a novel approach that caters for the root-pattern structure and rich morphology of these languages is necessary. This paper therefore offers both a comprehensive survey of the literature and an analysis that motivates a morphological synthesis approach, coupled with a novel architecture presented with an illustration. It is part of a larger project aimed at designing an innovative, generic approach to derivation lexicon development for Semitic languages.