
Research Article
Building Word Representations for Wolof Using Neural Networks
@INPROCEEDINGS{10.1007/978-3-030-51051-0_20,
    author={Alla Lo and Cheikh M. Bamba Dione and Elhadji Mamadou Nguer and Sileye O. Ba and Moussa Lo},
    title={Building Word Representations for Wolof Using Neural Networks},
    proceedings={Innovations and Interdisciplinary Solutions for Underserved Areas. 4th EAI International Conference, InterSol 2020, Nairobi, Kenya, March 8-9, 2020, Proceedings},
    proceedings_a={INTERSOL},
    year={2020},
    month={8},
    keywords={Neural network Word embedding Low resource language Wolof},
    doi={10.1007/978-3-030-51051-0_20}
}
Alla Lo
Cheikh M. Bamba Dione
Elhadji Mamadou Nguer
Sileye O. Ba
Moussa Lo
Year: 2020
INTERSOL
Springer
DOI: 10.1007/978-3-030-51051-0_20
Abstract
Because a large portion of the population in rural areas of sub-Saharan Africa understands only local languages, these people cannot access all the content available on the World Wide Web. Most content is available in English, Spanish, French, etc. Content in low-resource languages such as Wolof, which is mostly spoken in Senegal, is scarce. Automatic systems for natural language understanding, such as machine translation systems that can transform information from common languages to low-resource languages, would allow people in rural areas to access relevant scientific or health content.
Nowadays, word representation is the preliminary step of natural language understanding models. This paper presents investigations we conducted to build Wolof word representations using a corpus gathered from the Internet. We applied neural word embedding models to the Wolof language corpus. These models are known to capture semantic and syntactic relations between words in the embedding space. The experiments we conducted suggest that, despite the limited corpus size, our models successfully capture relations between words.
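
To make the idea of relations captured in the embedding space more concrete, here is a minimal sketch of how relatedness between words is commonly measured once embeddings exist: cosine similarity between vectors, followed by a nearest-neighbor lookup. The abstract does not say how the authors evaluated their models, so this is an illustration only. The function names, the placeholder words (`word_a`, `word_b`, `word_c`) and the two-dimensional vectors are hypothetical and are not taken from the paper or its Wolof corpus.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; treat it as unrelated to everything.
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_neighbors(word, embeddings, k=2):
    """Return the k words whose vectors are most similar to `word`."""
    query = embeddings[word]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != word
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy vectors, hand-picked for illustration only.
# Real embeddings would be learned from a corpus and have many more dimensions.
toy_embeddings = {
    "word_a": [1.0, 0.0],
    "word_b": [0.9, 0.1],
    "word_c": [0.0, 1.0],
}

if __name__ == "__main__":
    for neighbor, score in nearest_neighbors("word_a", toy_embeddings):
        print(neighbor, round(score, 3))
```

With learned embeddings, words that occur in similar contexts would be expected to rank near each other in such a lookup, which is one informal way to check whether a model has picked up relations between words.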