Innovations and Interdisciplinary Solutions for Underserved Areas. 7th International Conference, InterSol 2024, Dakar, Senegal, July 3–4, 2024, Proceedings

Research Article

Beqi: Revitalize the Senegalese Wolof Language with a Robust Spelling Corrector

Cite
  • @INPROCEEDINGS{10.1007/978-3-031-86493-3_25,
        author={Derguene Mbaye and Moussa Diallo},
        title={Beqi: Revitalize the Senegalese Wolof Language with a Robust Spelling Corrector},
        booktitle={Innovations and Interdisciplinary Solutions for Underserved Areas. 7th International Conference, InterSol 2024, Dakar, Senegal, July 3--4, 2024, Proceedings},
        proceedings_a={INTERSOL},
        year={2025},
        month={4},
        keywords={Spelling correction, Spell checking, Deep Learning, LSTM, Transformer, Low-resource languages, African languages, Wolof},
        doi={10.1007/978-3-031-86493-3_25}
    }
    
  • Derguene Mbaye
    Moussa Diallo
    Year: 2025
    Beqi: Revitalize the Senegalese Wolof Language with a Robust Spelling Corrector
    INTERSOL
    Springer
    DOI: 10.1007/978-3-031-86493-3_25
Derguene Mbaye*, Moussa Diallo
    *Contact email: derguenembaye@esp.sn

    Abstract

    Natural Language Processing (NLP) has progressed quickly in recent years, but not at the same pace for all languages. African languages in particular still lag behind and lack automatic processing tools. Some of these tools are essential to the development of these languages and also play an important role in many NLP applications. This is particularly the case for automatic spell checkers. Several approaches to this task have been studied, and the one that models spelling correction as translation from misspelled (noisy) text to well-spelled (correct) text shows promising results. However, this approach requires a parallel corpus pairing noisy data with correct data, and Wolof, as a low-resource language, has no such corpus. In this paper, we address the lack of data by generating synthetic data, and we present sequence-to-sequence Deep Learning models for spelling correction in Wolof. We evaluated these models in three scenarios that differ in the subwording method applied to the data and showed that the choice of subwording method had a significant impact on model performance, which opens the way for future research in Wolof spelling correction.
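
The abstract does not say how the synthetic data is generated. As a rough illustration only, one common way to build a noisy-to-correct parallel corpus is to inject random character-level edits into clean sentences. The sketch below assumes that approach; the function names, the `ALPHABET` string, and the `noise_rate` parameter are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical character inventory for illustration only;
# the character set used in the paper is not stated on this page.
ALPHABET = "abcdefghijklmnopqrstuvwxyzéëñŋóà"


def corrupt_word(word, rng):
    """Apply one random character-level edit to a word.

    Words shorter than two characters are returned unchanged. A substitution
    or transposition may occasionally reproduce the original word.
    """
    if len(word) < 2:
        return word
    op = rng.choice(["delete", "insert", "substitute", "transpose"])
    i = rng.randrange(len(word))
    if op == "delete":
        return word[:i] + word[i + 1:]
    if op == "insert":
        return word[:i] + rng.choice(ALPHABET) + word[i:]
    if op == "substitute":
        return word[:i] + rng.choice(ALPHABET) + word[i + 1:]
    # transpose: swap character i with its right neighbour (clamped at the end)
    i = min(i, len(word) - 2)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def make_parallel_pairs(sentences, noise_rate=0.3, seed=0):
    """Return (noisy, correct) sentence pairs for sequence-to-sequence training."""
    rng = random.Random(seed)
    pairs = []
    for sentence in sentences:
        noisy_words = [
            corrupt_word(w, rng) if rng.random() < noise_rate else w
            for w in sentence.split()
        ]
        pairs.append((" ".join(noisy_words), sentence))
    return pairs
```

Calling `make_parallel_pairs(["nanga def", "jërëjëf"])` yields a list of `(noisy, correct)` tuples in which the second element is always the untouched input sentence, which is the shape a translation-style corrector would train on.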

    Keywords
    Spelling correction, Spell checking, Deep Learning, LSTM, Transformer, Low-resource languages, African languages, Wolof
    Published
    2025-04-21
    Appears in
    SpringerLink
    http://dx.doi.org/10.1007/978-3-031-86493-3_25
    Copyright © 2024–2025 ICST