Wireless Mobile Communication and Healthcare. 11th EAI International Conference, MobiHealth 2022, Virtual Event, November 30 – December 2, 2022, Proceedings

Research Article

A Review on Deep Learning-Based Automatic Lipreading

Cite
BibTeX Plain Text
  • @INPROCEEDINGS{10.1007/978-3-031-32029-3_17,
        author={Carlos Santos and Ant\^{o}nio Cunha and Paulo Coelho},
        title={A Review on Deep Learning-Based Automatic Lipreading},
        proceedings={Wireless Mobile Communication and Healthcare. 11th EAI International Conference, MobiHealth 2022, Virtual Event, November 30 -- December 2, 2022, Proceedings},
        proceedings_a={MOBIHEALTH},
        year={2023},
        month={5},
        keywords={Automatic Lip-reading Deep Learning Audio-visual Automatic Speech Recognition},
        doi={10.1007/978-3-031-32029-3_17}
    }
    
  • Carlos Santos
    António Cunha
    Paulo Coelho
    Year: 2023
    A Review on Deep Learning-Based Automatic Lipreading
    MOBIHEALTH
    Springer
    DOI: 10.1007/978-3-031-32029-3_17
Carlos Santos1, António Cunha2, Paulo Coelho1,*
  • 1: School of Technology and Management, Polytechnic of Leiria
  • 2: Escola de Ciências e Tecnologias, University of Trás-os-Montes e Alto Douro, Quinta de Prados
*Contact email: paulo.coelho@ipleiria.pt

Abstract

Automatic Lip-Reading (ALR), also known as Visual Speech Recognition (VSR), is the technological process of extracting and recognizing speech content based solely on visual recognition of the speaker’s lip movements. Besides hearing-impaired people, people with normal hearing also rely on visual cues for word disambiguation whenever they are in a noisy environment. Due to the increasing interest in developing ALR systems, a considerable number of research articles are being published. This article selects, analyses, and summarizes the main papers from 2018 to early 2022, ranging from traditional methods with handcrafted feature extraction algorithms to end-to-end deep learning-based ALR systems, which take full advantage of learning the best features and of the ever-growing publicly available databases. By providing a recent state-of-the-art overview, identifying trends, and presenting a conclusion on what is to be expected in future work, this article offers an efficient way to get up to date on the most relevant ALR techniques.

Keywords
Automatic Lip-reading, Deep Learning, Audio-visual Automatic Speech Recognition
Published
2023-05-14
Appears in
SpringerLink
http://dx.doi.org/10.1007/978-3-031-32029-3_17