LIPCOORDNET: A DUAL-STREAM DEEP LEARNING ARCHITECTURE FOR VISUAL SPEECH RECOGNITION USING FACIAL LANDMARKS

Wissem Karous; Hanen Lajnef; Tehani Dammak

doi:10.22452/

Authors

Wissem Karous, School of Electronics and Telecommunications, University of Sfax, Tunisia
Hanen Lajnef (Corresponding Author), Innov’COM Laboratory, Sup’Com, University of Carthage, Ariana, Tunisia
Tehani Dammak, School of Electronics and Telecommunications, University of Sfax, Tunisia

DOI:

https://doi.org/10.22452/

Keywords:

Lip reading, Deep learning, 3D CNN, LSTM, Facial landmarks, Multi-modal fusion, Visual speech recognition

Abstract

Automated lip reading systems have emerged as critical assistive technologies for hearing-impaired individuals and for communication in noisy environments. This research presents a deep learning framework for sentence-level lip reading that integrates 3D Convolutional Neural Networks (3D CNN) with bidirectional Long Short-Term Memory (Bi-LSTM) networks and uses facial landmark coordinates as supplementary input features. The proposed LipCoordNet architecture achieves a Word Error Rate (WER) of 1.7% and a Character Error Rate (CER) of 0.6% on the GRID corpus benchmark, a significant improvement over existing state-of-the-art methods evaluated on the same dataset. The system combines spatio-temporal visual features with geometric lip movement patterns. It is validated through comprehensive experiments, including statistical significance testing across five independent runs, and is deployed as an interactive demonstration platform.
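
The WER and CER figures reported above are edit-distance metrics. As a minimal sketch, assuming the standard definitions (Levenshtein distance between reference and hypothesis, divided by the reference length in words or characters), they can be computed as shown below. The function names and the GRID-style example sentence are illustrative assumptions, not taken from the authors' evaluation code, and the exact protocol used in the paper (for example, per-sentence versus corpus-level averaging) may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # cost of deleting the first i reference items
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical GRID-style sentence with one misrecognized letter word.
reference = "bin blue at f two now"
hypothesis = "bin blue at s two now"
print(f"WER = {wer(reference, hypothesis):.3f}")
print(f"CER = {cer(reference, hypothesis):.3f}")
```

In this example a single substituted word out of six gives a WER of 1/6, while the same error costs only one character out of 21, which illustrates why CER is typically lower than WER on the same output.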
