EAI Endorsed Transactions on Context-aware Systems and Applications 18(13): e5

Research Article

Exploiting Nonnegative Matrix Factorization with Mixed Group Sparsity Constraint to Separate Speech Signal from Single-channel Mixture with Unknown Ambient Noise

  • @ARTICLE{10.4108/eai.14-3-2018.154342,
        author={Thanh Thi Hien Duong and Phuong Cong Nguyen and Cuong Quoc Nguyen},
        title={Exploiting Nonnegative Matrix Factorization with Mixed Group Sparsity Constraint to Separate Speech Signal from Single-channel Mixture with Unknown Ambient Noise},
        journal={EAI Endorsed Transactions on Context-aware Systems and Applications},
        volume={18},
        number={13},
        publisher={EAI},
        journal_a={CASA},
        year={2018},
        month={3},
        keywords={Speech enhancement, source separation, nonnegative matrix factorization (NMF), sparsity constraint, generic source spectral model},
        doi={10.4108/eai.14-3-2018.154342}
    }
    
Thanh Thi Hien Duong1,*, Phuong Cong Nguyen2, Cuong Quoc Nguyen3
  • 1: International Research Institute MICA, Hanoi University of Science and Technology, Vietnam, Dept. of Information Technology, Hanoi University of Mining and Geology, Vietnam
  • 2: International Research Institute MICA, Hanoi University of Science and Technology, Vietnam, Dept. of Instrumentation and Industrial Informatic, Hanoi University of Science and Technology, Vietnam
  • 3: Dept. of Instrumentation and Industrial Informatic, Hanoi University of Science and Technology, Vietnam
*Contact email: duongthihienthanh@humg.edu.vn

Abstract

This paper addresses a challenging speech enhancement problem: recovering the desired speech from a single-channel audio signal that contains a high level of unspecified noise (for example environmental noise, music, or other sounds). Using a source separation technique, we investigate a solution that combines nonnegative matrix factorization (NMF) with a mixed group sparsity constraint, which allows a generic noise spectral model to guide the separation process. Experiments on a set of benchmark audio signals with different types of real-world noise show that the proposed algorithm yields better quantitative results, in terms of the signal-to-distortion ratio, than previously published algorithms.
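To make the general idea concrete, the following is a minimal sketch of NMF-based separation with fixed spectral dictionaries, not the authors' algorithm. The abstract does not spell out the mixed group sparsity constraint, so this sketch substitutes a plain L1 penalty on the activations and uses the standard multiplicative update for the Kullback-Leibler divergence. The function name `separate`, the parameters `sparsity` and `n_iter`, and the dictionaries `W_speech` and `W_noise` are hypothetical names introduced only for illustration.

```python
import numpy as np

def separate(V, W_speech, W_noise, sparsity=0.1, n_iter=100, eps=1e-12, seed=0):
    """Split a nonnegative spectrogram V (freq x time) into speech and noise.

    Assumptions (not taken from the paper): the dictionaries are given and
    kept fixed, and sparsity is a plain L1 penalty on the activations H
    rather than a mixed group sparsity constraint.
    """
    rng = np.random.default_rng(seed)
    W = np.hstack([W_speech, W_noise])            # joint dictionary
    H = rng.random((W.shape[1], V.shape[1])) + eps  # positive initial activations
    ones = np.ones_like(V)
    for _ in range(n_iter):
        WH = W @ H + eps
        # Multiplicative update for KL divergence with an L1 penalty on H.
        H *= (W.T @ (V / WH)) / (W.T @ ones + sparsity + eps)
    k = W_speech.shape[1]
    speech_model = W_speech @ H[:k]
    noise_model = W_noise @ H[k:]
    total = speech_model + noise_model + eps
    # Wiener-like soft masks distribute the mixture between the two sources.
    return V * speech_model / total, V * noise_model / total

# Usage with random placeholder data (not real audio).
rng = np.random.default_rng(1)
V = rng.random((16, 10))
speech_est, noise_est = separate(V, rng.random((16, 4)), rng.random((16, 3)))
```

Because the two soft masks sum to one, the speech and noise estimates add up to the mixture spectrogram. A group-structured penalty would replace the constant `sparsity` term in the denominator with a term that depends on the current activations.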