Mobile Multimedia Communications. 16th EAI International Conference, MobiMedia 2023, Guilin, China, July 22-24, 2023, Proceedings

Research Article

L-TCN Speech Separation Algorithm for Effectively Acquisition IPD Information Based on Attention in Reverberation Environment

Cite
BibTeX Plain Text
  • @INPROCEEDINGS{10.1007/978-3-031-60347-1_14,
        author={Xiyu Song and Zhengyi An and Shiqi Wang and Fangzhi Yao and Mei Wang},
        title={L-TCN Speech Separation Algorithm for Effectively Acquisition IPD Information Based on Attention in Reverberation Environment},
        proceedings={Mobile Multimedia Communications. 16th EAI International Conference, MobiMedia 2023, Guilin, China, July 22-24, 2023, Proceedings},
        proceedings_a={MOBIMEDIA},
        year={2024},
        month={10},
        keywords={reverberation environment, speech separation, time convolution network},
        doi={10.1007/978-3-031-60347-1_14}
    }
    
  • Xiyu Song
    Zhengyi An
    Shiqi Wang
    Fangzhi Yao
    Mei Wang
    Year: 2024
    L-TCN Speech Separation Algorithm for Effectively Acquisition IPD Information Based on Attention in Reverberation Environment
    MOBIMEDIA
    Springer
    DOI: 10.1007/978-3-031-60347-1_14
Xiyu Song¹, Zhengyi An¹, Shiqi Wang¹, Fangzhi Yao¹, Mei Wang¹,*
  • 1: Ministry of Education Key Laboratory of Cognitive Radio and Information Processing, Guilin University of Electronic Technology
*Contact email: mwang@glut.edu.cn

Abstract

Speech separation aims to extract a target speaker's speech from a mixture. In real environments, however, noise and reverberation make separation difficult. A multi-channel microphone array can be used to capture the spatial information of the target speech, but the number of inter-channel phase differences (IPDs) grows quadratically with the number of microphones, so using all IPDs imposes a heavy load on the system. We therefore use an attention mechanism to acquire IPD information effectively. Moreover, although the time convolution network (TCN) performs very well in speech separation, the large number of parameters in its deep dilated convolutions results in a heavy system burden. We therefore propose a lightweight time convolution network (L-TCN) speech separation method aided by attention-based effective acquisition of IPD information. Compared with the control experiment, the proposed method reduces the number of parameters by 90% and doubles the utilization rate of the IPDs. While reducing the system load, it increases the short-time objective intelligibility (STOI) by 0.19 and the scale-invariant signal-to-distortion ratio (SI-SDR) by 6.33.
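
As a rough illustration of the quantities named in the abstract, the sketch below counts the microphone pairs that yield IPDs, computes one IPD from two complex spectra, and evaluates SI-SDR. It is a minimal sketch under stated assumptions and not the authors' implementation: the function names (mic_pairs, ipd, si_sdr) are hypothetical, and the SI-SDR formula is the commonly used textbook definition with mean removal, which the page itself does not spell out.

```python
import itertools
import numpy as np

def mic_pairs(num_mics):
    """All unordered microphone pairs; an M-mic array has M*(M-1)/2 of them."""
    return list(itertools.combinations(range(num_mics), 2))

def ipd(spec_a, spec_b):
    """Phase difference between two complex spectra, wrapped to (-pi, pi]."""
    return np.angle(spec_a * np.conj(spec_b))

def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB (common definition, assumed here)."""
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to get the scaled target.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    noise = estimate - target
    return 10 * np.log10((np.dot(target, target) + eps) / (np.dot(noise, noise) + eps))

if __name__ == "__main__":
    # Doubling the array size roughly quadruples the number of IPD pairs.
    for m in (2, 4, 8):
        print(m, "mics ->", len(mic_pairs(m)), "IPD pairs")
```

The pair count is what makes using every IPD expensive for larger arrays, which is the motivation the abstract gives for selecting IPD information with attention.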

Keywords
reverberation environment, speech separation, time convolution network
Published
2024-10-25
Appears in
SpringerLink
http://dx.doi.org/10.1007/978-3-031-60347-1_14
Copyright © 2023–2025 ICST