L-TCN Speech Separation Algorithm for Effectively Acquisition IPD Information Based on Attention in Reverberation Environment

Xiyu Song; Zhengyi An; Shiqi Wang; Fangzhi Yao; Mei Wang

Mobile Multimedia Communications. 16th EAI International Conference, MobiMedia 2023, Guilin, China, July 22-24, 2023, Proceedings

Research Article

@INPROCEEDINGS{10.1007/978-3-031-60347-1_14,
    author={Xiyu Song and Zhengyi An and Shiqi Wang and Fangzhi Yao and Mei Wang},
    title={L-TCN Speech Separation Algorithm for Effectively Acquisition IPD Information Based on Attention in Reverberation Environment},
    proceedings={Mobile Multimedia Communications. 16th EAI International Conference, MobiMedia 2023, Guilin, China, July 22-24, 2023, Proceedings},
    proceedings_a={MOBIMEDIA},
    year={2024},
    month={10},
    keywords={reverberation environment speech separation time convolution network},
    doi={10.1007/978-3-031-60347-1_14}
}

Xiyu Song, Zhengyi An, Shiqi Wang, Fangzhi Yao, Mei Wang (2024). L-TCN Speech Separation Algorithm for Effectively Acquisition IPD Information Based on Attention in Reverberation Environment. MOBIMEDIA. Springer. DOI: 10.1007/978-3-031-60347-1_14

Xiyu Song¹, Zhengyi An¹, Shiqi Wang¹, Fangzhi Yao¹, Mei Wang¹,*

1: Ministry of Education Key Laboratory of Cognitive Radio and Information Processing, Guilin University of Electronic Technology

*Contact email: mwang@glut.edu.cn

Abstract

Speech separation aims to extract a target speaker's speech from mixed speech. However, noise and reverberation in real-world environments make separation difficult. To address this, a multi-channel microphone array is introduced to capture the spatial information of the target speech; however, the number of inter-channel phase differences (IPDs) grows quadratically with the number of microphones, so using all IPDs imposes a heavy load on the system. We therefore use an attention mechanism to acquire IPD information effectively. Moreover, although the time convolution network (TCN) performs well in speech separation, the large number of parameters in its deep dilated convolutions also places a heavy burden on the system. In summary, we propose a speech separation method for a lightweight time convolution network (L-TCN), aided by attention-based effective acquisition of IPD information. Compared with the control experiment, the proposed method reduces the number of parameters by 90% and doubles the utilization rate of the IPDs. While reducing the system load, it increases the short-time objective intelligibility (STOI) by 0.19 and the scale-invariant signal-to-distortion ratio (SI-SDR) by 6.33.
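To illustrate why the number of IPDs becomes a burden as the array grows, the following minimal sketch counts the unordered microphone pairs and computes one IPD per pair for a single time-frequency bin. It is not the authors' implementation: the function names `microphone_pairs` and `ipd`, the example bin values, and the sign convention (phase of microphone i minus phase of microphone j) are all assumptions made for illustration.

```python
import cmath
import math
from itertools import combinations


def microphone_pairs(num_mics):
    """Return all unordered microphone pairs (i, j) with i < j (hypothetical helper)."""
    return list(combinations(range(num_mics), 2))


def ipd(bin_a, bin_b):
    """Phase of bin_a minus phase of bin_b, wrapped to [-pi, pi].

    The sign convention is an assumption; the abstract does not specify one.
    """
    return cmath.phase(bin_a * bin_b.conjugate())


# The pair count is M * (M - 1) / 2, so it grows quadratically with M.
for m in (2, 4, 6, 8):
    print(m, "microphones ->", len(microphone_pairs(m)), "IPD pairs")

# One time-frequency bin observed at three hypothetical microphones.
bins = [
    cmath.rect(1.0, 0.0),
    cmath.rect(0.8, math.pi / 4),
    cmath.rect(0.9, math.pi / 2),
]

# One IPD per microphone pair for this bin.
ipds = {(i, j): ipd(bins[i], bins[j]) for i, j in microphone_pairs(len(bins))}
```

With 8 microphones there are already 28 pairs, each contributing a full time-frequency map of phase differences, which is the kind of growth that motivates selecting IPDs rather than using all of them.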

Keywords: reverberation environment, speech separation, time convolution network

Published: 2024-10-25
Appears in: SpringerLink

URL: http://dx.doi.org/10.1007/978-3-031-60347-1_14
