A Complex Neural Network Adaptive Beamforming for Multi-channel Speech Enhancement in Time Domain

Tao Jiang; Hongqing Liu; Yi Zhou; Lu Gan

Communications and Networking. 16th EAI International Conference, ChinaCom 2021, Virtual Event, November 21-22, 2021, Proceedings

Research Article

Cite (BibTeX):

@INPROCEEDINGS{10.1007/978-3-030-99200-2_11,
    author={Tao Jiang and Hongqing Liu and Yi Zhou and Lu Gan},
    title={A Complex Neural Network Adaptive Beamforming for Multi-channel Speech Enhancement in Time Domain},
    proceedings={Communications and Networking. 16th EAI International Conference, ChinaCom 2021, Virtual Event, November 21-22, 2021, Proceedings},
    proceedings_a={CHINACOM},
    year={2022},
    month={4},
    keywords={End-to-end, Multi-channel, Speech enhancement, Complex operations},
    doi={10.1007/978-3-030-99200-2_11}
}

Cite (plain text):

Tao Jiang, Hongqing Liu, Yi Zhou, Lu Gan (2022). A Complex Neural Network Adaptive Beamforming for Multi-channel Speech Enhancement in Time Domain. CHINACOM. Springer. DOI: 10.1007/978-3-030-99200-2_11

Tao Jiang¹,*, Hongqing Liu¹, Yi Zhou¹, Lu Gan²

1: School of Communication and Information Engineering
2: College of Engineering, Design and Physical Science, Brunel University

*Contact email: s190101065@stu.cqupt.edu.cn

Abstract

This paper presents a novel end-to-end multi-channel speech enhancement method that uses complex time-domain operations. The Hilbert transform is applied in the time domain to construct a complex analytic signal, which serves as the training input of the neural network. The proposed system consists of complex neural network adaptive beamforming (CNAB) and a complex fully convolutional network (CFCN), together denoted CNAB-CFCN. The real and imaginary parts (RI) of the clean-speech analytic signal are used as the training targets, and the network weights are updated by computing the scale-invariant signal-to-distortion ratio (SI-SDR) loss between the enhanced RI and the clean RI. This approach is fundamentally different from complex frequency-domain single-channel approaches. Experimental results show that the proposed method yields a significant improvement in end-to-end multi-channel speech enhancement scenarios.

Keywords: End-to-end, Multi-channel, Speech enhancement, Complex operations
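
The two signal-processing ingredients named in the abstract, the Hilbert-transform analytic signal and the SI-SDR loss on real and imaginary parts, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the analytic signal is built with a standard FFT-based Hilbert transform, averaging the SI-SDR of the real and imaginary parts is one possible way to combine them, and the CNAB-CFCN network itself is not shown (a noisy signal stands in for its output).

```python
import numpy as np


def analytic_signal(x):
    """Complex analytic signal of a real signal (FFT-based Hilbert transform, last axis)."""
    x = np.asarray(x, dtype=float)
    n = x.shape[-1]
    spectrum = np.fft.fft(x, axis=-1)
    # One-sided weights: keep DC (and Nyquist when n is even),
    # double the positive frequencies, zero the negative ones.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights, axis=-1)


def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant signal-to-distortion ratio in dB for 1-D real signals."""
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target; the residual counts as distortion.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    projection = scale * target
    residual = estimate - projection
    return 10.0 * np.log10((np.sum(projection ** 2) + eps) / (np.sum(residual ** 2) + eps))


def ri_si_sdr_loss(enhanced, clean):
    """Negative SI-SDR averaged over real and imaginary parts (assumed combination)."""
    return -0.5 * (si_sdr(enhanced.real, clean.real) + si_sdr(enhanced.imag, clean.imag))


# Toy usage: a 440 Hz tone sampled at 16 kHz, with additive noise standing in
# for the enhanced network output (hypothetical data, for illustration only).
t = np.arange(1600) / 16000.0
clean = np.sin(2 * np.pi * 440 * t)
rng = np.random.default_rng(0)
noisy = clean + 0.1 * rng.standard_normal(t.size)

clean_a = analytic_signal(clean)
noisy_a = analytic_signal(noisy)
loss = ri_si_sdr_loss(noisy_a, clean_a)
print(f"RI SI-SDR loss: {loss:.2f} dB")
```

Minimizing this loss corresponds to maximizing SI-SDR, so an estimate closer to the clean analytic signal gives a more negative value. Because SI-SDR rescales the target to best match the estimate, the loss does not change when the enhanced signal is multiplied by a constant gain.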

Published: 2022-04-05
Appears in: SpringerLink

DOI link: http://dx.doi.org/10.1007/978-3-030-99200-2_11
