Quality of Service in Heterogeneous Networks. 6th International ICST Conference on Heterogeneous Networking for Quality, Reliability, Security and Robustness, QShine 2009 and 3rd International Workshop on Advanced Architectures and Algorithms for Internet Delivery and Applications, AAA-IDEA 2009, Las Palmas, Gran Canaria, November 23-25, 2009 Proceedings

Research Article

On Using Digital Speech Processing Techniques for Synchronization in Heterogeneous Teleconferencing

@INPROCEEDINGS{10.1007/978-3-642-10625-5_43,
    author={Hsiao-Pu Lin and Hung-Yun Hsieh},
    title={On Using Digital Speech Processing Techniques for Synchronization in Heterogeneous Teleconferencing},
    proceedings={Quality of Service in Heterogeneous Networks. 6th International ICST Conference on Heterogeneous Networking for Quality, Reliability, Security and Robustness, QShine 2009 and 3rd International Workshop on Advanced Architectures and Algorithms for Internet Delivery and Applications, AAA-IDEA 2009, Las Palmas, Gran Canaria, November 23-25, 2009 Proceedings},
    proceedings_a={QSHINE},
    year={2012},
    month={10},
    keywords={Overlay video conference heterogeneous telephony device dual-mode phone VoIP},
    doi={10.1007/978-3-642-10625-5_43}
}
    
  • Hsiao-Pu Lin
    Hung-Yun Hsieh
    Year: 2012
    On Using Digital Speech Processing Techniques for Synchronization in Heterogeneous Teleconferencing
    QSHINE
    Springer
    DOI: 10.1007/978-3-642-10625-5_43
Hsiao-Pu Lin1, Hung-Yun Hsieh,*
  • 1: Graduate Institute of Communication Engineering
*Contact email: hyhsieh@cc.ee.ntu.edu.tw

Abstract

As the popularity of multi-functional communication devices grows, traditional audio conferencing may now involve heterogeneous teleconferencing devices, including POTS phones, VoIP phones, dual-mode smartphones, and so on. During a multi-party audio conference involving heterogeneous devices, a video conference may be held concurrently among the subset of devices capable of processing video streams, for a better conferencing experience. In such a scenario, the need arises to synchronize circuit-switched audio streams with packet-switched video streams. While the problem of audio-video synchronization has been extensively investigated in related work, existing solutions are limited to synchronization within packet-data networks and hence are not applicable in the target environment. In this work, we consider the problem of supporting such an overlay video conference among dual-mode phones. We first transform the audio-video synchronization problem into the problem of synchronizing circuit-switched and packet-switched audio streams. We then propose an end-to-end solution for audio synchronization that is transparent to the heterogeneous network protocol suites involved. We investigate synchronization algorithms based on digital speech processing (DSP) that use different acoustic features of the speech signal in the waveform, cepstrum, and spectrum domains. We evaluate the effectiveness of the different algorithms under various impairments, including codec distortion, line noise, packet loss, and overlapping utterances. Evaluation results show a promising direction for using DSP-based algorithms to address the synchronization problem across heterogeneous telephony systems.
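
The abstract does not spell out the algorithms themselves, so the following is only a generic illustration of what waveform-domain audio synchronization can look like, not the authors' method. It is a minimal sketch that estimates the offset between two copies of the same signal by maximizing their normalized cross-correlation. The function name `estimate_lag`, the synthetic noise-like signal, the attenuation factor, and the search range are all assumptions made for this example.

```python
import math
import random


def estimate_lag(reference, delayed, max_lag):
    """Return (lag, score): the lag in samples, within [0, max_lag], that
    maximizes the normalized cross-correlation of the two sequences."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Align reference[i] with delayed[i + lag] over their overlap.
        n = min(len(reference), len(delayed) - lag)
        if n <= 0:
            break
        a = reference[:n]
        b = delayed[lag:lag + n]
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        score = dot / norm if norm > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score


# Synthetic stand-in for two audio paths carrying the same utterance:
# the second copy is delayed, attenuated, and corrupted by additive noise.
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(400)]
true_lag = 37  # hypothetical offset in samples
delayed = [0.0] * true_lag + [0.8 * x + rng.gauss(0.0, 0.05) for x in reference]

lag, score = estimate_lag(reference, delayed, max_lag=100)
print("estimated lag (samples):", lag)
```

Real speech is far more self-similar than this noise-like test signal, and impairments such as codec distortion or packet loss can flatten or shift the correlation peak, which is presumably one motivation for also examining cepstrum- and spectrum-domain features.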