Signal Processing and Information Technology. First International Joint Conference, SPIT 2011 and IPC 2011, Amsterdam, The Netherlands, December 1-2, 2011, Revised Selected Papers

Research Article

Multiband Curvelet-Based Technique for Audio Visual Recognition over Internet Protocol

  • @INPROCEEDINGS{10.1007/978-3-642-32573-1_21,
        author={Sue Ch’ng and KahPhooi Seng and Fong Ong and Li-Minn Ang},
        title={Multiband Curvelet-Based Technique for Audio Visual Recognition over Internet Protocol},
        proceedings={Signal Processing and Information Technology. First International Joint Conference, SPIT 2011 and IPC 2011, Amsterdam, The Netherlands, December 1-2, 2011, Revised Selected Papers},
        proceedings_a={SPIT \& IPC},
        year={2012},
        month={10},
        keywords={curvelet transform multiband technique internet protocol windows sockets},
        doi={10.1007/978-3-642-32573-1_21}
    }
    
  • Sue Ch’ng
    KahPhooi Seng
    Fong Ong
    Li-Minn Ang
    Year: 2012
    Multiband Curvelet-Based Technique for Audio Visual Recognition over Internet Protocol
    SPIT & IPC
    Springer
    DOI: 10.1007/978-3-642-32573-1_21
Sue Ch’ng (1,*), KahPhooi Seng (2,*), Fong Ong (1,*), Li-Minn Ang (1,*)
  • 1: University of Nottingham Malaysia Campus
  • 2: Sunway University
*Contact email: keyx9csi@nottingham.edu.my, Jasmine.Seng@nottingham.edu.my, keyx1ofe@nottingham.edu.my, Kenneth.Ang@nottingham.edu.my

Abstract

Transmitting entire video and audio sequences over an internal or external network is inefficient when implementing audio-visual recognition over internet protocol, especially when only selected data from those sequences are actually used in the recognition process. Hence, in this paper, we propose an efficient method of implementing audio-visual recognition over internet protocol in which only the extracted audio-visual features are transmitted. To extract robust features from the video sequence, a multiband curvelet-based technique is employed at the client, whereas a late multi-modal fusion scheme using a radial basis function (RBF) neural network is employed at the server to perform recognition across both modalities. The proposed audio-visual recognition system is applied to several standard audio-visual databases to demonstrate its efficiency.
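
The client/server split described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the wire format (a 4-byte count followed by 32-bit floats), the function names `pack_features`, `recv_features`, and `late_fusion`, and the weighted-sum fusion rule are all assumptions made for illustration. In particular, the weighted sum is a simple stand-in for the RBF neural network fusion the paper uses, and the curvelet feature extraction itself is not shown.

```python
import socket
import struct


def pack_features(features):
    """Serialize a feature vector: 4-byte big-endian count, then float32 values.

    Hypothetical wire format, chosen only for this sketch.
    """
    return struct.pack(f"!I{len(features)}f", len(features), *features)


def recv_exact(sock, n):
    """Read exactly n bytes from the socket, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        buf.extend(chunk)
    return bytes(buf)


def recv_features(sock):
    """Server side: read one length-prefixed feature vector from the socket."""
    (count,) = struct.unpack("!I", recv_exact(sock, 4))
    return list(struct.unpack(f"!{count}f", recv_exact(sock, 4 * count)))


def late_fusion(audio_scores, visual_scores, audio_weight=0.5):
    """Combine per-class scores from each modality and return the best class index.

    Assumption: a weighted sum stands in for the paper's RBF network fusion.
    """
    fused = [
        audio_weight * a + (1.0 - audio_weight) * v
        for a, v in zip(audio_scores, visual_scores)
    ]
    return max(range(len(fused)), key=fused.__getitem__)
```

In this sketch the client would call `sock.sendall(pack_features(vector))` on a connected TCP socket, so only the compact feature vector crosses the network rather than the raw audio and video sequences; the server reads it back with `recv_features` and passes the per-modality classifier scores to `late_fusion`.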