Wireless Internet. 9th International Conference, WICON 2016, Haikou, China, December 19-20, 2016, Proceedings

Research Article

An Optimization of DBN/GPU Speech Recognition on Wireless Network Applications

  • @INPROCEEDINGS{10.1007/978-3-319-72998-5_20,
        author={Weipeng Jing and Tao Jiang and Yaqiu Liu},
        title={An Optimization of DBN/GPU Speech Recognition on Wireless Network Applications},
        proceedings={Wireless Internet. 9th International Conference, WICON 2016, Haikou, China, December 19-20, 2016, Proceedings},
        proceedings_a={WICON},
        year={2018},
        month={1},
        keywords={Wireless networks, Speech recognition, DBN, GPU, Parallel computation, Mobile computing},
        doi={10.1007/978-3-319-72998-5_20}
    }
    
Weipeng Jing1,*, Tao Jiang1,*, Yaqiu Liu1,*
  • 1: Northeast Forestry University
*Contact email: weipeng.jing@outlook.com, taojiang920619@outlook.com, yaqiuLiu@gmail.com

Abstract

With the development of wireless networks and mobile computing, using speech recognition over wireless networks on mobile terminals to process data has become a new trend in mobile computing and has achieved great success. Improving the training speed of speech recognition therefore remains a problem worth studying. Using a GPU to accelerate the training of speech recognition based on the Deep Belief Network (DBN) has achieved great success, but some problems remain. Aiming at the problems that a single GPU cannot store the huge parameters of a DBN at one time and that the GPU's memory model is used unreasonably, we propose a new method in this paper. We divide the weight matrix into blocks, map the connections between visible units and hidden units to threads, and store the weight blocks in the GPU's shared memory, establishing a reasonable memory model. Experimental results show that the optimized GPU implementation achieves 223-fold and 1.5-fold acceleration over a single CPU and a single GPU in Kaldi, respectively, which demonstrates that our method can improve the DBN's training speed in mobile computing without being limited by GPU memory.
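The blocking scheme described in the abstract can be sketched in NumPy. This is a minimal illustration only: the layer sizes, the 48 KB shared-memory figure, and the tile size are assumptions for the example, not values taken from the paper, and the real method runs as a CUDA kernel with one thread per visible-hidden connection rather than as host-side loops.

```python
import numpy as np

# Hypothetical sizes; the DBN layers in the paper would be much larger.
V, H = 1024, 512            # visible and hidden units
SHARED_BYTES = 48 * 1024    # assumed per-block shared memory budget
TILE = int(np.sqrt(SHARED_BYTES / 4))  # square float32 tile that fits the budget

rng = np.random.default_rng(0)
W = rng.standard_normal((V, H)).astype(np.float32)  # full weight matrix
v = rng.standard_normal(V).astype(np.float32)       # one visible-layer state

# Blocked computation of the hidden pre-activations: each (i, j) tile of W is
# staged (in the real kernel, into shared memory) and its partial product
# accumulated, so the full V x H matrix never has to be resident at once.
h = np.zeros(H, dtype=np.float32)
for i in range(0, V, TILE):
    for j in range(0, H, TILE):
        block = W[i:i + TILE, j:j + TILE]      # tile copied to fast memory
        h[j:j + TILE] += v[i:i + TILE] @ block # threads: one per connection

# The blocked result matches the unblocked product up to float32 rounding.
assert np.allclose(h, v @ W, atol=1e-2)
```

Because each tile's contribution is simply accumulated, the blocked pass computes the same pre-activations as a single large matrix product, which is what lets the weight matrix exceed a single GPU's memory budget.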