BER Guaranteed Optimization and Implementation of Parallel Turbo Decoding on GPU

Xiang Chen; Ji Zhu; Ziyu Wen; Yu Wang; Huazhong Yang

8th International Conference on Communications and Networking in China

Research Article



@INPROCEEDINGS{10.1109/ChinaCom.2013.6694588,
    author={Xiang Chen and Ji Zhu and Ziyu Wen and Yu Wang and Huazhong Yang},
    title={BER Guaranteed Optimization and Implementation of Parallel Turbo Decoding on GPU},
    booktitle={8th International Conference on Communications and Networking in China},
    publisher={IEEE},
    proceedings_a={CHINACOM},
    year={2013},
    month={11},
    keywords={software radio, turbo code, GPU},
    doi={10.1109/ChinaCom.2013.6694588}
}

Xiang Chen, Ji Zhu, Ziyu Wen, Yu Wang, Huazhong Yang. BER Guaranteed Optimization and Implementation of Parallel Turbo Decoding on GPU. CHINACOM, IEEE, 2013. DOI: 10.1109/ChinaCom.2013.6694588

Xiang Chen*, Ji Zhu¹, Ziyu Wen², Yu Wang², Huazhong Yang²

1: Xidian University
2: Tsinghua University

*Contact email: chenxiang98@mails.tsinghua.edu.cn

Abstract

In this paper, we present an optimized parallel implementation of a Bit Error Rate (BER) guaranteed turbo decoder on a General-Purpose Graphics Processing Unit (GPGPU). Implementing complex communication signal processing on GPGPUs is a challenging task, since GPGPU parallelism generally requires independent data streams to process. We therefore exploit both the inherent parallelism and the extended sub-frame level parallelism in turbo decoding and map them onto a recent GPU architecture. A guarding mechanism called Previous Iteration Value Initialization with Double Sided Training Window (PIVIDSTW) is used to minimize the BER performance loss caused by sub-frame level parallelism while still maintaining high throughput. In addition, to explore the potential for parallelizing turbo decoding on GPUs, the theoretical occupancy and scalability are analyzed, taking into account the number of sub-frames per frame. Compared with previous work in [5] and [7], we achieve a better trade-off between BER performance and throughput.
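To make the idea of sub-frame level parallelism with a double-sided training window more concrete, the following is a minimal Python sketch of how a frame might be partitioned. It is not taken from the paper: the function name `subframe_windows`, the tuple `SubframeWindow`, and the parameters `frame_len`, `num_subframes`, and `train_len` are all hypothetical, and the sketch only computes index ranges. It assumes that each sub-frame warms up its forward recursion on up to `train_len` trellis steps to its left and its backward recursion on up to `train_len` steps to its right, clipped at the frame edges. How the paper initializes those warm-up recursions from previous-iteration values is not shown here.

```python
from typing import List, NamedTuple


class SubframeWindow(NamedTuple):
    """Index ranges for one sub-frame (all names are illustrative)."""
    start: int            # first trellis step decoded by this sub-frame
    end: int              # one past the last trellis step decoded
    fwd_train_start: int  # first step of the forward warm-up region
    bwd_train_end: int    # one past the last step of the backward warm-up region


def subframe_windows(frame_len: int, num_subframes: int,
                     train_len: int) -> List[SubframeWindow]:
    """Split a frame into equal sub-frames with training windows on both sides.

    Assumption: the frame length is an exact multiple of the number of
    sub-frames, so every sub-frame has the same length.
    """
    if frame_len <= 0 or num_subframes <= 0 or train_len < 0:
        raise ValueError("frame_len and num_subframes must be positive, "
                         "train_len must be non-negative")
    if frame_len % num_subframes != 0:
        raise ValueError("frame_len must be divisible by num_subframes")

    sub_len = frame_len // num_subframes
    windows = []
    for p in range(num_subframes):
        start = p * sub_len
        end = start + sub_len
        windows.append(SubframeWindow(
            start=start,
            end=end,
            # Training regions are clipped at the frame boundaries, where
            # the true initial and final trellis states are already known.
            fwd_train_start=max(0, start - train_len),
            bwd_train_end=min(frame_len, end + train_len),
        ))
    return windows


if __name__ == "__main__":
    # Illustrative values only: a 6144-step frame, 4 sub-frames, 32-step windows.
    for w in subframe_windows(6144, 4, 32):
        print(w)
```

In a GPU mapping along these lines, each sub-frame could be assigned to its own group of threads, so a larger `num_subframes` exposes more parallel work while a larger `train_len` spends more redundant computation to protect BER performance at the sub-frame boundaries.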

Keywords: software radio, turbo code, GPU

Published: 2013-11-14
Publisher: IEEE

DOI: http://dx.doi.org/10.1109/ChinaCom.2013.6694588