A scalable LDPC decoder on GPU

Kiran Kumar Abburi

doi:10.1109/VLSID.2011.44

Published in

2011

Pages: 183 - 188

Abstract

This paper presents a flexible and scalable approach to LDPC decoding on a CUDA-based Graphics Processing Unit (GPU). Layered decoding is a popular method for LDPC decoding and is known for its fast convergence. However, implementing the layered decoding algorithm efficiently on a GPU is challenging because the algorithm offers only a limited amount of data parallelism. To overcome this problem, a kernel execution configuration that decodes multiple codewords simultaneously on the GPU is developed. The paper proposes a compact data packing scheme to reduce the number of global memory accesses and a parity-check matrix representation to reduce constant memory latency. Global memory bandwidth efficiency is improved by coalescing the simultaneous memory accesses of threads in a half-warp into a single memory transaction. Asynchronous data transfers hide host memory latency by overlapping kernel execution with data transfers between the CPU and GPU. The proposed GPU implementation of the LDPC decoder runs two orders of magnitude faster than an LDPC decoder on a CPU and four times faster than the previously reported LDPC decoder on a GPU. It achieves a throughput of 160 Mbps, which is comparable to dedicated hardware solutions. © 2011 IEEE.
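
The abstract does not give the packing format or memory layout, so the following is only a host-side Python sketch of the two ideas under stated assumptions: that soft values (LLRs) are quantized to signed 8 bits and packed four to a 32-bit word, and that codewords are interleaved so the 16 threads of a half-warp, each handling a different codeword, touch consecutive addresses. The names `pack_llrs`, `unpack_llrs` and `interleaved_index` are hypothetical and do not come from the paper.

```python
import struct

# Assumption (not from the paper): four signed 8-bit LLRs per 32-bit word.
LLRS_PER_WORD = 4
# A half-warp is 16 threads; here each thread is assumed to own one codeword.
HALF_WARP = 16


def pack_llrs(llrs):
    """Pack four signed 8-bit LLRs into one unsigned 32-bit word (little-endian)."""
    if len(llrs) != LLRS_PER_WORD:
        raise ValueError("expected exactly four LLRs")
    return struct.unpack("<I", struct.pack("<4b", *llrs))[0]


def unpack_llrs(word):
    """Recover the four signed 8-bit LLRs from a packed 32-bit word."""
    return list(struct.unpack("<4b", struct.pack("<I", word)))


def interleaved_index(position, codeword, num_codewords):
    """Linear index of element `position` of `codeword` in an interleaved buffer.

    Elements at the same position of different codewords sit next to each
    other, so threads that each process one codeword read adjacent addresses.
    """
    return position * num_codewords + codeword


# One packed word replaces four separate one-byte loads.
word = pack_llrs([1, -1, 127, -128])

# Addresses touched by a half-warp when every thread reads position 3
# of its own codeword: a single contiguous run of 16 indices.
addresses = [interleaved_index(3, t, HALF_WARP) for t in range(HALF_WARP)]
```

With this layout, one 32-bit load fetches four soft values, and the half-warp's reads fall in one contiguous range, which is the access pattern that memory coalescing depends on.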

Topics: CUDA pinned memory (58%), Memory bandwidth (56%), Sequential decoding (55%), Low-density parity-check code (55%) and CAS latency (54%)

About the journal

Journal	Proceedings of the IEEE International Conference on VLSI Design
ISSN	1063-9667
Open Access	No

Concepts (25)

Asynchronous data
Check matrixes
Code-words
Data packing
Data parallelism
Dedicated hardware
Efficient implementation
Fast convergence
Graphics processing unit
Layered decoding
LDPC decoder
Memory access
Memory bandwidths
Memory latencies
Orders of magnitude
Scalable approach
Algorithms
Computer graphics equipment
Data transfer
Design
Embedded systems
Flocculation
Hardware
Program processors
Decoding
