A scalable LDPC decoder on GPU
Kiran Kumar Abburi
Published in 2011
Pages: 183-188
Abstract
This paper presents a flexible and scalable approach to LDPC decoding on CUDA-based graphics processing units (GPUs). Layered decoding is a popular LDPC decoding method known for its fast convergence. However, implementing the layered decoding algorithm efficiently on a GPU is challenging because the algorithm offers only a limited amount of data parallelism. To overcome this problem, a kernel execution configuration is developed that decodes multiple codewords simultaneously on the GPU. The paper proposes a compact data packing scheme that reduces the number of global memory accesses, and a parity-check matrix representation that reduces constant memory latency. Global memory bandwidth efficiency is improved by coalescing the simultaneous memory accesses of the threads in a half-warp into a single memory transaction. Asynchronous data transfers hide host memory latency by overlapping kernel execution with data transfers between the CPU and the GPU. The proposed GPU implementation of the LDPC decoder runs two orders of magnitude faster than an LDPC decoder on a CPU and four times faster than the previously reported LDPC decoder on a GPU. It achieves a throughput of 160 Mbps, which is comparable to dedicated hardware solutions. © 2011 IEEE.
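The abstract does not spell out the packing format, so the following is only a minimal sketch of the general idea behind compact data packing, not the paper's actual scheme. It assumes, purely for illustration, that log-likelihood ratios (LLRs) are quantized to signed 8-bit values and that four of them share one 32-bit word, so a single memory access fetches four values. The function names `pack_llrs` and `unpack_llrs` are hypothetical.

```python
# Illustrative sketch only: the 8-bit quantization and the four-per-word
# layout are assumptions, not details taken from the paper.

def pack_llrs(llrs):
    """Pack four signed 8-bit LLRs into one unsigned 32-bit word."""
    if len(llrs) != 4:
        raise ValueError("expected exactly four LLRs")
    word = 0
    for i, value in enumerate(llrs):
        if not -128 <= value <= 127:
            raise ValueError("LLR does not fit in a signed 8-bit value")
        # Keep the two's-complement low byte and place it in byte slot i.
        word |= (value & 0xFF) << (8 * i)
    return word


def unpack_llrs(word):
    """Recover the four signed 8-bit LLRs from a 32-bit word."""
    llrs = []
    for i in range(4):
        byte = (word >> (8 * i)) & 0xFF
        # Convert the unsigned byte back to a signed value.
        llrs.append(byte - 256 if byte >= 128 else byte)
    return llrs
```

Under this assumed layout, one word could hold the LLR of the same bit position from four different codewords, which would fit the multi-codeword decoding described above: one load then serves four codewords instead of one.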
About the journal
Journal: Proceedings of the IEEE International Conference on VLSI Design
ISSN: 1063-9667
Open Access: No
Concepts (25)
  •  ASYNCHRONOUS DATA
  •  CHECK MATRIXES
  •  Code-words
  •  DATA PACKING
  •  DATA PARALLELISM
  •  Dedicated hardware
  •  Efficient implementation
  •  FAST CONVERGENCE
  •  Graphics processing unit
  •  LAYERED DECODING
  •  Ldpc decoder
  •  MEMORY ACCESS
  •  Memory bandwidths
  •  MEMORY LATENCIES
  •  Orders of magnitude
  •  Scalable approach
  •  Algorithms
  •  Computer graphics equipment
  •  Data transfer
  •  Design
  •  Embedded systems
  •  Flocculation
  •  Hardware
  •  Program processors
  •  Decoding