Lossless Parallel Implementation of a Turbo Decoder on GPU
Published in Institute of Electrical and Electronics Engineers Inc.
2019
Pages: 133 - 142
Abstract
Turbo decoders use the recursive BCJR algorithm, which is computationally intensive and hard to parallelise. The branch metric and extrinsic log-likelihood ratio computations are easily parallelisable, but the forward and backward metric computation is not readily parallelisable without compromising bit error rate (BER). This paper proposes a lossless parallelisation technique for Turbo decoders on Graphics Processing Units (GPUs). The recursive forward and backward metric computation is formulated as a prefix (scan) matrix multiplication problem, which is computed on the GPU using a parallel prefix sum computation technique. Overall, this method achieves a throughput of 73 Mbps and latency as low as 61 μs for a 3GPP LTE compliant turbo decoder, without any BER loss. © 2018 IEEE.
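The prefix (scan) formulation mentioned in the abstract can be illustrated with a small sketch. This is not the paper's implementation: it is plain Python rather than GPU code, it assumes max-log-MAP metrics (so the "matrix multiplication" is taken over the max-plus semiring), and all function names are hypothetical. The idea it shows is that the forward recursion alpha_k = alpha_{k-1} ⊗ Gamma_k is a chain of associative matrix products, so every prefix product Gamma_1 ⊗ ... ⊗ Gamma_k can be obtained in about log2(N) scan steps whose inner work is independent per trellis stage.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def maxplus_matmul(a: Matrix, b: Matrix) -> Matrix:
    """Matrix product in the max-plus semiring: (a ⊗ b)[i][j] = max_k a[i][k] + b[k][j]."""
    n = len(a)
    return [[max(a[i][k] + b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def maxplus_vecmat(v: Vector, m: Matrix) -> Vector:
    """Row vector times matrix in the max-plus semiring."""
    n = len(v)
    return [max(v[sp] + m[sp][s] for sp in range(n)) for s in range(n)]


def sequential_alphas(alpha0: Vector, gammas: List[Matrix]) -> List[Vector]:
    """Reference recursion: alpha_k[s] = max over s' of alpha_{k-1}[s'] + gamma_k[s'][s]."""
    alphas, alpha = [], alpha0
    for g in gammas:
        alpha = maxplus_vecmat(alpha, g)
        alphas.append(alpha)
    return alphas


def scan_prefix_products(gammas: List[Matrix]) -> List[Matrix]:
    """Inclusive Hillis-Steele scan: result[k] = gammas[0] ⊗ ... ⊗ gammas[k].

    Each pass of the while loop is one scan step. Within a step, every index k
    only reads the previous step's values, so on a GPU each k could be handled
    by its own thread (assumption: this sketch evaluates them one by one).
    """
    prefix = list(gammas)
    offset = 1
    while offset < len(prefix):
        prefix = [
            prefix[k] if k < offset
            else maxplus_matmul(prefix[k - offset], prefix[k])  # earlier factor on the left
            for k in range(len(prefix))
        ]
        offset *= 2
    return prefix


def alphas_via_scan(alpha0: Vector, gammas: List[Matrix]) -> List[Vector]:
    """Forward metrics from the scanned prefix products (independent per stage)."""
    return [maxplus_vecmat(alpha0, p) for p in scan_prefix_products(gammas)]
```

Because max-plus matrix multiplication is associative, regrouping the products changes only the order of evaluation, not the metrics themselves, which is the sense in which a scan-based formulation can be lossless. With floating-point inputs the two orderings may differ by rounding; exact agreement is guaranteed for integer metrics. The backward metrics would follow the same pattern with the stages traversed in reverse.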
Concepts (11)
  •  Bit error rate
  •  Computer graphics
  •  Decoding
  •  Mobile telecommunication systems
  •  Program processors
  •  BCJR
  •  CUDA
  •  GPGPU
  •  SCAN
  •  TURBO DECODERS
  •  Graphics processing unit