TTLG: An Efficient Tensor Transposition Library for GPUs
Published in Institute of Electrical and Electronics Engineers Inc.
2018
Pages: 578 - 588
Abstract
This paper presents a Tensor Transposition Library for GPUs (TTLG). A distinguishing feature of TTLG is that it includes a performance prediction model, which can be used by higher-level optimizers that rely on tensor transposition. For example, tensor contractions are often implemented with the TTGT (Transpose-Transpose-GEMM-Transpose) approach: the input tensors are transposed to a suitable layout, a high-performance matrix multiplication is applied, and the result is transposed to the desired output layout. TTLG also uses the performance model internally to choose among alternative kernels and slicing/blocking parameters for a given transposition. TTLG is compared with current state-of-the-art alternatives for GPUs. It achieves comparable or better transposition times in the 'repeated-use' scenario and considerably better performance in the 'single-use' scenario. © 2018 IEEE.
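The TTGT approach mentioned in the abstract can be illustrated with a small sketch. This is not TTLG's API and does not run on a GPU: it is plain Python on row-major flat lists, meant only to show the data movement. The function names (`transpose`, `gemm`, `contract_ttgt`) and the example contraction C[c,a,b] = Σ_k A[a,k,b] · B[k,c] are assumptions chosen for illustration.

```python
from itertools import product


def transpose(data, shape, perm):
    """Reorder the axes of a row-major flat list; returns (new_data, new_shape)."""
    new_shape = [shape[p] for p in perm]
    # Row-major strides of the source tensor.
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    out = []
    # Walk the output in row-major order; output axis j is source axis perm[j].
    for idx in product(*(range(n) for n in new_shape)):
        src = sum(idx[j] * strides[perm[j]] for j in range(len(perm)))
        out.append(data[src])
    return out, new_shape


def gemm(a, b, m, k, n):
    """Plain row-major matrix multiply: (m x k) times (k x n) gives (m x n)."""
    c = [0] * (m * n)
    for i in range(m):
        for p in range(k):
            a_ip = a[i * k + p]
            for j in range(n):
                c[i * n + j] += a_ip * b[p * n + j]
    return c


def contract_ttgt(a, a_shape, b, b_shape):
    """Hypothetical example: C[c,a,b] = sum_k A[a,k,b] * B[k,c] via TTGT."""
    na, nk, nb = a_shape
    nk_b, nc = b_shape
    assert nk == nk_b, "contracted extents must match"
    # Transpose: A[a,k,b] -> A'[a,b,k], so the contracted index is innermost.
    a_t, _ = transpose(a, a_shape, (0, 2, 1))
    # B[k,c] is already a (k x c) matrix, so its transpose step is a no-op here.
    # GEMM: (a*b x k) times (k x c) gives T[a,b,c] stored as an (a*b x c) matrix.
    t = gemm(a_t, b, na * nb, nk, nc)
    # Transpose: T[a,b,c] -> C[c,a,b], the requested output layout.
    return transpose(t, [na, nb, nc], (2, 0, 1))


def contract_direct(a, a_shape, b, b_shape):
    """Reference: the same contraction with explicit nested loops."""
    na, nk, nb = a_shape
    _, nc = b_shape
    c = [0] * (nc * na * nb)
    for ci, ai, bi, ki in product(range(nc), range(na), range(nb), range(nk)):
        c[(ci * na + ai) * nb + bi] += a[(ai * nk + ki) * nb + bi] * b[ki * nc + ci]
    return c, [nc, na, nb]
```

In this sketch the two transposes add data movement that the direct loop nest does not need. That cost is why a higher-level optimizer would want a prediction of transposition time before committing to a particular TTGT layout.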
Concepts (11)
  •  Graphics processing unit
  •  Program processors
  •  High performance
  •  Optimizers
  •  Performance matrices
  •  Performance model
  •  Performance prediction models
  •  Single use
  •  State of the art
  •  Tensor contraction
  •  Tensors