TTLG - An Efficient Tensor Transposition Library for GPUs

Venkata Nandivada

doi:10.1109/IPDPS.2018.00067

Published in Institute of Electrical and Electronics Engineers Inc.

2018

Pages: 578 - 588

Abstract

This paper presents a Tensor Transposition Library for GPUs (TTLG). A distinguishing feature of TTLG is that it also includes a performance prediction model, which can be used by higher-level optimizers that use tensor transposition. For example, tensor contractions are often implemented using the TTGT (Transpose-Transpose-GEMM-Transpose) approach: transpose the input tensors to a suitable layout, apply high-performance matrix multiplication, and then transpose the result. The performance model is also used internally by TTLG to choose among alternative kernels and/or slicing/blocking parameters for the transposition. TTLG is compared with current state-of-the-art alternatives for GPUs. Comparable or better transposition times for the 'repeated-use' scenario and considerably better 'single-use' performance are observed. © 2018 IEEE.
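The TTGT approach mentioned in the abstract can be illustrated with a small CPU-only sketch. This is not TTLG code and does not use its API; the function names (`transpose`, `gemm`, `contract_ttgt`, `contract_direct`) and the example contraction C[l,i,j] = sum over k of A[k,i,j] * B[l,k] are assumptions chosen purely for illustration. Tensors are stored as flat row-major Python lists.

```python
from itertools import product


def transpose(data, shape, perm):
    """Permute the axes of a flat row-major tensor.

    Output axis d corresponds to input axis perm[d].
    Returns (flat_output, output_shape).
    """
    rank = len(shape)
    # Row-major strides of the input tensor.
    strides = [1] * rank
    for d in range(rank - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]
    out_shape = [shape[p] for p in perm]
    out = []
    # product() varies the last index fastest, i.e. row-major output order.
    for idx in product(*(range(n) for n in out_shape)):
        offset = sum(idx[d] * strides[perm[d]] for d in range(rank))
        out.append(data[offset])
    return out, out_shape


def gemm(a, b, m, k, n):
    """Plain (m x k) times (k x n) matrix product on flat row-major lists."""
    return [
        sum(a[i * k + p] * b[p * n + j] for p in range(k))
        for i in range(m)
        for j in range(n)
    ]


def contract_ttgt(a, shape_a, b, shape_b):
    """C[l,i,j] = sum_k A[k,i,j] * B[l,k] via Transpose-Transpose-GEMM-Transpose."""
    K, I, J = shape_a
    L, K2 = shape_b
    assert K == K2, "contracted extents must match"
    a_t, _ = transpose(a, [K, I, J], (1, 2, 0))   # A -> (I, J, K), viewed as (I*J) x K
    b_t, _ = transpose(b, [L, K], (1, 0))         # B -> K x L
    c_m = gemm(a_t, b_t, I * J, K, L)             # (I*J) x L, i.e. layout (I, J, L)
    return transpose(c_m, [I, J, L], (2, 0, 1))   # result -> (L, I, J)


def contract_direct(a, shape_a, b, shape_b):
    """Reference: the same contraction with explicit loops, no transposes."""
    K, I, J = shape_a
    L, _ = shape_b
    return [
        sum(a[(k * I + i) * J + j] * b[l * K + k] for k in range(K))
        for l in range(L)
        for i in range(I)
        for j in range(J)
    ]


if __name__ == "__main__":
    A = list(range(2 * 3 * 4))
    B = list(range(5 * 2))
    C, c_shape = contract_ttgt(A, (2, 3, 4), B, (5, 2))
    print(c_shape, C == contract_direct(A, (2, 3, 4), B, (5, 2)))
```

In this sketch, three of the four steps are pure data movement, which suggests why the cost of transposition matters to a higher-level optimizer deciding whether TTGT is worthwhile for a given contraction.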

Topics: Transpose (54%), Tensor (52%), and Matrix multiplication (51%)

About the journal

Journal	Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018
Publisher	Institute of Electrical and Electronics Engineers Inc.
Open Access	No

Authors (1)

Venkata Nandivada
- Department of Computer Science and Engineering

Concepts (11)

Graphics processing unit
Program processors
High performance
Optimizers
Performance matrices
Performance model
Performance prediction models
Single use
State of the art
Tensor contraction
Tensors
