In this paper, we explore the acceleration of DNNs using BCMs on a state-of-the-art GPU. First, we identify the challenges posed by using BCMs.
Topics: Deep Neural Networks · General Matrix Multiplication · Graphics Processing Unit · Convolutional Layers · Block-circulant Matrices · Computational Complexity ...
One attractive approach is to leverage Block Circulant Matrices (BCM), compressing the linear transformation layers, e.g., convolutional and fully-connected ...
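For illustration, the core of the BCM idea can be sketched in a few lines of NumPy: each circulant block is represented by a single defining vector, and the block's matrix-vector product reduces to an FFT, an element-wise product, and an inverse FFT. The names below (block_circulant_matvec, first_cols) are illustrative only and are not taken from the paper.

import numpy as np

def block_circulant_matvec(first_cols, x, block_size):
    # Compute y = W x for a block-circulant W.
    # first_cols has shape (p, q, block_size): the defining (first-column)
    # vector of each of the p*q circulant blocks.  x has length q*block_size.
    p, q, b = first_cols.shape
    assert b == block_size and x.size == q * b
    x_blocks = x.reshape(q, b)
    W_f = np.fft.fft(first_cols, axis=-1)        # (p, q, b)
    x_f = np.fft.fft(x_blocks, axis=-1)          # (q, b)
    # Circulant matvec = circular convolution: multiply in the frequency
    # domain, sum over the column-block index, then transform back.
    y_f = (W_f * x_f[None, :, :]).sum(axis=1)    # (p, b)
    return np.fft.ifft(y_f, axis=-1).real.reshape(p * b)

# Sanity check against an explicitly materialized block-circulant matrix.
p, q, b = 2, 3, 4
rng = np.random.default_rng(0)
c = rng.standard_normal((p, q, b))
x = rng.standard_normal(q * b)
W = np.block([[np.stack([np.roll(c[i, j], s) for s in range(b)], axis=1)
               for j in range(q)] for i in range(p)])
assert np.allclose(W @ x, block_circulant_matvec(c, x, b))

This reduces the per-block cost from O(b^2) to O(b log b) and the storage from b^2 to b values, which is the source of both the compression and the potential acceleration the snippets here refer to.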
Shi Dong, Pu Zhao, Xue Lin, and David Kaeli. "Exploring GPU acceleration of Deep Neural Networks using Block Circulant Matrices." Parallel Computing 100(C), 2021. ISSN 0167-8191.
The fixed-point quantization and the proposed block-circulant matrix-based inference scheme enable the network to achieve as high as 3.5 TOPS computation ...
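The snippet above names fixed-point quantization as one ingredient of that result. As a minimal, hedged sketch (the bit-width, fraction split, and rounding mode here are assumptions, not values taken from that work), quantizing a weight tensor to a signed fixed-point grid can be written as:

import numpy as np

def quantize_fixed_point(w, total_bits=16, frac_bits=12):
    # Map w onto a signed fixed-point grid with `frac_bits` fractional bits,
    # saturating at the representable range.  Dequantize with q / scale.
    # total_bits and frac_bits are illustrative choices, not the paper's.
    scale = 2 ** frac_bits
    qmin = -(2 ** (total_bits - 1))
    qmax = 2 ** (total_bits - 1) - 1
    q = np.clip(np.round(w * scale), qmin, qmax).astype(np.int32)
    return q, scale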
In contrast, CirCNN (b) uses the block-circulant matrix to avoid storage waste and achieve a fine-grained tradeoff of accuracy and compression/acceleration.
To overcome these limitations, this paper proposes CirCNN, a principled approach to represent weights and process neural networks using block-circulant matrices ...
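The "fine-grained tradeoff" mentioned above comes from the block size: a block-circulant weight matrix stores only one length-b vector per b-by-b block, so the block size directly sets the compression factor. A small illustrative helper (not from either paper) makes the arithmetic explicit:

def bcm_param_count(rows, cols, block_size):
    # A (rows x cols) block-circulant matrix with b x b circulant blocks keeps
    # one length-b vector per block: (rows/b) * (cols/b) * b parameters,
    # versus rows * cols for a dense matrix -- a factor-of-b reduction.
    assert rows % block_size == 0 and cols % block_size == 0
    return (rows // block_size) * (cols // block_size) * block_size

# Example: a 4096 x 4096 fully-connected layer with block size 16 stores
# bcm_param_count(4096, 4096, 16) == 1048576 weights instead of 16777216,
# a 16x reduction; growing or shrinking b trades accuracy against compression.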
Exploring GPU acceleration of DNNs using Block Circulant Matrices. DNN training on a GPU can be highly inefficient, even causing a significant slowdown. In ...
Exploring GPU acceleration of Deep Neural Networks using Block Circulant Matrices (Parallel Computing) · DNNMark: configurable benchmark suite of Deep Neural ...