Fast and Low-Power Quantized Fixed Posit High-Accuracy DNN Implementation
Source
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
ISSN
1063-8210
Date Issued
2022-01-01
Author(s)
Abstract
This brief compares quantized floating-point representations in posit and fixed-posit formats for a wide variety of pre-trained deep neural networks (DNNs). We observe that the fixed-posit representation is far more suitable for DNNs, as it yields faster and lower-power computation circuits. We show that accuracy remains within 0.3% and 0.57% of the top-1 accuracy for posit and fixed-posit quantization, respectively. We further show that the posit-based multiplier requires higher power-delay product (PDP) and area, whereas the fixed-posit multiplier reduces PDP and area consumption by 71% and 36%, respectively, compared to (Devnath et al., 2020) for the same bit-width.
Subjects
Convolutional neural net (CNN) | deep neural network (DNN) | fixed-posit representation | posit number system | quantization
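For reference, the sketch below decodes a standard posit value in software, since the posit format underpins the fixed-posit scheme discussed in the abstract. It is a minimal illustration, not code from the brief: the word size n = 8 and exponent size es = 1 are assumed parameters, and the fixed-posit variant (which, as described in the fixed-posit literature, constrains the regime field to a fixed width instead of a variable-length run) is not reproduced here; only the baseline posit decode is shown.

    # Minimal sketch: software decode of a standard n-bit posit
    # (sign | regime | exponent | fraction). Illustrative only; n and es
    # are assumed parameters, not values fixed by the brief.

    def decode_posit(bits: int, n: int = 8, es: int = 1) -> float:
        """Return the real value encoded by an n-bit posit with es exponent bits."""
        mask = (1 << n) - 1
        bits &= mask
        if bits == 0:
            return 0.0
        if bits == 1 << (n - 1):
            return float("nan")          # NaR (not a real)

        sign = -1.0 if (bits >> (n - 1)) & 1 else 1.0
        if sign < 0:
            bits = (-bits) & mask        # decode the two's-complement magnitude

        # Regime: run of identical bits after the sign, terminated by its complement.
        pos = n - 2
        lead = (bits >> pos) & 1
        run = 0
        while pos >= 0 and ((bits >> pos) & 1) == lead:
            run += 1
            pos -= 1
        if pos >= 0:
            pos -= 1                     # skip the terminating bit
        regime = run - 1 if lead else -run

        # Exponent: next es bits, zero-padded if the word runs out.
        exp = 0
        for _ in range(es):
            exp <<= 1
            if pos >= 0:
                exp |= (bits >> pos) & 1
                pos -= 1

        # Fraction: remaining bits with an implicit leading one.
        frac_bits = pos + 1
        frac = bits & ((1 << frac_bits) - 1) if frac_bits else 0
        fraction = 1.0 + (frac / (1 << frac_bits) if frac_bits else 0.0)

        useed = 1 << (1 << es)           # useed = 2^(2^es)
        return sign * (useed ** regime) * (2.0 ** exp) * fraction

    # Example: for (n=8, es=1), 0b01000000 decodes to 1.0 and 0b01100000 to 4.0.
    assert decode_posit(0b01000000) == 1.0
    assert decode_posit(0b01100000) == 4.0

Because the regime run length varies from word to word in a standard posit, the decoder above must scan for the terminating bit before it can locate the exponent and fraction; fixing the regime width removes that variable-length step, which is the hardware simplification the brief's PDP and area savings rest on.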
