Fast and low-power quantized fixed posit high-accuracy DNN implementation

dc.contributor.author Walia, Sumit
dc.contributor.author Tej, Bachu Varun
dc.contributor.author Kabra, Arpita
dc.contributor.author Devnath, Joydeep
dc.contributor.author Mekie, Joycee
dc.coverage.spatial United States of America
dc.date.accessioned 2021-12-24T11:50:55Z
dc.date.available 2021-12-24T11:50:55Z
dc.date.issued 2022-01
dc.identifier.citation Walia, Sumit; Tej, Bachu Varun; Kabra, Arpita; Devnath, Joydeep and Mekie, Joycee, “Fast and low-power quantized fixed posit high-accuracy DNN implementation”, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, DOI: 10.1109/TVLSI.2021.3131609, vol. 30, no. 1, pp. 108-111, Jan. 2022. en_US
dc.identifier.issn 1557-9999
dc.identifier.issn 1063-8210
dc.identifier.uri https://doi.org/10.1109/TVLSI.2021.3131609
dc.identifier.uri https://repository.iitgn.ac.in/handle/123456789/7370
dc.description.abstract This brief compares quantized floating-point representation in posit and fixed-posit formats for a wide variety of pre-trained deep neural networks (DNNs). We observe that fixed-posit representation is far more suitable for DNNs, as it results in a faster, lower-power computation circuit. We show that top-1 accuracy remains within the range of 0.3% and 0.57% for posit and fixed-posit quantization. We further show that the posit-based multiplier requires a higher power-delay product (PDP) and area, whereas fixed-posit reduces PDP and area by 71% and 36%, respectively, compared to (Devnath et al., 2020) for the same bit-width.
dc.description.statementofresponsibility by Sumit Walia, Bachu Varun Tej, Arpita Kabra, Joydeep Devnath and Joycee Mekie
dc.format.extent vol. 30, no. 1, pp. 108-111
dc.language.iso en_US en_US
dc.publisher Institute of Electrical and Electronics Engineers en_US
dc.subject Convolutional neural net (CNN) en_US
dc.subject Deep neural network (DNN) en_US
dc.subject Fixed-posit representation en_US
dc.subject Posit number system en_US
dc.subject Quantization en_US
dc.title Fast and low-power quantized fixed posit high-accuracy DNN implementation en_US
dc.type Article en_US
dc.relation.journal IEEE Transactions on Very Large Scale Integration (VLSI) Systems


