An Automated Approach to Compare Bit Serial and Bit Parallel In-Memory Computing for DNNs
Source
Proceedings IEEE International Symposium on Circuits and Systems
ISSN
0271-4310
Date Issued
2022-01-01
Author(s)
Abstract
This paper presents an exhaustive comparison of two techniques for in-memory computing (IMC) in SRAM: bit-serial arithmetic (BSA) and bit-parallel arithmetic (BPA). We modeled both BSA and BPA and integrated the models with CACTI for the 28 nm CMOS technology node. Both approaches are analyzed across ten sub-array configurations ranging from 128x128 to 2048x2048, using the convolution operation on the ImageNet dataset as the benchmark. The key observation is that BPA begins to yield at least 25% better (lower) energy-delay product (EDP) than BSA at large (2048x2048) sub-array sizes. On average, BSA yields 4 times lower delay, while BPA yields approximately 6 times lower dynamic energy. Hence, the choice of IMC architecture should depend on the application's needs (low energy or high performance). We also simulated 12 different multi-bank IMC arrangements and show that modifying the memory array structure alone can improve the EDP by up to 8 times.
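The energy-delay trade-off in the abstract can be illustrated with a minimal Python sketch. It assumes EDP is the plain product of energy and delay, and it reuses the abstract's average ratios (BPA at roughly one sixth of the dynamic energy and four times the delay of BSA) as illustrative inputs. The function names `edp` and `relative_edp` are hypothetical and do not come from the paper or from CACTI. Averages taken across configurations do not compose exactly, so the result is only a rough consistency check, not a reproduction of the reported 25% figure for 2048x2048 sub-arrays.

```python
def edp(energy: float, delay: float) -> float:
    """Energy-delay product: energy multiplied by delay (lower is better)."""
    return energy * delay


def relative_edp(energy_ratio: float, delay_ratio: float) -> float:
    """EDP of design A relative to design B, given A/B energy and delay ratios."""
    return energy_ratio * delay_ratio


# Illustrative inputs taken from the abstract's average figures (assumption:
# they are applied here as if they held for a single configuration).
BPA_ENERGY_RATIO = 1.0 / 6.0  # BPA dynamic energy relative to BSA
BPA_DELAY_RATIO = 4.0         # BPA delay relative to BSA

bpa_vs_bsa = relative_edp(BPA_ENERGY_RATIO, BPA_DELAY_RATIO)
print(f"BPA EDP relative to BSA: {bpa_vs_bsa:.3f}")
```

A ratio below 1 means BPA has the lower EDP under these assumed inputs; which design wins in practice depends on the sub-array size, as the abstract notes.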
Subjects
bit-parallel computing | bit-serial computing | CACTI | convolution | In-memory computing | memory array
