dc.contributor.author |
Agarwal, Mihir |
|
dc.contributor.author |
Momin, Zaqi |
|
dc.contributor.author |
Prasad, Kailash |
|
dc.contributor.author |
Mekie, Joycee |
|
dc.coverage.spatial |
United Kingdom |
|
dc.date.accessioned |
2025-07-16T10:50:15Z |
|
dc.date.available |
2025-07-16T10:50:15Z |
|
dc.date.issued |
2025-05-25 |
|
dc.identifier.citation |
Agarwal, Mihir; Momin, Zaqi; Prasad, Kailash and Mekie, Joycee, "VeriBench: benchmarking large language models for Verilog code generation and design synthesis", in the IEEE International Symposium on Circuits and Systems (ISCAS 2025), London, UK, May 25-28, 2025. |
|
dc.identifier.uri |
https://doi.org/10.1109/ISCAS56072.2025.11044004 |
|
dc.identifier.uri |
https://repository.iitgn.ac.in/handle/123456789/11648 |
|
dc.description.abstract |
In the rapidly advancing field of hardware design, Electronic Design Automation (EDA) tools can be significantly improved using Machine Learning. This study evaluates the efficacy of various Large Language Models (LLMs) for automating EDA tasks in Verilog design, testbench generation, and Formal Verification (FV) assertion synthesis by comparing 3 closed-source LLMs and 14 open-source LLM variants. In our setup of 33 Verilog designs, ChatGPT-4 generates 22 synthesizable Verilog designs in one shot without feedback, while the Llama 3 (8B) model generates 20. Both models generate all testbenches correctly, 9 of which are given in our setup. For generating Formal Verification properties, ChatGPT-4 generates all properties correctly, whereas Llama 3 synthesizes 7 out of 9 properties correctly. Of the samples synthesized in Vivado, ChatGPT-4 code results in more power-efficient designs than Llama 3 code, whereas in Genus there is no clear winner. These results underscore the efficacy of open-source models, which perform competitively despite having significantly fewer parameters (8 billion) compared to closed-source models such as ChatGPT-4. This study demonstrates the potential of parameter-efficient, open-source models for hardware design and verification tasks. |
|
dc.description.statementofresponsibility |
by Mihir Agarwal, Zaqi Momin, Kailash Prasad and Joycee Mekie |
|
dc.language.iso |
en_US |
|
dc.publisher |
Institute of Electrical and Electronics Engineers (IEEE) |
|
dc.subject |
Large Language Models |
|
dc.subject |
AI-enabled hardware design automation |
|
dc.subject |
Verilog generation |
|
dc.subject |
Testbench |
|
dc.subject |
Formal verification |
|
dc.subject |
Fine-tuning |
|
dc.subject |
Few-shot prompting |
|
dc.title |
VeriBench: benchmarking large language models for Verilog code generation and design synthesis |
|
dc.type |
Conference Paper |
|
dc.relation.journal |
IEEE International Symposium on Circuits and Systems (ISCAS 2025) |
|