VeriBench: benchmarking large language models for Verilog code generation and design synthesis

dc.contributor.author Agarwal, Mihir
dc.contributor.author Momin, Zaqi
dc.contributor.author Prasad, Kailash
dc.contributor.author Mekie, Joycee
dc.coverage.spatial United Kingdom
dc.date.accessioned 2025-07-16T10:50:15Z
dc.date.available 2025-07-16T10:50:15Z
dc.date.issued 2025-05-25
dc.identifier.citation Agarwal, Mihir; Momin, Zaqi; Prasad, Kailash and Mekie, Joycee, "VeriBench: benchmarking large language models for Verilog code generation and design synthesis", in the IEEE International Symposium on Circuits and Systems (ISCAS 2025), London, UK, May 25-28, 2025.
dc.identifier.uri https://doi.org/10.1109/ISCAS56072.2025.11044004
dc.identifier.uri https://repository.iitgn.ac.in/handle/123456789/11648
dc.description.abstract In the rapidly advancing field of hardware design, Electronic Design Automation (EDA) tools can be significantly improved using Machine Learning. This study evaluates the efficacy of various Large Language Models (LLMs) for automating EDA tasks, namely Verilog design, testbench generation, and Formal Verification (FV) assertion synthesis, by comparing 3 closed-source LLMs and 14 open-source LLM variants. In our setup of 33 Verilog designs, ChatGPT-4 generates 22 synthesizable Verilog designs in one shot without feedback, while the Llama 3 (8B) model generates 20. Both models generate all testbenches correctly, 9 of which are given in our setup. For generating Formal Verification properties, ChatGPT-4 generates all properties correctly, whereas Llama 3 synthesizes 7 out of 9 properties correctly. Of the samples synthesized in Vivado, ChatGPT-4's code results in more power-efficient designs than Llama 3's, whereas in Genus there is no clear winner. These results underscore the efficacy of open-source models, which perform competitively despite having significantly fewer parameters (8 billion) than closed-source models such as ChatGPT-4. This study demonstrates the potential of parameter-efficient, open-source models for hardware design and verification tasks.
dc.description.statementofresponsibility by Mihir Agarwal, Zaqi Momin, Kailash Prasad and Joycee Mekie
dc.language.iso en_US
dc.publisher Institute of Electrical and Electronics Engineers (IEEE)
dc.subject Large Language Models
dc.subject AI-enabled hardware design automation
dc.subject Verilog generation
dc.subject Testbench
dc.subject Formal verification
dc.subject Fine-tuning
dc.subject Few-shot prompting
dc.title VeriBench: benchmarking large language models for Verilog code generation and design synthesis
dc.type Conference Paper
dc.relation.journal IEEE International Symposium on Circuits and Systems (ISCAS 2025)

