VeriBench: benchmarking large language models for Verilog code generation and design synthesis

dc.contributor.author	Agarwal, Mihir
dc.contributor.author	Momin, Zaqi
dc.contributor.author	Prasad, Kailash
dc.contributor.author	Mekie, Joycee
dc.coverage.spatial	United Kingdom
dc.date.accessioned	2025-07-16T10:50:15Z
dc.date.available	2025-07-16T10:50:15Z
dc.date.issued	2025-05-25
dc.identifier.citation	Agarwal, Mihir; Momin, Zaqi; Prasad, Kailash and Mekie, Joycee, "VeriBench: benchmarking large language models for Verilog code generation and design synthesis", in the IEEE International Symposium on Circuits and Systems (ISCAS 2025), London, UK, May 25-28, 2025.
dc.identifier.uri	https://doi.org/10.1109/ISCAS56072.2025.11044004
dc.identifier.uri	https://repository.iitgn.ac.in/handle/123456789/11648
dc.description.abstract	In the rapidly advancing field of hardware design, Electronic Design Automation (EDA) tools can be significantly improved using Machine Learning. This study evaluates the efficacy of various Large Language Models (LLMs) for automating three EDA tasks: Verilog design, testbench generation, and Formal Verification (FV) assertion synthesis. It compares 3 closed-source LLMs and 14 open-source LLM variants. In our setup of 33 Verilog designs, ChatGPT-4 generates 22 synthesizable Verilog designs in one shot without feedback, while the Llama 3 (8B) model generates 20. Both models generate all testbenches correctly, 9 of which are given in our setup. For generating Formal Verification properties, ChatGPT-4 generates all properties correctly, whereas Llama 3 synthesizes 7 out of 9 properties correctly. Of the samples synthesized in Vivado, ChatGPT-4 code results in more power-efficient designs than Llama 3 code, whereas in Genus there is no clear winner. These results underscore the efficacy of open-source models, which perform competitively despite having significantly fewer parameters (8 billion) than closed-source models such as ChatGPT-4. This study demonstrates the potential of parameter-efficient, open-source models for hardware design and verification tasks.
dc.description.statementofresponsibility	by Mihir Agarwal, Zaqi Momin, Kailash Prasad and Joycee Mekie
dc.language.iso	en_US
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)
dc.subject	Large Language Models
dc.subject	AI-enabled hardware design automation
dc.subject	Verilog generation
dc.subject	Testbench
dc.subject	Formal verification
dc.subject	Fine-tuning
dc.subject	Few-shot prompting
dc.title	VeriBench: benchmarking large language models for Verilog code generation and design synthesis
dc.type	Conference Paper
dc.relation.journal	IEEE International Symposium on Circuits and Systems (ISCAS 2025)

This item appears in the following Collection(s)

Conference Papers [542]