HDIB1M - Handwritten document image binarization 1 million dataset

dc.contributor.author Sadekar, Kaustubh
dc.contributor.author Singh, Prajwal
dc.contributor.author Raman, Shanmuganathan
dc.date.accessioned 2021-02-05T14:54:03Z
dc.date.available 2021-02-05T14:54:03Z
dc.date.issued 2021-01
dc.identifier.citation Sadekar, Kaustubh; Singh, Prajwal and Raman, Shanmuganathan, "HDIB1M - Handwritten document image binarization 1 million dataset", arXiv, Cornell University Library, DOI: arXiv:2101.11674, Jan. 2021. en_US
dc.identifier.uri http://arxiv.org/abs/2101.11674
dc.identifier.uri https://repository.iitgn.ac.in/handle/123456789/6268
dc.description.abstract Handwritten document image binarization is a challenging task due to the high diversity in the content, page style, and condition of the documents. While traditional thresholding methods fail to generalize to such challenging scenarios, deep learning based methods generalize well but require a large amount of training data. Current datasets for handwritten document image binarization are limited in size and fail to represent several challenging real-world scenarios. To solve this problem, we propose HDIB1M - a handwritten document image binarization dataset of 1M images. We also present a novel method used to generate this dataset. To show the effectiveness of our dataset, we train a deep learning model, UNetED, on it and evaluate its performance on other publicly available datasets. The dataset and the code will be made available to the community.
dc.description.statementofresponsibility by Kaustubh Sadekar, Prajwal Singh and Shanmuganathan Raman
dc.language.iso en_US en_US
dc.publisher Cornell University Library en_US
dc.subject Computer Science en_US
dc.subject Computer Vision en_US
dc.subject Pattern Recognition en_US
dc.title HDIB1M - Handwritten document image binarization 1 million dataset en_US
dc.type Pre-Print en_US
dc.relation.journal arXiv
