Understanding robustness of vision transformers

Please use this identifier to cite or link to this item: http://repository.iiitd.edu.in/xmlui/handle/123456789/1585

Full metadata record

DC Field	Value	Language
dc.contributor.author	Sinha, Arya	-
dc.contributor.author	Subramanyam, A V (Advisor)	-
dc.date.accessioned	2024-05-24T05:19:39Z	-
dc.date.available	2024-05-24T05:19:39Z	-
dc.date.issued	2023-11-29	-
dc.identifier.uri	http://repository.iiitd.edu.in/xmlui/handle/123456789/1585	-
dc.description.abstract	In this study, we examine attention-based networks, in particular recent advances in Vision Transformers (ViTs), which outperform conventional Convolutional Neural Networks (CNNs) in numerous vision tasks. However, because the ViT architecture differs from that of a CNN, its behavior may differ as well. We study the differences in robustness between ViTs and CNNs and the underlying reasons for these differences. To investigate the reliability of ViTs, we analyze their vulnerability to adversarial samples. To enhance the robustness of ViTs, we explore a range of training strategies and modifications to the architecture and the patch embedding mechanism. We also introduce a distinctive adversarial sample generation technique tailored to the ViT architecture.	en_US
dc.language.iso	en_US	en_US
dc.publisher	IIIT-Delhi	en_US
dc.subject	Adversarial Robustness	en_US
dc.subject	Adversarial Training	en_US
dc.subject	Computer Vision	en_US
dc.subject	Vision Transformer	en_US
dc.title	Understanding robustness of vision transformers	en_US
dc.type	Other	en_US
Appears in Collections:	Year-2023

Files in This Item:

File	Description	Size	Format
BTP_Report - Arya Sinha.pdf (Restricted Access)		654.71 kB	Adobe PDF