Safe reinforcement learning

Singh, Divyajeet; Kaul, Sanjit K (Advisor)

Please use this identifier to cite or link to this item: http://repository.iiitd.edu.in/xmlui/handle/123456789/1911

Full metadata record

DC Field	Value	Language
dc.contributor.author	Singh, Divyajeet	-
dc.contributor.author	Kaul, Sanjit K (Advisor)	-
dc.date.accessioned	2026-04-17T10:21:00Z	-
dc.date.available	2026-04-17T10:21:00Z	-
dc.date.issued	2024-11-28	-
dc.identifier.uri	http://repository.iiitd.edu.in/xmlui/handle/123456789/1911	-
dc.description.abstract	Reinforcement Learning (RL) has gained significant traction as a powerful paradigm for solving sequential decision-making problems, with notable progress in autonomous vehicles [11, 17], algorithmic trading in finance [5], and game-playing engines [13, 14]. However, training and deploying RL-based agents in real-world scenarios often requires addressing safety constraints, because violating these constraints can lead to catastrophic consequences. For example, while learning to hover a helicopter over a specific area, choosing a series of ‘bad’ actions may cause it to crash. Safe Reinforcement Learning (Safe RL) is a promising approach that aims to produce agents that operate within predefined safety bounds while optimizing performance. This report describes in detail different formulations of the Safe RL problem, methods for quantifying safety in reinforcement learning, and tractable solutions that perform optimally while adhering to their predefined safety constraints. We also explore in depth strategies for understanding and formalizing methods for proving performance bounds on RL algorithms. With the findings of this project, we aim to advance the theoretical and practical understanding of Safe RL, paving the way for its adoption in high-stakes domains that require robust decision-making under constraints, such as safe autonomous driving.	en_US
dc.language.iso	en_US	en_US
dc.publisher	IIIT-Delhi	en_US
dc.subject	Reinforcement Learning	en_US
dc.subject	Markov Decision Processes	en_US
dc.subject	Safety Constraints	en_US
dc.title	Safe reinforcement learning	en_US
dc.type	Other	en_US
Appears in Collections:	Year-2024

Files in This Item:

File	Description	Size	Format
2021529_DivyajeetSingh_BTP-Report - Divyajeet Singh.pdf (Restricted Access)		387.8 kB	Adobe PDF
