Abstract:
Resource allocation is a problem that requires complex decision making. We address this problem in the healthcare sector using Reinforcement Learning (RL). We propose an RL pipeline that starts with a sequential-decision deep RL model and combines it with a Contextual Bandit approach. The RL models suggest actions and rewards, whereas the Contextual Bandit allows the context to change dynamically, so that allocation can also take place in real-world scenarios where the environment is not static. We also use randomised mathematical optimisation models as a baseline against which to compare the results obtained by the RL models. Our goal is to make RL models plug-and-play through a platform, so that users can apply them to their own resource allocation problems without prior knowledge of Reinforcement Learning.