Abstract:
Recent work by Tramer et al. [6] demonstrated that models deployed behind prediction APIs are vulnerable to theft through repeated querying. Since then, numerous studies have underscored the growing significance of model extraction attacks as a serious threat to intellectual property, and the research community has developed increasingly efficient algorithms for extracting deployed models without authorization. In response, researchers and practitioners have devised both proactive and reactive defense strategies to mitigate these vulnerabilities. Given the escalating risks posed by model extraction attacks, it is important to evaluate how effective these countermeasures are in practice. This project develops a toolbox that allows a model owner to assess the security of a deployed model: we provide tools for performing model-stealing attacks against a trained model and for generating a comprehensive report on the model's susceptibility to various attacks.