Abstract:
There are various domains in which machine learning has been successfully applied. With
the availability of high computing resources, tasks such as image classification, object detection,
and others are readily tackled by learning algorithms. However, the field of deep learning, and
neural networks in particular, is vulnerable to adversarial samples. These adversarial attacks
can severely degrade the performance of a deep learning system. Adversarial samples
are usually very similar to the original samples and thus indistinguishable to humans; however, they are able to fool deep learning systems into producing incorrect classifications.
In this report, we comprehensively summarize recent progress in the field of adversarial
machine learning. Specifically, we give an overview of current state-of-the-art attack
methods while also reviewing various defense mechanisms. We also conduct our own
experiments showing that state-of-the-art models such as ResNet are prone to adversarial
attacks.
Further, we propose an end-to-end model that mitigates adversarial
attacks on architectures designed for object classification tasks. We call this model
the "Mitigator", since it helps an already trained classifier mitigate the attack. The
"Mitigator" has the following properties:
• The Mitigator should be invariant to the complexity of the attack
• It should be robust to any attack that changes the pixel values of an image
• It can be attached in front of any classifier that takes an image as input
We further experiment with combining the above approach with an ensemble technique. This leads
us to train many more models (nine in total). All nine models are used in succession,
and a voting mechanism determines the final classification. Testing is done against two popular
attacks: DeepFool and Carlini & Wagner L2. In the end, we showcase the improvement in results and
discuss the pros and cons of this approach.