Abstract:
Deep learning systems require large amounts of labelled training data. However, large
amounts of labelled data are often unavailable, because labelling each sample correctly
requires considerable human effort. In many domains, such as medical imaging, only a small
labelled dataset is available alongside a large pool of unlabelled samples. In this research,
we implement an active learning algorithm that can help improve the performance of deep
learning models by making use of the large amount of available unlabelled data. We propose a
novel active learning algorithm (Triplet AL) which uses a triplet network to select samples
from the unlabelled set for training the classification model. Past active learning methods
rely on the classification model's final prediction scores as a measure of confidence for an
unlabelled sample.
We propose a more reliable confidence measure, called Top-Two-Margin, which is given by the
triplet network. We evaluated the proposed algorithm on the STL-10 and CIFAR-10 datasets. To
test its architectural independence, we repeated the evaluation using different model
architectures for the classification model. We compared the results obtained using the
proposed method with those of past active learning methods. The proposed algorithm
outperforms the other active learning approaches included in our comparison.
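
To make the idea of a margin-based confidence measure concrete, the following is a minimal sketch of one plausible reading, not the definition used in this work. It assumes that the triplet network maps each sample to an embedding vector, that each class is summarised by a centroid of labelled embeddings, and that the margin is the gap between the two smallest sample-to-centroid distances. The function names `top_two_margin` and `select_least_confident`, the use of centroids, and the Euclidean distance are all hypothetical choices made for illustration.

```python
import math


def top_two_margin(embedding, centroids):
    """Gap between the two smallest distances from an embedding to class centroids.

    Assumption: a small gap means the sample lies almost equally close to two
    classes, so the confidence in its nearest class is low.
    """
    dists = sorted(math.dist(embedding, c) for c in centroids)
    return dists[1] - dists[0]


def select_least_confident(embeddings, centroids, k):
    """Return the indices of the k unlabelled samples with the smallest margin."""
    margins = [top_two_margin(e, centroids) for e in embeddings]
    order = sorted(range(len(embeddings)), key=lambda i: margins[i])
    return order[:k]


# Toy 2-D example with two hypothetical class centroids.
centroids = [(0.0, 0.0), (10.0, 0.0)]
unlabelled = [(1.0, 0.0), (5.0, 0.0), (9.0, 0.0)]
picked = select_least_confident(unlabelled, centroids, k=1)
```

In this toy example the middle point is equidistant from both centroids, so its margin is zero and it is the one picked. Under the stated assumptions, such a sample would be a natural candidate to send for labelling.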