Abstract:
Deep Neural Networks (DNNs) have achieved remarkable success across various machine learning and computer vision tasks, especially when abundant training samples are available. In Convolutional Neural Network (CNN) research, it has been established that a model’s generalization capability improves with the combination of complex architectures, strong regularization, domain-specific loss functions, and extensive databases. However, training DNNs with limited data remains a significant challenge that calls for attention from the research community, because many applications lack the volume of data needed to train models effectively. Data constraints in this context arise from factors such as 1) a scarcity of domain experts, 2) long-tail distributions in large datasets, 3) insufficient domain-specific data, and 4) the challenge of mimicking human cognition and learning. These issues are commonly encountered while designing deep models, underscoring the importance of addressing Data Constrained Learning (DCL).
This thesis investigates deep learning strategies explicitly tailored for DCL scenarios. The objective is to ensure that training a large number of parameters does not impair the model’s ability to learn meaningful patterns, since this would raise the risk of overfitting and lead to suboptimal generalization performance. To address the DCL challenge, our first contribution is SSF-CNN, which introduces a novel strength parameter in deep learning and concentrates on learning both the "structure" and "strength" of filters. The filter structure is initialized using a dictionary-based filter learning algorithm, while the strength is learned under data-constrained settings. The architecture is adaptable: it delivers robust performance and consistently attains high accuracy even with small databases.
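To make the structure/strength split concrete, the following is a minimal sketch and not the thesis implementation. It assumes one scalar strength per filter, which the abstract does not specify, and it uses random values as a stand-in for dictionary-learned filters. The names `apply_strength` and `trainable_counts` are hypothetical.

```python
import numpy as np

def apply_strength(filters, strength):
    """Scale each frozen filter by its scalar strength.

    filters:  (num_filters, channels, height, width), the fixed "structure"
    strength: (num_filters,), the only trainable values in this sketch
              (one scalar per filter is an assumption, not stated in the abstract)
    """
    return filters * strength[:, None, None, None]

def trainable_counts(filters):
    """Return (count if every filter weight is trained, count if only strengths are)."""
    return filters.size, filters.shape[0]

rng = np.random.default_rng(0)
# Stand-in for dictionary-learned filters; random values are for illustration only.
filters = rng.standard_normal((64, 3, 3, 3))
strength = np.ones(64)  # a strength of 1 leaves the structure unchanged

scaled = apply_strength(filters, strength)
full, strength_only = trainable_counts(filters)
print(full, strength_only)  # 1728 64
```

Under these assumptions, a layer of 64 filters of size 3x3x3 has 1728 weights but only 64 strengths to learn, which illustrates how learning strength alone can shrink the number of trained parameters.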
We validate the effectiveness of our algorithm on databases such as MNIST, CIFAR10, and NORB, with varying training sample sizes. The results indicate that SSF-CNN substantially reduces the number of parameters that must be trained while maintaining high test accuracy. Our approach achieves state-of-the-art results on real-world data-constrained problems such as newborn face recognition, as well as on the Omniglot dataset. Notably, on the IIITD Newborn Face Database, our method improves rank-1 identification accuracy by at least 10%.
In our second contribution, we propose Guided Dropout, a novel regularization technique tailored for the DCL problem that enhances the traditional Dropout method often used in deep neural networks to mitigate overfitting. Standard Dropout randomly drops nodes from a Neural Network (NN) during training. In contrast, the proposed Guided Dropout strategically selects nodes to drop, leading to better generalization than its traditional counterpart. We also establish that conventional Dropout is a specific instance of the proposed Guided Dropout. Through extensive experimentation on multiple datasets, including MNIST, CIFAR10, CIFAR100, SVHN, and Tiny ImageNet, we demonstrate the superior performance of Guided Dropout.
Our third contribution addresses challenges in zero-shot and generalized zero-shot learning (ZSL and GZSL), where few or no samples of a class are present in the training set and only class attributes are known. The performance of many supervised DNN algorithms deteriorates in ZSL settings, highlighting the necessity for model generalization while learning the mapping from class to attribute. To address this, we modify the input and feature space across the deep learning pipeline. Furthermore, to ensure robust performance on both seen and unseen classes, we introduce an Over-Complete Distribution (OCD) generated using a Conditional Variational Autoencoder (CVAE). On this generated OCD, the proposed Online Batch Triplet Loss (OBTL) and Center Loss (CL) enhance class separability and reduce intra-class variance, improving performance in ZSL/GZSL scenarios across various benchmark databases.
In our fourth contribution, we focus on the model space, particularly CNN models, which typically require millions of parameters, to address training challenges in data-constrained environments. Reducing the number of parameters may compromise model performance. To address this, we introduce Guided DropBlock and Filter Augmentation for resource-constrained deep learning scenarios. Guided DropBlock is inspired by the Guided Dropout and DropBlock regularization methods. Unlike DropBlock, which randomly omits a contiguous image segment, the proposed approach is more selective, focusing the omission on the background and specific blocks that carry critical semantic information about the objects in question. The proposed filter augmentation technique, in turn, performs a series of operations on the CNN filters during the training phase. Our findings indicate that integrating filter augmentation while fine-tuning the CNN model can substantially enhance performance in data-limited situations. This approach results in a smoother decision boundary and behavior resembling an ensemble model. Imposing these additional constraints on loss optimization helps mitigate the challenges posed by data scarcity, ensuring robust feature extraction from the input signal even when some learnable parameters within the CNN layers are frozen. We have validated these enhancements on seven publicly accessible benchmark datasets and two real-world use cases, namely identifying newborns and monitoring post-cataract surgery conditions, providing empirical support for our claims.
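The two losses named in the third contribution can be illustrated with their generic textbook forms. This is a sketch under stated assumptions, not the thesis's OBTL or CL: the triplet loss below uses batch-hard mining (one common online variant), the class centers are passed in as fixed values rather than learned, and the function names are hypothetical. Each batch is assumed to contain at least two classes.

```python
import numpy as np

def center_loss(features, labels, centers):
    """Half the mean squared distance between each feature and its class center.

    Pulling features toward their centers reduces intra-class variance.
    """
    diffs = features - centers[labels]
    return 0.5 * np.mean(np.sum(diffs ** 2, axis=1))

def batch_hard_triplet_loss(features, labels, margin=1.0):
    """Triplet loss with the hardest positive and negative mined inside the batch.

    Assumption: batch-hard mining; the thesis's online mining rule may differ.
    """
    # Pairwise Euclidean distances between all samples in the batch.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=2))
    same = labels[:, None] == labels[None, :]
    # Hardest positive: the farthest sample of the same class.
    hardest_pos = np.max(np.where(same, dist, 0.0), axis=1)
    # Hardest negative: the closest sample of a different class.
    hardest_neg = np.min(np.where(same, np.inf, dist), axis=1)
    return np.mean(np.maximum(hardest_pos - hardest_neg + margin, 0.0))

# Toy batch: two well-separated classes in a 2-D feature space.
features = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]])
labels = np.array([0, 0, 1, 1])
centers = np.array([[0.0, 0.5], [3.0, 0.5]])

cl = center_loss(features, labels, centers)
tl = batch_hard_triplet_loss(features, labels, margin=1.0)
```

In this toy batch the classes are already separated by more than the margin, so the triplet term vanishes and only the center term remains, which is the regime where the center loss keeps tightening each class.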