IIIT-Delhi Institutional Repository

Predicting deep learning architecture


dc.contributor.author Chauhan, Arushi
dc.contributor.author Vatsa, Mayank (Advisor)
dc.contributor.author Singh, Richa (Advisor)
dc.date.accessioned 2021-05-21T09:39:50Z
dc.date.available 2021-05-21T09:39:50Z
dc.date.issued 2020-06-02
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/893
dc.description.abstract Convolutional neural networks (CNNs) are extensively used in real-world applications such as image and video classification, natural language processing, medical image analysis, and recommender systems, where classical machine learning algorithms and hand-crafted approaches were previously used. Well-known CNNs include LeNet-5, AlexNet, VGG, GoogLeNet, ResNet, and DenseNet. Over the years, CNN architectures have become successively deeper (using more layers) and more complex with the introduction of new types of layers. These CNNs are designed by experts with rich domain knowledge of both datasets and CNNs, so there is great demand for algorithms that can automatically build the best CNN architecture for a given dataset, reducing the dependence on researchers to hand-craft networks for every new task. Such algorithms should also work with small datasets and use efficient computation techniques to make effective use of the GPU. This project explores the various techniques used in neural architecture search and develops models to predict the performance of CNNs. We propose the Network Epoch Accuracy Prediction Framework (NEAP-F), which predicts the per-epoch accuracy of a sample network on an image dataset, and we release a dataset of network-architecture training curves on the image datasets CIFAR-10 and MNIST. In a setting where fast and efficient computing is the norm, NEAP-F reduces the resources required by neural architecture search systems by eliminating the need to train candidate architectures, which is the major bottleneck, thereby heavily cutting down the computation time and resources needed to evaluate architectures; the current inference time of NEAP-F is on the order of milliseconds. The released dataset augments existing datasets with networks having residual connections to reflect state-of-the-art architectures. Results are reported on a dataset of 50,670 data points for prediction, distributed across three image datasets: CIFAR-10, SVHN and MNIST. en_US
dc.language.iso other en_US
dc.publisher IIIT-Delhi en_US
dc.subject convolutional neural networks, datasets, genetic algorithm, memetic algorithm, cross entropy loss, vector representation, performance prediction, ResNet, architecture search, sequence-to-sequence model, dataset, description, regression models, epoch prediction, image datasets, architecture vector representation, dataset vector representation, "ease of classifying" dataset en_US
dc.title Predicting deep learning architecture en_US
dc.type Other en_US
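
The abstract describes NEAP-F as a predictor that estimates the per-epoch accuracy of a candidate CNN from representations of its architecture and of the target image dataset, so that candidates can be screened in milliseconds rather than trained. The sketch below only illustrates that general idea under assumed details: the feature encodings, the gradient-boosting regressor, and all names (encode_architecture, encode_dataset, build_features) are hypothetical and are not taken from the thesis's actual NEAP-F implementation or its released dataset.

# Minimal illustrative sketch of an epoch-accuracy predictor in the spirit of
# NEAP-F. All feature choices and the model are assumptions for illustration.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def encode_architecture(arch):
    # Encode a CNN architecture as a fixed-length numeric vector.
    # `arch` is a hypothetical dict, e.g. {"depth": 20, "width": 64,
    # "residual": 1, "params_millions": 1.2}.
    return np.array([arch["depth"], arch["width"],
                     arch["residual"], arch["params_millions"]], dtype=float)

def encode_dataset(ds):
    # Encode a dataset descriptor (hypothetical "ease of classifying"-style
    # features such as class count, image size, and training-set size).
    return np.array([ds["num_classes"], ds["image_size"],
                     ds["train_examples"] / 1e4], dtype=float)

def build_features(arch, ds, epoch):
    # Concatenate architecture vector, dataset vector, and epoch index.
    return np.concatenate([encode_architecture(arch),
                           encode_dataset(ds), [float(epoch)]])

# Toy training data: (architecture, dataset, epoch) -> observed accuracy.
# In the thesis this role is played by real training curves; here the
# accuracies are random placeholders just to make the sketch runnable.
rng = np.random.default_rng(0)
X, y = [], []
for _ in range(200):
    arch = {"depth": rng.integers(4, 50), "width": rng.integers(16, 128),
            "residual": rng.integers(0, 2),
            "params_millions": rng.uniform(0.1, 5.0)}
    ds = {"num_classes": 10, "image_size": 32, "train_examples": 50000}
    epoch = rng.integers(1, 30)
    X.append(build_features(arch, ds, epoch))
    y.append(rng.uniform(0.3, 0.95))  # placeholder accuracy values
model = GradientBoostingRegressor().fit(np.array(X), np.array(y))

# Inference takes milliseconds, versus hours of actual training per candidate.
candidate = {"depth": 18, "width": 64, "residual": 1, "params_millions": 1.1}
cifar10 = {"num_classes": 10, "image_size": 32, "train_examples": 50000}
print(model.predict([build_features(candidate, cifar10, epoch=10)]))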

