IIIT-Delhi Institutional Repository

Predicting deep learning architecture


dc.contributor.author Chauhan, Arushi
dc.contributor.author Vatsa, Mayank (Advisor)
dc.contributor.author Singh, Richa (Advisor)
dc.date.accessioned 2021-05-21T09:39:50Z
dc.date.available 2021-05-21T09:39:50Z
dc.date.issued 2020-06-02
dc.identifier.uri http://repository.iiitd.edu.in/xmlui/handle/123456789/893
dc.description.abstract Convolutional neural networks (CNNs) are extensively used in real-world applications such as image and video classification, natural language processing, medical image analysis, and recommender systems, where classical machine learning algorithms and hand-crafted approaches were previously used. Well-known CNNs include LeNet-5, AlexNet, VGG, GoogLeNet, ResNet, and DenseNet. Over the years, CNN architectures have become successively deeper (using more layers) and more complex with the introduction of new types of layers. These CNNs are designed by experts with rich domain knowledge of both datasets and CNNs, so there is great demand for algorithms that can automatically build the best CNN architecture for a given dataset, reducing the dependence on researchers to hand-craft networks for every new task. Such algorithms should also work with small datasets and use efficient computation techniques to make effective use of the GPU. This project explores the various techniques used in neural architecture search and develops models to predict the performance of CNNs. We propose the Network Epoch Accuracy Prediction Framework (NEAP-F), which predicts the per-epoch accuracy of a sample network on an image dataset, and we release a dataset of network-architecture training curves on the image datasets CIFAR-10 and MNIST. In a setting where fast and efficient computing is the norm, NEAP-F reduces the resources required by neural architecture search systems by eliminating the need to train candidate architectures, which is the major bottleneck, thereby heavily cutting down the computation time and resources needed to evaluate architectures; the current inference time of NEAP-F is on the order of milliseconds. The released dataset augments existing datasets with networks having residual connections to reflect state-of-the-art architectures. Results are reported on a dataset of 50,670 data points for prediction, distributed across three image datasets: CIFAR-10, SVHN and MNIST. en_US
dc.language.iso other en_US
dc.publisher IIIT-Delhi en_US
dc.subject convolutional neural networks, datasets, genetic algorithm, memetic algorithm, cross entropy loss, vector representation, performance prediction, ResNet, architecture search, sequence-to-sequence model, dataset, description, regression models, epoch prediction, image datasets, architecture vector representation, dataset vector representation, "ease of classifying" dataset en_US
dc.title Predicting deep learning architecture en_US
dc.type Other en_US
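
The abstract describes NEAP-F as a predictor that estimates the per-epoch accuracy of a candidate CNN from representations of its architecture and of the target image dataset, so that candidates can be screened in milliseconds rather than trained. The sketch below only illustrates that general idea under assumed details: the feature encodings, the gradient-boosting regressor, and all names (encode_architecture, encode_dataset, build_features) are hypothetical and are not taken from the thesis's actual NEAP-F implementation or its released dataset.

# Minimal illustrative sketch of an epoch-accuracy predictor in the spirit of
# NEAP-F. All feature choices and the model are assumptions for illustration.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def encode_architecture(arch):
    # Encode a CNN architecture as a fixed-length numeric vector.
    # `arch` is a hypothetical dict, e.g. {"depth": 20, "width": 64,
    # "residual": 1, "params_millions": 1.2}.
    return np.array([arch["depth"], arch["width"],
                     arch["residual"], arch["params_millions"]], dtype=float)

def encode_dataset(ds):
    # Encode a dataset descriptor (hypothetical "ease of classifying"-style
    # features such as class count, image size, and training-set size).
    return np.array([ds["num_classes"], ds["image_size"],
                     ds["train_examples"] / 1e4], dtype=float)

def build_features(arch, ds, epoch):
    # Concatenate architecture vector, dataset vector, and epoch index.
    return np.concatenate([encode_architecture(arch),
                           encode_dataset(ds), [float(epoch)]])

# Toy training data: (architecture, dataset, epoch) -> observed accuracy.
# In the thesis this role is played by real training curves; here the
# accuracies are random placeholders just to make the sketch runnable.
rng = np.random.default_rng(0)
X, y = [], []
for _ in range(200):
    arch = {"depth": rng.integers(4, 50), "width": rng.integers(16, 128),
            "residual": rng.integers(0, 2),
            "params_millions": rng.uniform(0.1, 5.0)}
    ds = {"num_classes": 10, "image_size": 32, "train_examples": 50000}
    epoch = rng.integers(1, 30)
    X.append(build_features(arch, ds, epoch))
    y.append(rng.uniform(0.3, 0.95))  # placeholder accuracy values
model = GradientBoostingRegressor().fit(np.array(X), np.array(y))

# Inference takes milliseconds, versus hours of actual training per candidate.
candidate = {"depth": 18, "width": 64, "residual": 1, "params_millions": 1.1}
cifar10 = {"num_classes": 10, "image_size": 32, "train_examples": 50000}
print(model.predict([build_features(candidate, cifar10, epoch=10)]))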

