dc.description.abstract |
Artificial neural networks (ANNs) have occupied centre stage in deep learning. An activation function is a crucial component of a neural network, as it introduces non-linearity into the network. An activation function is considered good if it generalises well on a variety of datasets, ensures faster convergence, and improves neural network performance. The Rectified Linear Unit (ReLU) has emerged as the most popular activation function due to its simplicity, although it has some drawbacks. To overcome the shortcomings of ReLU (non-smoothness at the origin, non-zero mean, and discarding of negative inputs, to name a few), and to increase accuracy considerably on a variety of tasks, many new activation functions have been proposed over the years, such as Leaky ReLU, ELU, Softplus, Parametric ReLU, and ReLU6. However, all of them provide only marginal improvements over ReLU. Swish, GELU, the Padé Activation Unit (PAU), and Mish are smooth non-linear activations proposed recently that show good improvements over ReLU in a variety of deep learning tasks. ReLU and its variants are non-smooth (continuous but not differentiable) at the origin, even though smoothness is an important property during backpropagation. We construct several smooth activation functions that approximate ReLU, Leaky ReLU, or their variants by smooth functions. Some of these functions are hand-engineered, while others arise from underlying mathematical theory. All of these functions show good improvements over ReLU or Swish on a variety of standard datasets across different deep learning problems, such as image classification, object detection, semantic segmentation, and machine translation. |
en_US |
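The abstract does not spell out the specific constructions proposed in the work. As an illustration of the general idea of replacing ReLU's non-differentiable kink at the origin with a smooth approximation, the following minimal NumPy sketch uses a sharpened Softplus, which converges to ReLU as its sharpness parameter grows, together with an analogous smoothed Leaky ReLU. The names `smooth_relu`, `smooth_leaky_relu`, and the parameter `beta` are hypothetical and chosen for this sketch only; they are not the functions proposed in the thesis.

```python
import numpy as np

def relu(x):
    """Standard ReLU: max(0, x); continuous but not differentiable at 0."""
    return np.maximum(0.0, x)

def smooth_relu(x, beta=10.0):
    """Illustrative smooth approximation of ReLU (sharpened Softplus).

    (1/beta) * log(1 + exp(beta * x)) tends to max(0, x) as beta -> infinity,
    but is infinitely differentiable everywhere for finite beta.
    `beta` (hypothetical parameter) controls the sharpness of the knee.
    """
    # Numerically stable form of (1/beta) * log(1 + exp(beta * x)).
    return np.maximum(x, 0.0) + np.log1p(np.exp(-beta * np.abs(x))) / beta

def smooth_leaky_relu(x, alpha=0.1, beta=10.0):
    """Analogous smooth approximation of Leaky ReLU:
    alpha * x + (1 - alpha) * smooth_relu(x, beta)
    tends to max(alpha * x, x) as beta -> infinity."""
    return alpha * x + (1.0 - alpha) * smooth_relu(x, beta)

if __name__ == "__main__":
    xs = np.linspace(-2.0, 2.0, 9)
    print(np.round(relu(xs), 3))
    print(np.round(smooth_relu(xs, beta=10.0), 3))      # close to ReLU, but smooth at 0
    print(np.round(smooth_leaky_relu(xs), 3))           # close to Leaky ReLU, but smooth at 0
```

Increasing `beta` trades off smoothness against fidelity to the original piecewise-linear function; this is only one standard smoothing recipe, whereas the work itself constructs several such approximations, some hand-engineered and some derived from mathematical theory.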