Abstract:
Artificial neural networks (ANNs) are central to deep learning, and activation functions supply the non-linearity they require. An ideal activation function should generalize well across datasets, speed up convergence, and improve network performance. ReLU remains the most popular choice, but its non-smooth behaviour and other drawbacks have motivated alternatives such as Leaky ReLU, ELU, Softplus, Parametric ReLU, and ReLU6, which offer only marginal improvements. More recently, smooth activations such as Swish, GELU, PAU, and Mish have demonstrated significant gains over ReLU. Nevertheless, ReLU's non-differentiability at the origin remains a concern for gradient-based backpropagation. This study introduces a novel activation function, constructed through both hand-engineered and mathematical approaches as a smooth approximation of non-smooth functions such as ReLU; it consistently outperforms ReLU and its variants on standard datasets, including CIFAR-10, CIFAR-100, and MNIST. The function's versatility is further validated on image classification, object detection, semantic segmentation, and machine translation tasks. The poster also presents two emerging activation functions, offering insights into their design and potential applications. This research contributes practical tools for improving the efficiency of deep learning models across diverse domains.
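To illustrate the general idea of smoothly approximating ReLU (the specific function proposed in this work is defined in the main text, not here), the classical scaled-softplus family is a minimal sketch: it is infinitely differentiable everywhere and converges to ReLU as the scale parameter grows. The snippet below, assuming only NumPy, shows this convergence numerically.

```python
import numpy as np

def relu(x):
    """Non-smooth baseline: max(x, 0), not differentiable at x = 0."""
    return np.maximum(x, 0.0)

def smooth_relu(x, beta=10.0):
    """Scaled softplus, (1/beta) * log(1 + exp(beta * x)).

    A standard smooth approximation of ReLU, shown here purely for
    illustration; it is NOT the activation proposed in this work.
    The gap to ReLU is largest at x = 0, where it equals log(2)/beta.
    """
    # logaddexp(0, z) computes log(1 + exp(z)) in a numerically stable way.
    return np.logaddexp(0.0, beta * x) / beta

if __name__ == "__main__":
    x = np.linspace(-3.0, 3.0, 601)
    for beta in (1.0, 5.0, 50.0):
        gap = np.max(np.abs(smooth_relu(x, beta) - relu(x)))
        print(f"beta={beta:5.1f}  max |softplus_beta - ReLU| = {gap:.4f}")
```

As beta increases, the maximum deviation from ReLU shrinks toward zero while the function stays smooth at the origin, which is the property that motivates smooth ReLU approximations in backpropagation.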