Abstract
Activation functions play an important role in determining the depth and non-linearity of deep learning models. Since the Rectified Linear Unit (ReLU) was introduced, many modifications, in which noise is intentionally injected, have been proposed to avoid overfitting. The Exponential Linear Unit (ELU) and its variants with trainable parameters have been proposed to reduce the bias shift effect often observed in ReLU-type activation functions. In this paper, we propose a novel activation function, called the Elastic Exponential Linear Unit (EELU), which combines the advantages of both types of activation functions in a generalized form. EELU has an elastic slope in the positive part and preserves the negative signal through a small non-zero gradient. We also present a new strategy for injecting neuronal noise into the activation function using a Gaussian distribution to improve generalization. By visualizing the latent features of convolutional neural networks, we demonstrate that EELU, with its random noise, can represent a wider variety of features than other activation functions. We evaluated the effectiveness of the EELU approach through extensive image classification experiments on the CIFAR-10/CIFAR-100, ImageNet, and Tiny ImageNet datasets. Our experimental results show that EELU achieved better generalization performance and higher classification accuracy than conventional activation functions, such as ReLU, ELU, ReLU- and ELU-like variants, Scaled ELU, and Swish. EELU also improved image classification with a smaller number of training samples, owing to its noise injection strategy, which allows significant variation in function outputs, including deactivation.
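To make the described behavior concrete, the following is a minimal sketch of an EELU-style activation, not the authors' reference implementation: it assumes the elastic positive slope is obtained by perturbing a unit slope with Gaussian noise during training, and that the negative part follows an ELU-style exponential; the hyperparameter names alpha, beta, and sigma are illustrative choices, not taken from the paper.

```python
import torch
import torch.nn as nn


class EELUSketch(nn.Module):
    """Illustrative EELU-like activation (assumed form, not the paper's reference code).

    Positive part: x is scaled by an "elastic" slope k drawn from a Gaussian
    centered at 1 during training; at evaluation time k = 1.
    Negative part: ELU-style exponential, alpha * (exp(beta * x) - 1),
    which keeps a small non-zero gradient for negative inputs.
    """

    def __init__(self, alpha: float = 1.0, beta: float = 1.0, sigma: float = 0.1):
        super().__init__()
        self.alpha = alpha  # scale of the negative saturation (illustrative)
        self.beta = beta    # steepness of the negative exponential (illustrative)
        self.sigma = sigma  # std. dev. of the Gaussian slope noise (illustrative)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training:
            # Per-element Gaussian noise on the positive slope, clamped so the slope stays non-negative.
            k = torch.clamp(1.0 + self.sigma * torch.randn_like(x), min=0.0)
        else:
            k = torch.ones_like(x)
        pos = k * x
        # Clamp the argument so exp() is only effectively evaluated on the negative side.
        neg = self.alpha * torch.expm1(self.beta * torch.clamp(x, max=0.0))
        return torch.where(x >= 0, pos, neg)


if __name__ == "__main__":
    act = EELUSketch()
    act.train()  # enable the stochastic slope
    x = torch.randn(4, 8)
    print(act(x).shape)  # torch.Size([4, 8])
```

In this sketch the noise is sampled per element and per forward pass, so the same input can yield different activations across training steps, which is one plausible reading of the output variation (including deactivation) described above.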