ARiA: Utilizing Richard's Curve for Controlling the Non-monotonicity of
the Activation Function in Deep Neural Nets
by
Narendra Patwardhan, Madhura Ingalhalikar, Rahee Walambe
2018
Abstract
This work introduces a novel activation unit that can be efficiently employed
in deep neural nets (DNNs) and performs significantly better than the
traditional Rectified Linear Unit (ReLU). The function developed is a two
parameter version of the specialized Richard's Curve, which we call Adaptive
Richard's Curve weighted Activation (ARiA). This function is non-monotonic,
analogous to the newly introduced Swish; however, it allows precise control
over its non-monotonic convexity by varying the hyper-parameters. We first
demonstrate the mathematical significance of the two parameter ARiA, followed by
its application to benchmark problems such as MNIST, CIFAR-10, and CIFAR-100,
where we compare its performance with ReLU and Swish units. Our results
illustrate significantly superior performance on all these datasets, making
ARiA a potential replacement for ReLU and other activations in DNNs.
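A minimal sketch of the two-parameter activation described above, assuming the form ARiA-2(x; alpha, beta) = x * sigmoid(beta * x)^alpha, i.e. the input gated by a specialized Richard's curve. The parameter names alpha and beta, their default values, and the NumPy implementation are illustrative, not taken verbatim from the paper:

```python
import numpy as np

def aria2(x, alpha=1.5, beta=2.0):
    """Sketch of the two-parameter ARiA activation:
    ARiA2(x; alpha, beta) = x * sigmoid(beta * x) ** alpha.
    With alpha = beta = 1 this reduces to Swish-1, x * sigmoid(x);
    varying alpha and beta reshapes the non-monotonic dip that the
    function exhibits for negative inputs."""
    sigma = 1.0 / (1.0 + np.exp(-beta * np.asarray(x, dtype=float)))
    return x * sigma ** alpha
```

With alpha = beta = 1 the function coincides with Swish, which is how the abstract's "analogous to Swish" relationship can be checked numerically; for negative inputs the output dips below zero before decaying toward it, unlike ReLU, which clips to exactly zero.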
Archived Files and Locations
application/pdf, 1.1 MB
arxiv.org (repository); web.archive.org (webarchive)
arXiv: 1805.08878v1