“selu” Activation Function and 93 Pages of Appendix

A preprint on arXiv recently caught a lot of attention. While deep learning has been successful with various types of neural networks, it had not been so for plain feed-forward neural networks. The authors of this paper proposed normalizing the network with a new activation function, called “selu” (scaled exponential linear unit):

\text{selu}(x) = \lambda \left\{ \begin{array}{ll} x & \text{if } x > 0 \\ \alpha e^x - \alpha & \text{if } x \leq 0 \end{array} \right. ,

which improves on the existing “elu” (exponential linear unit) function.
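To make the definition concrete, here is a minimal sketch in plain Python. The values of α and λ are not given in this post; the ones below are the constants reported in the paper, and the sampling check at the end is only an illustration of the "self-normalizing" idea (standard normal inputs come out with mean close to 0 and variance close to 1), not the authors' code.

```python
import math
import random

# Constants as reported in the Self-Normalizing Neural Networks paper
# (an assumption here: the post itself does not state them).
ALPHA = 1.6732632423543772
LAMBDA = 1.0507009873554805


def selu(x: float) -> float:
    """Scaled exponential linear unit for a scalar input."""
    if x > 0:
        return LAMBDA * x
    return LAMBDA * (ALPHA * math.exp(x) - ALPHA)


def mean_and_variance(values):
    """Return the sample mean and (population) variance of a list."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var


# Illustration: push standard normal samples through selu and
# look at the mean and variance of the outputs.
rng = random.Random(0)
samples = [selu(rng.gauss(0.0, 1.0)) for _ in range(200_000)]
mean, var = mean_and_variance(samples)
print(f"mean={mean:.3f}, variance={var:.3f}")
```

For large negative inputs the function saturates at -λα, which is roughly -1.758 with these constants.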

Despite this achievement, what caught people's eyes was not the activation function but the 93-page appendix of mathematical proofs:

[Image: lol]

And this is one of the pages in the appendix:

[Image: a page of the long proof in the appendix]

Some scholars poked fun at it on Twitter too.

  • Günter Klambauer, Thomas Unterthiner, Andreas Mayr, Sepp Hochreiter, “Self-Normalizing Neural Networks,” arXiv:1706.02515 (2017).
  • GitHub: bioinf-jku/SNNs.