“selu” Activation Function and 93 Pages of Appendix

A preprint on arXiv recently caught a lot of attentions. While deep learning is successful in various types of neural networks, it had not been so for feed-forward neural networks. The authors of this paper proposed normalizing the network with a new activation function, called “selu” (scaled exponential linear units): . which is an improvement … More “selu” Activation Function and 93 Pages of Appendix

ConvNet Seq2seq for Machine Translation

In these few days, Facebook published a new research paper, regarding the use of sequence to sequence (seq2seq) model for machine translation. What is special about this seq2seq model is that it uses convolutional neural networks (ConvNet, or CNN), instead of recurrent neural networks (RNN). The original seq2seq model is implemented with Long Short-Term Memory … More ConvNet Seq2seq for Machine Translation

On Wasserstein GAN

A few weeks ago, I introduced the generative model called generative adversarial networks (GAN), and stated the difficulties of training it. Not long after the post, a group of scientists from Facebook and Courant introduced Wasserstein GAN, which uses Wasserstein distance, or the Earth Mover (EM) distance, instead of Jensen-Shannon (JS) divergence as the final … More On Wasserstein GAN

Generative Adversarial Networks

Recently I have been drawn to generative models, such as LDA (latent Dirichlet allocation) and other topic models.┬áIn deep learning, there are a few examples, such as FVBN (fully visible belief networks), VAE (variational autoencoder), RBM (restricted Boltzmann machine) etc. Recently I have been reading about GAN (generative adversarial networks), first published by Ian Goodfellow … More Generative Adversarial Networks

Tensor Networks and Density Matrix Renormalization Group

A while ago, Mehta and Schwab drew a connection between Restricted Boltzmann Machine (RBM), a type of deep learning algorithm, and renormalization group (RG), a theoretical tool in physics applied on critical phenomena. [Mehta & Schwab, 2014; see previous entry] Can RG be able to relate to other deep leaning algorithms? Schwab wrote a paper … More Tensor Networks and Density Matrix Renormalization Group