Recently I have been drawn to generative models, such as LDA (latent Dirichlet allocation) and other topic models. Deep learning has a few examples as well, such as FVBN (fully visible belief networks), VAE (variational autoencoders), and RBM (restricted Boltzmann machines). Lately I have been reading about GAN (generative adversarial networks), first published by Ian Goodfellow and his collaborators in 2014. Goodfellow also recently posted a write-up of his NIPS 2016 tutorial on arXiv.
In a GAN, there are two important functions: the discriminator (D) and the generator (G). The training data, all labeled positive, define the distribution that the generator is trained to reproduce. The discriminator learns to separate data with positive labels (real) from data with negative labels (generated). The generator, in turn, produces data from noise, which should be labeled negative, and tries to fool the discriminator into classifying them as positive. This process repeats iteratively; eventually the generator produces data whose distribution is close to that of the training data, and the discriminator, unable to tell the two apart, classifies generated data as positive with probability 1/2. The intuition behind this competition comes from the minimax game in game theory. The formal algorithm is given in the original paper.
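To make the alternating updates concrete, here is a minimal NumPy sketch of that loop on a one-dimensional toy problem. It is not the paper's implementation: the linear generator G(z) = a*z + b, the logistic discriminator D(x) = sigmoid(w*x + c), the "real" data drawn from N(4, 1), and all hyperparameters are assumptions chosen only for illustration. It follows the usual scheme of k gradient-ascent steps for D on the value function, then one gradient-descent step for G.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def value(d_params, g_params, x, z):
    """Minibatch estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]."""
    w, c = d_params
    a, b = g_params
    d_real = sigmoid(w * x + c)
    d_fake = sigmoid(w * (a * z + b) + c)
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

def d_grad(d_params, g_params, x, z):
    """Gradient of the value estimate with respect to D's parameters (w, c)."""
    w, c = d_params
    a, b = g_params
    g = a * z + b
    d_real = sigmoid(w * x + c)
    d_fake = sigmoid(w * g + c)
    # d/dt log(sigmoid(t)) = 1 - sigmoid(t); d/dt log(1 - sigmoid(t)) = -sigmoid(t)
    return np.array([np.mean((1.0 - d_real) * x) - np.mean(d_fake * g),
                     np.mean(1.0 - d_real) - np.mean(d_fake)])

def g_grad(d_params, g_params, z):
    """Gradient of E[log(1 - D(G(z)))] with respect to G's parameters (a, b)."""
    w, c = d_params
    a, b = g_params
    d_fake = sigmoid(w * (a * z + b) + c)
    return np.array([np.mean(-d_fake * w * z), np.mean(-d_fake * w)])

def train(steps=500, k=1, batch=64, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    d_params = np.array([0.1, 0.0])  # discriminator D(x) = sigmoid(w*x + c)
    g_params = np.array([1.0, 0.0])  # generator G(z) = a*z + b
    for _ in range(steps):
        for _ in range(k):
            # Discriminator: gradient ascent on V with fresh real and fake batches.
            x = rng.normal(4.0, 1.0, batch)  # toy "training data" (assumption)
            z = rng.normal(0.0, 1.0, batch)  # noise prior
            d_params = d_params + lr * d_grad(d_params, g_params, x, z)
        # Generator: gradient descent on E[log(1 - D(G(z)))].
        z = rng.normal(0.0, 1.0, batch)
        g_params = g_params - lr * g_grad(d_params, g_params, z)
    return d_params, g_params
```

Calling `train()` returns the final `(w, c)` and `(a, b)`. Because D here is only a logistic regression on x, it can at best push the generator's mean toward the data mean, and the trajectory tends to oscillate rather than settle cleanly; real GANs replace both functions with neural networks.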
The original paper showed that the model is optimal when the distribution of the generated data is identical to that of the training data, with an argument based on the Jensen-Shannon (JS) divergence. Ferenc Huszár discussed on his blog the relations between maximum likelihood, Kullback-Leibler (KL) divergence, and JS divergence.
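The JS argument can be checked numerically on small discrete distributions. The sketch below assumes the optimal discriminator for a fixed generator, D*(x) = p_data(x) / (p_data(x) + p_g(x)), and verifies that the value function at D* equals -log 4 + 2 JS(p_data, p_g), so it is minimized exactly when the two distributions coincide. The function names and the example distributions are made up for illustration.

```python
import numpy as np

def kl(p, q):
    """KL(p || q) for discrete distributions with strictly positive entries."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

def js(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture m."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def value_at_optimal_d(p_data, p_g):
    """V(D*, G) with the optimal discriminator D* = p_data / (p_data + p_g)."""
    p_data, p_g = np.asarray(p_data, dtype=float), np.asarray(p_g, dtype=float)
    d_star = p_data / (p_data + p_g)
    return float(np.sum(p_data * np.log(d_star)) + np.sum(p_g * np.log(1.0 - d_star)))

p_data = [0.1, 0.4, 0.5]  # hypothetical data distribution
p_g = [0.3, 0.3, 0.4]     # hypothetical generator distribution

# The two sides of V(D*, G) = -log 4 + 2 * JS(p_data, p_g) should agree.
lhs = value_at_optimal_d(p_data, p_g)
rhs = -np.log(4.0) + 2.0 * js(p_data, p_g)
```

Unlike KL, which is asymmetric (`kl(p_data, p_g)` and `kl(p_g, p_data)` differ), JS is symmetric and vanishes only when the distributions match, at which point D* is 1/2 everywhere.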
At the Data Science DC meetup introducing GANs, I also asked the speaker, Jennifer Sleeman, a few questions about the concepts of GAN.
GAN is not yet a very mature framework, but it has already found a few industrial uses. Its descendants include LapGAN (Laplacian GAN) and DCGAN (deep convolutional GAN). Applications include voice generation, image super-resolution, pix2pix (image-to-image translation), text-to-image synthesis, and iGAN (interactive GAN).
“Adversarial training is the coolest thing since sliced bread.” – Yann LeCun
- Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio, “Generative Adversarial Networks,” arXiv:1406.2661 (2014). [arXiv]
- Ian Goodfellow, “NIPS 2016 Tutorial: Generative Adversarial Networks,” arXiv:1701.00160 (2017). [arXiv]
- Ferenc Huszár, “How to Train your Generative Models? And why does Adversarial Training work so well?” inFERENCe (November 2015). [inFERENCe]
- “Generative Adversarial Networks, An Introduction,” Data Science DC. [Meetup] Jennifer Sleeman's presentation material can be found in this GitHub repository: jennsleeman/introtogans_dcdatascience_2017.
- Emily Denton, Soumith Chintala, Arthur Szlam, Rob Fergus, “Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks,” arXiv:1506.05751 (2015). [arXiv]
- Alec Radford, Luke Metz, Soumith Chintala, “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks,” arXiv:1511.06434 (2015). [arXiv]
- Kwan-Yuet Ho, “Generative-Discriminative Pair,” Everything About Data Analytics, WordPress (2016). [WordPress]