Word Mover’s Distance as a Linear Programming Problem

Much about the use of word-embedding models such as Word2Vec and GloVe has been covered. But how do we measure the similarity between phrases or documents? One natural choice is the cosine similarity, as I toyed with in a previous post. However, it smooths out the influence of each individual word. Two years ago, a group … More Word Mover’s Distance as a Linear Programming Problem
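To make the linear-programming formulation concrete, here is a minimal sketch (my own illustration, not code from the post) that casts WMD as a transportation problem solved with SciPy's linprog; the function name and argument layout are my assumptions:

    import numpy as np
    from scipy.optimize import linprog

    def word_movers_distance(emb1, w1, emb2, w2):
        """WMD as a transportation linear program.

        emb1: (m, d) word vectors of document 1; w1: (m,) normalized weights.
        emb2: (n, d) word vectors of document 2; w2: (n,) normalized weights.
        """
        m, n = len(w1), len(w2)
        # cost c_ij: Euclidean distance between word vectors i and j
        cost = np.linalg.norm(emb1[:, None, :] - emb2[None, :, :], axis=2)
        # decision variables: flow matrix T_ij, flattened row-major
        A_eq = np.zeros((m + n, m * n))
        for i in range(m):
            A_eq[i, i * n:(i + 1) * n] = 1.0   # sum_j T_ij = w1_i
        for j in range(n):
            A_eq[m + j, j::n] = 1.0            # sum_i T_ij = w2_j
        b_eq = np.concatenate([w1, w2])
        res = linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
        return res.fun

One of the equality constraints is redundant, since both weight vectors sum to one, but SciPy's solvers handle this without trouble.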

Short Text Mining using Advanced Keras Layers and Maxent: shorttext 0.4.1

On 07/28/2017, shorttext released version 0.4.1, with a few important updates. To install it, type the following on the OS X / Linux command line: pip install -U shorttext The documentation on PythonHosted.org has been abandoned; it has been migrated to readthedocs.org. (URL: http://shorttext.readthedocs.io/ or http://shorttext.rtfd.io) Exploiting the Word-Embedding Layer This update is mainly due … More Short Text Mining using Advanced Keras Layers and Maxent: shorttext 0.4.1
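For flavor, this is what a short-text classifier built on a trainable word-embedding layer looks like in Keras. This is a generic sketch of the idea, not shorttext's actual API, and every size below is an assumed value:

    import tensorflow as tf

    VOCAB_SIZE = 10000    # assumed vocabulary size
    EMBED_DIM = 100       # assumed embedding dimension
    NUM_CLASSES = 5       # assumed number of labels

    model = tf.keras.Sequential([
        # trainable word-embedding layer: token ids -> dense vectors
        tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),
        tf.keras.layers.Conv1D(64, 3, activation="relu"),
        tf.keras.layers.GlobalMaxPooling1D(),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])

The embedding weights can also be initialized from a pre-trained Word2Vec or GloVe model and then fine-tuned along with the rest of the network.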

“selu” Activation Function and 93 Pages of Appendix

A preprint on arXiv recently caught a lot of attention. While deep learning has been successful for various types of neural networks, it had not been so for plain feed-forward neural networks. The authors of this paper proposed self-normalizing the network with a new activation function, called “selu” (scaled exponential linear units): $\mathrm{selu}(x) = \lambda x$ for $x > 0$ and $\mathrm{selu}(x) = \lambda \alpha (e^{x} - 1)$ for $x \le 0$, which is an improvement … More “selu” Activation Function and 93 Pages of Appendix
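In code, selu is nearly a one-liner. Here is a NumPy sketch using the constants reported in the paper, $\lambda \approx 1.0507$ and $\alpha \approx 1.6733$ (derived in that 93-page appendix):

    import numpy as np

    # constants from the paper, rounded; they are derived in the appendix
    LAMBDA = 1.0507
    ALPHA = 1.6733

    def selu(x):
        """Scaled exponential linear unit, applied elementwise."""
        x = np.asarray(x, dtype=float)
        # np.expm1 computes e^x - 1 accurately for small x
        return LAMBDA * np.where(x > 0, x, ALPHA * np.expm1(x))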

Sammon Embedding with Tensorflow

Embedding algorithms, especially word-embedding algorithms, have been a recurrent theme of this blog. Word2Vec has been mentioned in a few entries (see this); LDA2Vec has been covered (see this); the mathematical principle of GloVe has been elaborated (see this); I haven’t even covered Facebook’s fasttext; and I have not explained the widely used … More Sammon Embedding with Tensorflow
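As a taste of what such an implementation involves, here is my own minimal sketch, written against modern TensorFlow 2 eager execution rather than the post's code, that minimizes the Sammon stress E = (Σ_{i<j} (d*_ij − d_ij)² / d*_ij) / Σ_{i<j} d*_ij by gradient descent, where d*_ij are pairwise distances in the original space and d_ij in the embedded space:

    import numpy as np
    import tensorflow as tf

    def sammon_embedding(X, dim=2, steps=500, lr=0.1, eps=1e-9):
        """Gradient-descent Sammon mapping of the rows of X into `dim` dimensions."""
        X = np.asarray(X, dtype=np.float32)
        n = X.shape[0]
        # pairwise distances in the original space, upper triangle only
        dstar = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
        idx = np.stack(np.triu_indices(n, k=1), axis=1)
        dstar_flat = tf.constant(dstar[idx[:, 0], idx[:, 1]])
        Y = tf.Variable(tf.random.normal([n, dim]))
        opt = tf.keras.optimizers.Adam(lr)
        for _ in range(steps):
            with tf.GradientTape() as tape:
                diff = Y[:, None, :] - Y[None, :, :]
                d = tf.sqrt(tf.reduce_sum(diff ** 2, axis=2) + eps)
                d_flat = tf.gather_nd(d, idx)
                # Sammon stress: mismatch between the two sets of distances,
                # weighted to emphasize small original distances
                stress = tf.reduce_sum(
                    (dstar_flat - d_flat) ** 2 / (dstar_flat + eps)
                ) / tf.reduce_sum(dstar_flat)
            grads = tape.gradient(stress, [Y])
            opt.apply_gradients(zip(grads, [Y]))
        return Y.numpy()

Unlike PCA, the mapping is defined only for the points it was fitted on, which is the price paid for preserving pairwise distances rather than variance.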

ConvNet Seq2seq for Machine Translation

In the past few days, Facebook published a new research paper on the use of the sequence-to-sequence (seq2seq) model for machine translation. What is special about this seq2seq model is that it uses convolutional neural networks (ConvNet, or CNN) instead of recurrent neural networks (RNN). The original seq2seq model is implemented with Long Short-Term Memory … More ConvNet Seq2seq for Machine Translation
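The core building block of this architecture, the gated 1-D convolution, is easy to sketch. The following is my own minimal illustration in TensorFlow (not Facebook's fairseq code): each convolution emits twice the channels, and half of them gate the other half through a gated linear unit, A ⊗ σ(B):

    import tensorflow as tf

    def glu_conv_block(x, channels, kernel_size=3):
        """One gated 1-D convolutional block, as used in convolutional seq2seq.

        x: tensor of shape (batch, time, channels)
        """
        # the convolution emits 2*channels feature maps ...
        h = tf.keras.layers.Conv1D(2 * channels, kernel_size, padding="same")(x)
        # ... half of which gate the other half (gated linear unit)
        a, b = tf.split(h, 2, axis=-1)
        return a * tf.sigmoid(b)

    # hypothetical usage: stack a few blocks over embedded source tokens
    inputs = tf.keras.Input(shape=(None, 128))
    h = inputs
    for _ in range(3):
        h = h + glu_conv_block(h, 128)   # residual connection, as in the paper
    encoder = tf.keras.Model(inputs, h)

Because the convolutions over all time steps are independent, the whole stack can be computed in parallel, which is the key speed advantage over an RNN encoder; the decoder side additionally needs causal padding so it cannot look ahead.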