Word2Vec has been popular in the NLP world for a while, as it is an effective method for learning word embeddings, or word representations. Its use of the skip-gram model and neural networks made a big impact too. It has been my favorite toy indeed. However, even though words are correlated across a small segment of text, Word2Vec captures only this local coherence. Topic models such as latent Dirichlet allocation (LDA), on the other hand, capture the distribution of words within a topic and that of topics within a document, and they can represent a new document in terms of its topics.
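To make the last point concrete, here is a minimal sketch of representing a new document as a topic mixture. It uses scikit-learn's LatentDirichletAllocation with a made-up toy corpus, not anything from the post itself:

```python
# Toy sketch: LDA represents an unseen document as a distribution over topics.
# The corpus and topic count are illustrative assumptions, not from the post.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "cats and dogs are pets",
    "dogs chase cats",
    "stocks and bonds are investments",
    "investors trade stocks",
]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)              # document-term count matrix

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X)

# An unseen document becomes a 2-dimensional topic-proportion vector.
new_doc = vectorizer.transform(["dogs and cats"])
doc_topics = lda.transform(new_doc)             # shape (1, 2), row sums to 1
print(doc_topics)
```

The resulting topic-proportion vector is the document representation that LDA offers and Word2Vec alone does not.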
In my previous blog entry, I introduced Chris Moody’s LDA2Vec algorithm (see: his SlideShare). Unfortunately, despite its potential, few papers or blogs have covered this new algorithm. The API is not yet completely documented, although you can find examples in the source code on its GitHub repository. In its documentation, in the lda2vec/lda2vec.py code, it gives an example of deriving topics from an array of random numbers:
```python
import numpy as np  # import added; the original snippet assumes np
from lda2vec import LDA2Vec

n_words = 10
n_docs = 15
n_hidden = 8
n_topics = 2
n_obs = 300

words = np.random.randint(n_words, size=(n_obs))
_, counts = np.unique(words, return_counts=True)

model = LDA2Vec(n_words, n_hidden, counts)
model.add_categorical_feature(n_docs, n_topics, name='document id')
model.finalize()

doc_ids = np.arange(n_obs) % n_docs
loss = model.fit_partial(words, 1.0, categorical_features=doc_ids)
```
A more comprehensive example can be found in examples/twenty_newsgroup/lda.py.
Besides LDA2Vec, there has been related research on topical word embeddings. A group of Australian and American scientists studied topic modeling with pre-trained Word2Vec (or GloVe) embeddings before performing LDA. (See: their paper and code) On the other hand, a group of Chinese and Singaporean scientists performed LDA first and then trained a Word2Vec model. (See: their paper and code) LDA2Vec itself concatenates the Word2Vec and LDA representations, like an early fusion.
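The "early fusion" idea can be sketched in a few lines of numpy: simply concatenate a word's embedding with a topic-distribution vector into one longer feature vector. The vectors and dimensions below are made up for illustration; this is not LDA2Vec's actual training code:

```python
# Toy "early fusion" sketch: concatenating a Word2Vec-style embedding with an
# LDA-style topic distribution. All numbers are hypothetical placeholders.
import numpy as np

word_vec = np.array([0.2, -0.1, 0.5, 0.3])     # hypothetical 4-dim word embedding
topic_vec = np.array([0.7, 0.3])               # hypothetical 2-topic distribution

fused = np.concatenate([word_vec, topic_vec])  # 6-dim fused representation
print(fused.shape)
```

Downstream models then consume the fused vector, so both local (embedding) and global (topic) information is available in a single representation.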
Regardless, representations from LDA models (or related topic models such as correlated topic models (CTM)) can be useful even outside NLP. Lately I have found them useful as an intermediate layer of calculation.
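One hedged sketch of what "an intermediate layer of calculation" might look like: topic proportions from LDA feeding a downstream classifier in a scikit-learn pipeline. The corpus, labels, and pipeline here are my own toy assumptions, not the author's actual use case:

```python
# Toy sketch: LDA topic proportions as intermediate features for a classifier.
# Data and labels are fabricated for illustration only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

docs = ["cats purr", "dogs bark", "stocks rise", "bonds fall"]
labels = [0, 0, 1, 1]  # toy labels: 0 = animals, 1 = finance

pipe = make_pipeline(
    CountVectorizer(),
    LatentDirichletAllocation(n_components=2, random_state=0),
    LogisticRegression(),
)
pipe.fit(docs, labels)
pred = pipe.predict(["dogs purr"])
```

The classifier never sees raw word counts, only the low-dimensional topic proportions, which is one way a topic model can serve as an intermediate representation.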