2. I usually just use the default dropout or regularization rate of 0.0, but you can set it: https://github.com/stephenhky/PyShortTextCategorization/blob/master/shorttext/classifiers/embed/nnlib/frameworks.py

3. No particular reason
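For context on what that rate controls: a dropout rate of 0.0 is a no-op, while a positive rate zeroes that fraction of activations during training (with the usual inverted-dropout rescaling). Here is a minimal NumPy sketch of the mechanism; the function name and signature are illustrative, not the library's API:

```python
import numpy as np

def dropout(x, rate=0.0, training=True, rng=None):
    """Inverted dropout: zero a `rate` fraction of units, rescale the rest.

    With rate=0.0 (the default mentioned above) this is a no-op.
    """
    if not training or rate == 0.0:
        return x
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(x.shape) >= rate          # keep each unit with prob 1 - rate
    return x * mask / (1.0 - rate)              # rescale so expected value is unchanged

# rate=0.0 leaves activations untouched; rate=0.5 zeroes about half
# of them and doubles the survivors.
activations = np.ones(8)
print(dropout(activations, rate=0.0))
print(dropout(activations, rate=0.5, rng=np.random.default_rng(0)))
```

At test time (`training=False`) the input passes through unchanged, which is why the rescaling happens during training.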

1. Is it really a deep neural network? What I see is only 3 layers.

2. How about the dropout rate and the l2 constraint?

3. Why did you set the number of filters to 1200?

Yes, Word2Vec is a shallow network.

If by probabilistic you mean the algorithm is stochastic, I would agree that the gradient optimization is quite likely stochastic; if you mean it is a theory derived from Bayesian probability, I would agree too. But from what I know, it performs as well as Word2Vec.
