Discriminative Adversarial Networks

Generative adversarial networks (GANs) have made a big impact on the world of machine learning. They are particularly useful for generating sample data when the available data are insufficient for a given purpose. They are also useful for training on a mix of labeled and unlabeled data, i.e., semi-supervised learning (SSL).

The rise of GANs also led to the re-emergence of adversarial learning for handling unbalanced or sensitive data. (For example, see arXiv:1707.00075.)

GANs are particularly useful for computer vision problems. However, they are less well suited to natural language problems, because text is discrete and cannot be generated continuously. In this context, a modification of GAN was developed, called discriminative adversarial networks (DAN; see arXiv:1707.02198). Unlike a GAN, which uses a discriminator to train a generator to produce good data, a DAN has two discriminators: one, usually denoted as the predictor P, predicts labels for the unlabeled data, and the other, usually denoted as the judge J, classifies whether a label is a human label or a machine-predicted label.


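The loss function of DAN is very similar to that of GAN. Both are built from cross-entropy terms: the judge J is trained to tell human labels on the labeled data from machine-predicted labels, while the predictor P is trained so that its predictions on the unlabeled data are judged to be human labels.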
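By analogy with the standard GAN minimax objective, a loss of this kind can be sketched as follows. The notation is an assumption made here for illustration: D_L denotes the labeled set, D_U the unlabeled set, and J(x, y) the judge's probability that the label y for input x is a human label. The exact form in arXiv:1707.02198 may differ in detail.

```latex
\min_{P} \max_{J} \;
  \mathbb{E}_{(x, y) \sim D_L} \bigl[ \log J(x, y) \bigr]
  + \mathbb{E}_{x \sim D_U} \bigl[ \log \bigl( 1 - J(x, P(x)) \bigr) \bigr]
```

The first term rewards J for accepting human-labeled pairs; the second rewards J for rejecting predicted pairs, and P is trained to make that second term small.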


However, GAN and DAN are not generative-discriminative pairs.


Strategies of Recommendation Systems

Systems developed by enterprises such as Netflix produce recommendations. Good recommendations lead to a better user experience and higher return rates. Humans give recommendations based on experience, knowledge, worldviews, wisdom, etc., whereas automatic recommendation systems do so based on big data and machine learning.

Recommendation Strategies

Recommendation systems employ one or more of the following strategies:

  1. Collaborative Filtering (CF);
  2. Content-based Filtering (CBF);
  3. Demographic Filtering (DF); and
  4. Knowledge-Based Filtering (KBF).

1. Collaborative Filtering (CF)

CF recommends similar items to users with similar tastes. Whether it is user-based filtering or item-based filtering, the same assumption holds. Similarity between users or between items is calculated by Pearson correlation or cosine similarity.
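As a minimal sketch of these two similarity measures, the following uses NumPy on a small made-up rating matrix. The function names and the ratings are illustrative assumptions, and the code assumes every vector is fully observed and has a non-zero norm.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors (assumes non-zero norms)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def pearson_correlation(a, b):
    """Pearson correlation, i.e. cosine similarity of the mean-centered vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return cosine_similarity(a - a.mean(), b - b.mean())


# Hypothetical ratings: rows are users, columns are items.
ratings = np.array([
    [5.0, 4.0, 1.0, 1.0],  # user 0
    [4.0, 5.0, 2.0, 1.0],  # user 1
    [1.0, 1.0, 5.0, 4.0],  # user 2
])

# User-based view: compare rows. Item-based view: compare columns (ratings.T).
sim_01 = cosine_similarity(ratings[0], ratings[1])
sim_02 = cosine_similarity(ratings[0], ratings[2])
corr_02 = pearson_correlation(ratings[0], ratings[2])
```

In this toy data, users 0 and 1 rate items alike, so user-based filtering would recommend to user 0 what user 1 liked; users 0 and 2 have opposite tastes, which Pearson correlation shows as a negative value.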

Matrix factorization (MF), like the related latent semantic indexing (LSI), is actually a kind of collaborative filtering. Here the users and the items are each converted to an encoded vector, and the recommendation score is given by a similarity between the encoded vector of the user and that of the item, such as the cosine similarity or the dot product.
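A minimal sketch of this idea uses a truncated singular value decomposition from NumPy. It assumes a small, fully observed rating matrix; real systems usually fit the factors only on observed entries, which this sketch does not do. The function names and data are hypothetical, and scores here are plain dot products rather than cosine similarities.

```python
import numpy as np


def factorize(ratings, rank):
    """Encode users and items as rank-dimensional vectors via truncated SVD."""
    u, s, vt = np.linalg.svd(np.asarray(ratings, dtype=float), full_matrices=False)
    root = np.sqrt(s[:rank])
    user_vecs = u[:, :rank] * root      # shape (n_users, rank)
    item_vecs = vt[:rank, :].T * root   # shape (n_items, rank)
    return user_vecs, item_vecs


def predict_scores(user_vecs, item_vecs):
    """Score every (user, item) pair by the dot product of their vectors."""
    return user_vecs @ item_vecs.T


# Hypothetical ratings: rows are users, columns are items.
ratings = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 2.0, 1.0],
    [1.0, 1.0, 5.0, 4.0],
])

user_vecs, item_vecs = factorize(ratings, rank=2)
scores = predict_scores(user_vecs, item_vecs)
```

A lower rank gives a coarser approximation of the rating matrix; with the full rank the product reproduces the original matrix.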

Such recommendation systems suffer from the cold-start problem: new users or new items, which have no interaction history yet, cannot be accounted for when giving recommendations.

2. Content-Based Filtering (CBF)

CBF employs common machine learning algorithms to learn a user's preferences based on their consumption or purchase history and their profile. Embedded vectors are used here too. However, this approach also suffers from the cold-start problem.
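One simple way to picture this, as a sketch only: represent each item by a feature vector, build the user's profile as the mean of the items they consumed, and rank the remaining items by cosine similarity to that profile. The feature values and function names below are hypothetical, and averaging is just one of many ways to learn a preference.

```python
import numpy as np


def user_profile(item_features, consumed):
    """Average the feature vectors of the items the user has consumed."""
    return np.asarray(item_features, dtype=float)[list(consumed)].mean(axis=0)


def rank_items(item_features, profile, exclude):
    """Return item indices sorted by cosine similarity to the profile, best first."""
    feats = np.asarray(item_features, dtype=float)
    sims = feats @ profile / (np.linalg.norm(feats, axis=1) * np.linalg.norm(profile))
    sims[list(exclude)] = -np.inf  # push already-consumed items to the end
    return np.argsort(-sims)


# Hypothetical item features, e.g. (action, romance) weights per movie.
item_features = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
])

consumed = [0]
profile = user_profile(item_features, consumed)
ranking = rank_items(item_features, profile, exclude=consumed)
```

A user with an empty history has no profile to average, which is the cold-start problem in this setting.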

3. Demographic Filtering (DF)

The DF strategy makes use of users' profiles, such as age, sex, and other information, to make recommendations. The algorithms might be rule-based or based on machine learning. However, in the era of GDPR and CCPA, this strategy might give rise to issues regarding fairness, equal opportunity, privacy, or ethics.

4. Knowledge-Based Filtering (KBF)

KBF makes recommendations based on expert knowledge of the subject matter, known reasoning, or statistics. Recommendations may be made using a rule-based approach or a predefined probabilistic model (built from census data, for example). Some systems might even employ a knowledge database. Big data may not be necessary in this kind of system, as the reasoning has been built in manually.

Hybrid Recommendation Systems

Hybrid recommendation systems employ more than one of the above strategies. To combine them, one might apply a voting system over all the results to give an aggregated result, a weighting scheme, or stacked generalization.
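The weighting and voting options can be sketched as follows, assuming each strategy has already produced one score per item on a comparable scale. The scores, weights, and function names are made up for illustration; stacked generalization would instead train a second model on these scores and is not shown.

```python
import numpy as np


def weighted_hybrid(score_lists, weights):
    """Combine per-strategy item scores with normalized weights."""
    scores = np.asarray(score_lists, dtype=float)  # shape (n_strategies, n_items)
    w = np.asarray(weights, dtype=float)
    return (w / w.sum()) @ scores


def top_k_votes(score_lists, top_k):
    """Count, for each item, how many strategies place it in their top k."""
    scores = np.asarray(score_lists, dtype=float)
    votes = np.zeros(scores.shape[1], dtype=int)
    for row in scores:
        votes[np.argsort(row)[-top_k:]] += 1
    return votes


# Hypothetical scores for three items from three strategies.
cf_scores = [0.9, 0.1, 0.5]
cbf_scores = [0.7, 0.2, 0.6]
kbf_scores = [0.1, 0.8, 0.3]
all_scores = [cf_scores, cbf_scores, kbf_scores]

combined = weighted_hybrid(all_scores, weights=[2.0, 1.0, 1.0])
votes = top_k_votes(all_scores, top_k=1)
```

Here the weighted scheme trusts collaborative filtering twice as much as the other two strategies, while the vote simply counts how many strategies put each item first.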
