Yes, Machines Can Comprehend Words

The key is a reusable language model.

An essay on a paper: Universal Language Model Fine-tuning for Text Classification by Jeremy Howard and Sebastian Ruder.

Deep learning has recently become one of the most popular machine learning approaches because it generalizes patterns well across many domains, including human language, a field known as natural language processing (NLP). Examples of NLP in use include email servers that detect spam and chatbots that answer questions from humans. However, progress in deep learning for NLP has been relatively slow compared to other domains such as image processing. To address this issue, two deep learning practitioners proposed a method to speed up how machines learn to model a language.

Howard and Ruder (2018) argue that the problem in NLP is the lack of shared language models: researchers train their own language models from scratch and therefore need a lot of data and time to reach decent performance. Using an existing model as the starting point for training another is called transfer learning, and it is still uncommon in NLP. To test their hypothesis, they developed a method called Universal Language Model Fine-tuning (ULMFiT) and used it to automatically detect the sentiment of movie reviews on the popular website IMDB, that is, whether a review says something positive or negative about the film.

The usual approach to building a sentiment analyzer is to learn directly from examples of positive and negative reviews. In ULMFiT, the authors first trained a language model on Wikipedia articles and then leveraged that model to build the sentiment analyzer. ULMFiT introduces three new techniques for tuning an existing model to suit a specific use case: discriminative fine-tuning, gradual unfreezing, and slanted triangular learning rates. These techniques help overcome catastrophic forgetting: the tendency of a deep learning model to lose what it has learned when it is given new data and a new objective.
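To make the three techniques more concrete, here is a minimal Python sketch of the schedules behind them. It is an illustration under stated assumptions, not the authors' implementation. The function names are hypothetical. The default values (a factor of 2.6 between layers, cut_frac of 0.1, ratio of 32, and a peak learning rate of 0.01) are the ones suggested in the Howard and Ruder paper, not anything stated in this essay. The guard that keeps cut at 1 or more is an extra safeguard added here for very short training runs.

```python
import math


def slanted_triangular_lr(t, total_steps, lr_max=0.01, cut_frac=0.1, ratio=32):
    """Learning rate at training step t (0-indexed).

    The rate rises linearly for the first cut_frac of training,
    then decays linearly back towards lr_max / ratio.
    """
    # Step at which the rate peaks; max(1, ...) is an added guard (assumption)
    # to avoid dividing by zero when total_steps is very small.
    cut = max(1, math.floor(total_steps * cut_frac))
    if t < cut:
        p = t / cut  # short warm-up phase
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))  # long decay phase
    return lr_max * (1 + p * (ratio - 1)) / ratio


def discriminative_lrs(top_lr, num_layers, decay=2.6):
    """One learning rate per layer, index 0 being the lowest layer.

    Each layer gets the rate of the layer above it divided by `decay`,
    so lower, more general layers change more slowly.
    """
    return [top_lr / decay ** (num_layers - 1 - i) for i in range(num_layers)]


def unfreeze_schedule(num_layers):
    """Gradual unfreezing: the layer indices that are trainable in each epoch.

    Epoch 0 trains only the top layer; each later epoch adds the next lower one.
    """
    return [list(range(num_layers - 1 - epoch, num_layers))
            for epoch in range(num_layers)]
```

For a model with three layers, unfreeze_schedule(3) returns [[2], [1, 2], [0, 1, 2]]: the top layer is trained first and one lower layer joins in each epoch. Together with the small per-layer rates from discriminative_lrs, this keeps the general knowledge in the lower layers from being overwritten too quickly, which is the catastrophic forgetting the paragraph above describes.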

The ULMFiT movie review sentiment analyzer outperforms the previous state-of-the-art model in accuracy. In addition, the authors report that with only 100 labeled examples it matches the performance of a model trained from scratch on 100 times more data. Beyond the IMDB data, their fine-tuned models also perform better on other tasks such as news topic classification and question classification. The consistent results suggest that reusing a general language model helps across different types of tasks and that their fine-tuning method works. Nonetheless, there is not yet any evidence that ULMFiT can be used for other languages. The next step is to study whether other languages behave similarly to English; it will be interesting to see whether it also works for languages with non-Latin scripts, such as Chinese.

Resources for computers to learn natural languages are scarce, especially for languages other than English. ULMFiT opens up the possibility of transfer learning in NLP: using one model to solve other similar or more specific problems in less time and with limited data. Applying ULMFiT to niche use cases like question answering and text summarization will be an interesting exploration. We may not need as much data as we thought.

Development of deep learning in NLP will be much faster when every language has a decent and diverse general language model that researchers can use collectively. In addition, computer scientists need to work together with linguists to understand why this approach works so that we can keep improving the methodology. The time when machines can write an essay about a paper may not be in the distant future.


This is an essay for a task from an online course: Writing in the Sciences by Stanford on Coursera. It is a great course, particularly if you want to learn how to write a scientific paper. Highly recommended!

Find me in your nearest Ramen stall.
