“One morning I shot an elephant in my pajamas. How he got in my pajamas, I don't know.”
That Groucho Marx quote illustrates why it is difficult for computers to understand humans. Programming computers to understand human language means accounting for vagueness, ambiguity and uncertainty in order to distill its meaning.
Facebook announced today that it can now do that with Deep Text, a deep learning-based text-understanding engine built on a neural network that can understand the textual content of several thousand posts per second, in more than 20 languages, with near-human accuracy.
Consumers interacting with computers
Consumers regularly interact with computers trained with machine-learning techniques that understand human language. Ask Siri for the best Japanese restaurant in San Francisco, and Siri will give you a list of restaurants. Or ask Google how many people live in Lisle, Illinois, and Google will reply with the answer from the U.S. census. These intelligent systems parse the question the way schoolchildren diagram a sentence, then answer it from structured data sets: the list of restaurants labeled San Francisco and Japanese, or the population figure for Lisle, IL, in the census database.
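As a rough illustration of that structured-data lookup (a toy sketch, not how Siri or Google actually answer questions), the parsing can be as simple as matching a known place against a table of facts:

```python
# Toy sketch of answering a question from structured data.
# The population value is a placeholder for illustration only.
CENSUS = {
    ("lisle", "illinois"): 22_390,  # placeholder figure, not an authoritative count
}

def answer_population_question(question):
    """Very naive parse: find a known (city, state) pair in the question."""
    words = question.lower().replace(",", "").replace("?", "").split()
    for (city, state), population in CENSUS.items():
        if city in words and state in words:
            return f"About {population:,} people live in {city.title()}, {state.title()}."
    return "Sorry, I don't have that answer."

print(answer_population_question("How many people live in Lisle, Illinois?"))
```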
Deep Text’s understanding of human language is state of the art because it extends to intent, sentiment and entities (e.g., people, places, events). Facebook engineers taught Deep Text to teach itself with very large unlabeled data sets of text. In other words, Deep Text is programmed to learn by observing how people communicate. There are other examples of a neural network teaching itself by observing humans; an autonomous car fitted with neural network-controlled steering, for instance, taught itself to steer by observing a human driver.
Facebook engineers Joy Zhang’s and Ahmad Abdulkader’s minute-and-a-half video is a good introduction to the inspiration behind the project:
Deep Text uses unsupervised machine learning to interpret the meaning of posts and comments. The distinction between supervised and unsupervised machine learning depends on the data set used to train the system. Most machine learning applies neural networks, computers programmed and designed to process vectors and matrices, to labeled data sets. Labeled data sets pair each item with its meaning, such as long lists of San Francisco restaurants labeled Japanese or pictures of animals labeled cat, dog, horse, rhinoceros and so on. Deep Text applies neural networks to unlabeled data sets of posts and comments to understand their meaning and sentiment. Unlabeled data sets are exactly that: just words, sometimes slang and sometimes misspelled, without a dictionary and without predefined relationships with other words.
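A minimal sketch of that distinction, using invented examples (the restaurant entries, image labels and posts below are placeholders, not real Facebook data):

```python
# Supervised learning starts from labeled examples; unsupervised learning,
# as described above for Deep Text, starts from raw, unlabeled text.

# Labeled data set: each example carries the answer the model should learn.
labeled_restaurants = [
    {"name": "Example Sushi Bar", "city": "San Francisco", "cuisine": "Japanese"},
    {"name": "Example Taqueria",  "city": "San Francisco", "cuisine": "Mexican"},
]
labeled_images = [
    ("photo_001.jpg", "cat"),
    ("photo_002.jpg", "rhinoceros"),
]

# Unlabeled data set: just words as people wrote them, slang and typos included,
# with no dictionary and no predefined relationships between words.
unlabeled_posts = [
    "happy bday bro!!",
    "i like blackberry",
    "omw to the game, lets gooo",
]

print(len(labeled_restaurants) + len(labeled_images), "labeled examples;",
      len(unlabeled_posts), "unlabeled posts")
```

A supervised learner maps inputs to the given labels; an unsupervised one has to discover structure, such as which words behave alike, from the raw posts alone.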
Words are interpreted, sometimes character by character, using the machine learning systems FBLearner Flow and Torch, which train the neural network by embedding each word’s relationships with the others. The Deep Text neural network can be thought of as a large space in which all of the words are suspended, with multiple pointers, or vectors, linking each word to others to define their semantic relationships. This process, called embedding, puts related words, such as bro and brother, near one another, and it can be used to disambiguate whether the sentence “I like blackberry” refers to the fruit or the smartphone.
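To make the “words suspended in a space” picture concrete, here is a toy sketch with made-up three-dimensional vectors; Deep Text’s real embeddings are learned, far higher-dimensional and not public:

```python
import math

# Toy embedding vectors, invented for illustration: related words such as
# "bro" and "brother" sit close together in the space.
EMBEDDINGS = {
    "brother":    [0.90, 0.10, 0.05],
    "bro":        [0.88, 0.12, 0.07],
    "blackberry": [0.05, 0.80, 0.40],
    "phone":      [0.02, 0.85, 0.35],
    "fruit":      [0.10, 0.20, 0.90],
}

def cosine(u, v):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def nearest(word):
    """Return the other word whose vector is closest to `word`."""
    return max(
        (other for other in EMBEDDINGS if other != word),
        key=lambda other: cosine(EMBEDDINGS[word], EMBEDDINGS[other]),
    )

print(nearest("bro"))         # -> "brother"
print(nearest("blackberry"))  # -> "phone", with these made-up numbers
```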
The neural network extracts the meaning of a sentence from the proximity of the words with other words. Using word embeddings, Deep Text can also understand the same semantics across multiple languages, despite differences in the surface form. For example, the English form “happy birthday” and Spanish “feliz cumpleaños” should be very close to each other in the common embedding space. By mapping words and phrases into a common embedding space, Deep Text is capable of building models that are language-agnostic.
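A similarly hedged sketch of the common embedding space idea, again with invented vectors, shows phrases with the same meaning in different languages landing close together:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Vectors made up for the sketch; they are not Deep Text's actual embeddings.
shared_space = {
    "happy birthday":    [0.70, 0.65, 0.10],
    "feliz cumpleaños":  [0.68, 0.67, 0.12],
    "call me a taxi":    [0.05, 0.10, 0.95],
}

print(cosine(shared_space["happy birthday"], shared_space["feliz cumpleaños"]))  # close to 1.0
print(cosine(shared_space["happy birthday"], shared_space["call me a taxi"]))    # much lower
```

Because similarity is measured in the shared space rather than on the surface form of the words, the same model can be reused across languages.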
Neural networks applied to moderate comments
Facebook is considering applying Deep Text to better personalize comments. To better understand the value and challenges of comments, I spoke with Matt Carroll, MIT research scientist and a Pulitzer Prize-winning former Boston Globe reporter.
Carroll summed up the state-of-the-art methods used today to cultivate comments in media: “Every media company struggles with comments, leading some to drop comments altogether; however, they are the best way to communicate with readers. But cultivating these conversations and keeping the trolls from taking over is labor intensive. People and software are needed to make these conversations valuable to readers and publishers.”
With Carroll’s comments in mind, applying Deep Text could improve on today’s methods of cultivating comments much as the shift from farming by hand to farming with tractors improved productivity.
Celebrities and public figures use Facebook to start conversations with the public. These conversations often draw hundreds, if not thousands, of comments. Finding the most relevant comments in multiple languages while maintaining comment quality is a challenge. Deep Text may be able to surface the most relevant or high-quality comments.
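One way to picture that surfacing step is a ranking pass over comments using some quality score; the heuristic below is a hypothetical stand-in for whatever signal a Deep Text-style classifier would actually produce:

```python
# Toy comment-ranking sketch. `score_comment` is a placeholder heuristic,
# not Facebook's model: longer, non-shouting comments score higher.

def score_comment(comment):
    length_score = min(len(comment.split()), 30) / 30
    shouting_penalty = 0.5 if comment.isupper() else 0.0
    return length_score - shouting_penalty

comments = [
    "FIRST!!!",
    "Great point about the new policy; here is a study that backs it up.",
    "lol",
]

for comment in sorted(comments, key=score_comment, reverse=True):
    print(round(score_comment(comment), 2), comment)
```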
Improving the understanding of posts with images using Deep Text
Deep Text has the potential to improve personalized Facebook experiences by understanding posts better to extract intent, sentiment and entities using mixed content signals such as text and images. For example, a post by a friend that includes a picture of her new baby and the text “Day 25” makes it clear that the intent is to share family news.
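A hedged sketch of combining mixed content signals might look like the following; the image tags and intent rules are invented for illustration and are not Facebook’s actual models:

```python
# Toy mixed-signal classifier: a text signal and an image signal are combined
# into a single intent guess. Tags and rules are hypothetical.

def classify_post_intent(text, image_tags):
    text = text.lower()
    if "baby" in image_tags and ("day" in text or "week" in text):
        return "sharing family news"
    if "for sale" in text or "$" in text:
        return "selling something"
    return "general update"

print(classify_post_intent("Day 25", ["baby", "blanket"]))    # -> sharing family news
print(classify_post_intent("Couch for sale, $50", ["sofa"]))  # -> selling something
```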
Automating the removal of objectionable content is another potential use case. It might also help Facebook comply with its non-binding agreement with the EU to fight online hate speech.
Facebook has published, and others have written about, the company’s research on its use of machine learning to understand images. This research is best summed up with an excerpt from Facebook AI Research Director Yann LeCun’s talk at the MIT Technology Review’s EmTech conference last fall:
Discovering Facebook users’ interests
The model can be improved to understand Facebook users’ interests with supervised learning, for example by creating a labeled data set mapping user interests: Tom Smith is interested in Stephen Curry because he likes basketball, frequently comments about the Golden State Warriors and the sentiment of his comments about Curry is always positive.
Most users would be overwhelmed by the many posts in their Facebook newsfeed if Facebook didn’t rank and prioritize them. Understanding users’ interests will let Facebook rank posts more accurately, so users see more of what interests them.
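A minimal sketch of that ranking idea, assuming a hypothetical labeled interest profile like the Tom Smith example above (all scores and topics are invented):

```python
# Toy interest-based ranking. The profile and scores are placeholders,
# not real Facebook data.
user_interests = {
    "tom_smith": {"basketball": 0.9, "golden state warriors": 0.8, "stephen curry": 0.95},
}

posts = [
    {"id": 1, "topics": ["stephen curry", "basketball"]},
    {"id": 2, "topics": ["gardening"]},
    {"id": 3, "topics": ["golden state warriors"]},
]

def interest_score(post, interests):
    """Sum the user's interest weights for the topics detected in a post."""
    return sum(interests.get(topic, 0.0) for topic in post["topics"])

ranked = sorted(posts, key=lambda p: interest_score(p, user_interests["tom_smith"]), reverse=True)
print([p["id"] for p in ranked])  # -> [1, 3, 2]
```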
An interesting classifier called PageSpace has been built with Deep Text, using the huge body of labeled data contained in the millions of active Facebook Pages dedicated to particular topics or interests. This data isn’t as well-structured as a curated data set; however, Deep Text can create labeled data sets by understanding the words, intent, sentiment and entities on these pages.
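The underlying idea, sketched in toy form below with invented pages and topics (this is not PageSpace itself), is that text gathered from pages dedicated to known topics serves as labeled training data for a classifier:

```python
from collections import Counter, defaultdict

# Invented page text labeled by topic, standing in for topic-dedicated pages.
labeled_page_text = [
    ("basketball", "curry three pointer playoffs warriors dunk"),
    ("basketball", "nba finals rebound court coach"),
    ("cooking",    "recipe garlic simmer oven roast dinner"),
    ("cooking",    "bake flour sugar whisk dough"),
]

# "Training": count how often each word appears under each topic label.
word_counts = defaultdict(Counter)
for topic, text in labeled_page_text:
    word_counts[topic].update(text.split())

def classify(text):
    """Pick the topic whose pages used the post's words most often."""
    words = text.lower().split()
    return max(word_counts, key=lambda topic: sum(word_counts[topic][w] for w in words))

print(classify("what an amazing dunk in the playoffs"))  # -> basketball
print(classify("this dough needs more flour"))           # -> cooking
```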
Other Deep Text applications
Deep Text intent recognition can understand a Facebook user’s intent to sell or buy something from a post to his or her newsfeed. In response to this signal, Facebook could present the option to use tools that make buying and selling easier.
Deep Text could also be applied to Facebook Messenger to recognize a user’s intent to call a taxi. In response to this signal, Messenger could offer to call one.
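A hedged sketch of intent recognition wired to a suggested action; the intents, trigger phrases and suggested tools are hypothetical, not Facebook’s actual behavior:

```python
# Toy intent-to-action mapping for posts and messages.
INTENT_ACTIONS = {
    "sell":      "Suggest the selling tools",
    "buy":       "Suggest matching items for sale nearby",
    "need_ride": "Offer to request a taxi",
}

def detect_intent(message):
    """Return a coarse intent label for a post or message, or None."""
    message = message.lower()
    if "for sale" in message or "selling my" in message:
        return "sell"
    if "looking to buy" in message or "anyone selling" in message:
        return "buy"
    if "need a taxi" in message or "need a ride" in message:
        return "need_ride"
    return None

for text in ["Selling my old bike, barely used", "I need a taxi to the airport"]:
    print(text, "->", INTENT_ACTIONS.get(detect_intent(text), "No suggestion"))
```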
Developing a deep learning platform within the open AI community
Because the AI community’s roots run deep into academia and research, it operates very transparently. For instance, the paper cited by Zhang and Abdulkader, written by Ronan Collobert, included Jason Weston of Google as a co-author. Large companies such as Facebook, Amazon, Google, IBM and Microsoft are leading the adoption of AI and machine learning with products like Amazon Echo, Google Home, IBM Watson and Microsoft Cortana. The research leaders at these companies, such as Geoff Hinton of Google and Yann LeCun of Facebook, often retain their positions in academia; LeCun is a tenured professor at NYU, and Hinton at the University of Toronto. Researchers in academia and commercial enterprises have long histories of working together. They publish frequently and collaborate at conferences, presumably to accelerate development.
Shared research extends beyond published papers into open source software. Most software, such as Torch used in the Deep Text project, was developed as an open source project to accelerate and cross-pollinate its development.
Facebook would make an interesting case study in how to implement machine learning. Its three tiers of AI developers include Facebook AI Research, Applied Machine Learning and the product development teams. Developing a system based on machine learning requires specialized expertise, so Facebook recently released its machine learning platform, FBLearner Flow, which is designed to let developers without specialized AI expertise build products that use machine learning. The result: 25 percent of Facebook’s army of developers has at least experimented with machine learning.