Facebook has just two goals in assembling each user’s News Feed:
- Show the smallest number of ads that produce the highest clickthrough yield for an individual user.
- Increase the amount of time an individual user spends interacting with his or her News Feed, so that more posts can be shown and the number of ad impressions increased.
The goals are that simple. How they are achieved is not.
Facebook uses artificial intelligence (AI), machine learning (ML) specifically, to choose what a user sees in his or her Facebook News Feed. In other words, the News Feed is personalized for each user.
Short explanation of how machine learning works
An oversimplified explanation of machine learning will help explain how to keep fake news and other unwanted content out of a user’s News Feed. ML has two stages: building and training the model, and inference, or running the model. There are off-the-shelf and open-source ML packages, such as TensorFlow, Torch and the Cognitive Toolkit. These ML tools are used to build a prediction model and train it until its predictions reach a high probability of being accurate.
The most commonly discussed ML use case is predicting the content of a photo. The ML system is trained on large labeled datasets of digital images. For example, the input might be a large number of images of dogs labeled "dog," cats labeled "cat," cars labeled "car," and so on, in different orientations. After the images have been input, the prediction accuracy is tested and corrected using back-propagation until it reaches a useful level.
During back-propagation, the images are reprocessed without labels, and the model’s prediction for each image is compared to the correct label. When an error is detected, back-propagation does just what its name says: it propagates the correction back into the model so that the next similar image is recognized correctly. Back-propagation is repeated until the probability of an accurate prediction is high. A well-written five-part blog post by Adam Geitgey explains the technical details of how this works, step by step.
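The training loop described above can be sketched in a few lines. This is a minimal illustration only, assuming a single sigmoid neuron and made-up two-number feature vectors in place of real image pixels; it is nothing like a production image recognizer, but the forward-predict, compare-to-label, propagate-the-error-back cycle is the same idea.

```python
import math

def sigmoid(z):
    # Squash a raw score into a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, steps=2000, lr=0.5):
    """Train a single sigmoid neuron by repeated error back-propagation."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(steps):
        for features, label in examples:
            # Forward pass: predict a probability for this example.
            pred = sigmoid(sum(wi * xi for wi, xi in zip(w, features)) + b)
            # Backward pass: propagate the prediction error back into
            # the weights so the next similar input scores better.
            error = pred - label
            w = [wi - lr * error * xi for wi, xi in zip(w, features)]
            b -= lr * error
    return w, b

# Toy labeled dataset: label 1 means "dog," 0 means "cat."
examples = [([0.9, 0.1], 1), ([0.8, 0.2], 1), ([0.2, 0.9], 0), ([0.1, 0.8], 0)]
w, b = train(examples)

# After training, a dog-like input should score near 1, a cat-like input near 0.
print(sigmoid(w[0] * 0.85 + w[1] * 0.15 + b))
```

Repeating the loop is what drives the probability of a correct prediction up; with too few passes, the predictions stay close to a coin flip.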
Moving to the inference stage: after the image recognition model is trained, an image of, say, a car is input, and the system predicts that it is a car. The output is always a probability, not a certainty. In image recognition, the probability of a correct prediction is approaching 95 percent, nearing human accuracy.
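The "probability, not certainty" point can be shown with a softmax over per-class scores, a common way classifiers turn raw model outputs into probabilities. The class list and scores below are invented for illustration, not from any real model.

```python
import math

def softmax(scores):
    # Convert raw scores into probabilities that sum to 1.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

classes = ["car", "dog", "cat"]
logits = [4.0, 1.0, 0.5]  # hypothetical raw model scores for one image

probs = softmax(logits)
best = max(zip(classes, probs), key=lambda pair: pair[1])
print(best[0], round(best[1], 2))  # prints "car 0.93"
```

Even the winning class comes with a probability attached; the model is never saying "this is a car," only "this is very likely a car."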
How machine learning assembles a Facebook user’s News Feed
Facebook built a model to predict which ad posts a user will click on and which posts, whether images, videos or stories, a user will interact with by liking, commenting or sharing. During the training stage, the model is trained for each user, so it might predict that one user is highly likely to interact with a post while another user is highly unlikely to interact with the same post. The training dataset is each user’s history on Facebook’s Open Graph, where everything related to the user, such as friends, posts, comments, likes and image metadata, is stored and interlinked.
When the user logs into Facebook, the inference stage of the News Feed model executes and predicts which of the ads and posts queued for that user’s News Feed the user is most likely to interact with. The prediction algorithm is the individual user’s own: a specific algorithm that predicts for only that user, not a generic algorithm that predicts for everyone.
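Conceptually, this login-time inference step amounts to scoring each queued post with the user's own model and sorting by the predicted probability. Everything in this sketch, the scoring function, the per-user weights and the post tags, is a hypothetical stand-in for Facebook's actual system.

```python
import math

def predict_interaction(user_weights, post_features):
    # Hypothetical per-user linear scorer standing in for the real model:
    # sum the user's learned weight for each feature, then squash to a
    # probability of interaction.
    score = sum(user_weights.get(f, 0.0) for f in post_features)
    return 1.0 / (1.0 + math.exp(-score))

# Invented weights for one user who engages with science content.
user_weights = {"science": 2.0, "friend_post": 1.5, "listicle": -1.0}

queued_posts = [
    ("IEEE Spectrum article", ["science"]),
    ("Buzzfeed listicle", ["listicle"]),
    ("Friend's vacation photo", ["friend_post"]),
]

# Order the News Feed by predicted interaction probability, highest first.
ranked = sorted(
    queued_posts,
    key=lambda post: predict_interaction(user_weights, post[1]),
    reverse=True,
)
print([title for title, _ in ranked])
```

A different user's weights would produce a different ordering from the same queue, which is exactly why two people's News Feeds look nothing alike.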
How a user can keep fake news out of the News Feed
If a user interacts only with IEEE Spectrum and Scientific American posts and posts only similar articles, his or her News Feed will be predominantly technical and scientific. If the user interacts only with Buzzfeed listicles and posts Buzzfeed listicles, his or her News Feed will be filled with listicles. And if a user interacts with fake news, he or she will see more fake news in the News Feed. A critical look at a post before a click, like or reshare will prevent the user from seeing more fake news. This vigilance is akin to checking the browser address bar to make sure the site being visited is authentic in order to prevent malware.
Likewise, not posting fake news will keep fake news out of the user’s News Feed.
Facebook and Google have said they will prevent fake news sites from monetizing through their ad networks. While this action is a step in the right direction, it will not stop fake news financed by sources other than ad revenue.
Neither user diligence nor Facebook’s intervention is a perfect solution. To clean up the News Feed, it is incumbent on users to apply due diligence before interacting with a post. That means not liking something just because a close friend posted it or because the image accompanying the news story is pleasing.
Most users will have to break the habit of sharing a post just because it has a pithy headline and a pleasing image, and instead actually read the post. To be truly diligent, the user should check the domain that originated the post to see if it is a reputable news source and, when uncertain, check the website. The LA Times and Washington Post, for example, are widely accepted as reputable news organizations. However, sites such as AmericanNews.com and Empire Herald, included on Merrimack College professor Melissa Zimdars’ list of "False, misleading, clickbait-y and satirical 'news' sources" (Google doc), are not credible, though their names may cause you to think they are.
Facebook could train an ML model to detect these fake news sites and ban them. The model would be trained the same way the image recognition models are trained; the training dataset would be content from the fake news sites themselves. Facebook already has models to detect and delete pornography, hate speech and other offending content. But banning fake news could cross the line of the First Amendment’s freedom of speech and freedom of the press protections.
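A toy sketch of how such a classifier could be trained, assuming a simple bag-of-words model over labeled snippets. The sample text, labels and word-count scoring are invented for illustration; a real content classifier would be far more sophisticated, but the labeled-dataset training idea is the same one used for images.

```python
from collections import Counter

def train(samples):
    # Count word frequencies per label ("fake" or "credible") from
    # labeled training snippets.
    counts = {"fake": Counter(), "credible": Counter()}
    for text, label in samples:
        counts[label].update(text.lower().split())
    return counts

def classify(counts, text):
    # Score each label by how often its training words appear in the
    # new text, and pick the higher-scoring label.
    words = text.lower().split()
    scores = {
        label: sum(counter[w] for w in words) for label, counter in counts.items()
    }
    return max(scores, key=scores.get)

# Hypothetical labeled training snippets, not real site content.
samples = [
    ("shocking miracle cure doctors hate", "fake"),
    ("you won't believe this one weird trick", "fake"),
    ("city council approves annual budget report", "credible"),
    ("study published in peer reviewed journal", "credible"),
]

model = train(samples)
print(classify(model, "miracle trick doctors hate"))  # classifies as fake
```

Just as with image recognition, the output is a best guess learned from labeled examples, which is why mislabeling legitimate sites, and the speech concerns that follow from it, would be a real risk for any such system.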