How does Facebook know what we like?

File photo: AP

Published May 4, 2014

Washington - We all know by now that Facebook isn’t cool. And yet somehow it’s more popular than ever.

Recently the company announced that its growth continues to surge – not only in terms of the sheer number of Facebook users, but in terms of how much they use the site. On any given day, Mark Zuckerberg said, 63 percent of Facebook’s 1.28 billion users log in to the site. And the proportion of users who log in at least six days a week has now surpassed 50 percent.

How is it possible that Facebook keeps getting more addictive over time, rather than less? It’s possible because Facebook knows what you like – and it’s getting better at understanding you all the time.

As much work and data – your data – as Facebook feeds into its targeted advertising, it works at least as hard at figuring out which of your friends’ posts you’re most likely to want to see each time you open the app. Advertisers may butter Facebook’s bread, but its most pressing interest of all is in keeping its users coming back for more. If it ever fails at that, its advertising business will implode.

So how does Facebook know what we like? On a recent visit to the company’s headquarters in Menlo Park, California, I talked about that with Will Cathcart, who oversees the product management teams that work on the company’s news feed. The answer holds lessons for the future of machine learning, the media, and the internet at large.

Facebook launched the news feed in 2006, but it didn’t introduce the “like” button until three years later.

Only then did the site have a way to figure out which posts you were actually interested in – and which new posts you might be interested in, based on what your friends and others were liking.

In the years since its launch, the news feed has gone from being a simple chronological list to a machine learning product, with posts ranked in your feed according to the likelihood that you would find them interesting.

The goal is to ensure that, for example, the first picture of your best friend’s new baby would take precedence over a remote acquaintance’s game score.
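To make the idea of ranking by likely interest concrete, here is a minimal sketch in Python. It is not Facebook's code: the `Post` class, the `predicted_interest` score and the `rank_feed` function are all hypothetical names invented for illustration, and the scores are made up rather than produced by any real model.

```python
from dataclasses import dataclass


@dataclass
class Post:
    author: str
    text: str
    # Hypothetical score between 0 and 1; a real system would get this
    # from a trained model, which is not shown here.
    predicted_interest: float


def rank_feed(posts):
    """Return posts ordered from most to least likely to interest the reader."""
    return sorted(posts, key=lambda p: p.predicted_interest, reverse=True)


posts = [
    Post("remote acquaintance", "New high score!", 0.05),
    Post("best friend", "First photo of the baby", 0.95),
]
feed = rank_feed(posts)
```

In this toy feed the baby photo lands above the game score, whatever order the two posts arrived in.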

For a while, Facebook likes – coupled with a few other metrics, like shares, comments, and clicks – served as a pretty decent proxy for engagement.

But they were far from perfect, Cathcart concedes. A funny photo meme might get thousands of quick likes, while a thoughtful news story analysing the conflict in Ukraine would be punished by Facebook’s algorithms because it didn’t lend itself to a simple thumbs-up.

The result was that people’s news feeds became littered with the social media equivalent of junk food. Facebook had become optimised for stories that people Facebook-liked, rather than stories that people actually liked.

Each time you log in, Facebook’s algorithms choose from about 1 500 possible posts to place at the top of your News Feed.

Worse, many of the same stories that thousands of people Facebook-liked turned out to be ones that thousands of other people genuinely hated. They included posts that had clicky headlines designed to score cheap likes and clicks, but that actually led to pages filled with spammy ads rather than the content that the headline promised.

But in the absence of a “dislike” button, Facebook’s algorithms had no way of knowing which posts were turning users off.

Eventually, about a year ago, Facebook acknowledged that it had a “quality content” problem.

This is not a problem specific to Facebook. It’s a problem that confronts every company or product that harnesses data analytics to drive decision-making.

So how do you solve it? For some, the answer might be to temper data-driven insights with a healthy dose of human intuition. But Facebook’s news feed operates on a scale and a level of personalisation that make direct human intervention infeasible. So the answer was to begin collecting new forms of data designed to generate insights that the old forms of data – likes, shares, comments, and clicks – couldn’t.

Three sources of data in particular are helping Facebook to refashion its news feed algorithms to show users the kinds of posts that will keep them coming back: surveys, A/B tests, and data on the time users spend away from Facebook once they click on a given post – and what they do when they come back.

Surveys can get at questions that other metrics can’t, while A/B tests offer Facebook a way to put its hunches under a microscope. Every time its developers make a tweak to the algorithms, Facebook tests it by showing it to a small percentage of users. At any given moment, Cathcart says, there might be 1 000 versions of Facebook running for different groups of users. Facebook is gathering data on all of them, to see which changes are generating positive reactions and which ones are falling flat.

For instance, Facebook recently tested a series of changes designed to correct the proliferation of “like-bait” – stories or posts that explicitly ask users to hit the “like” button in order to boost their ranking in your news feed. Some in the media worried that Facebook was making unjustified assumptions about its users’ preferences.

In fact, Facebook had already tested the changes on a small group of users before it publicly announced them. “We very quickly saw that the people we launched that improvement to were clicking on more articles in their news feed,” Cathcart explains.
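One common way to show a change to only a small, stable slice of users is to hash each user into a bucket. The sketch below illustrates that general technique; it is an assumption about how such a test could be set up, not a description of Facebook's system. The function name `assign_variant`, the experiment label and the 1 percent default are all hypothetical.

```python
import hashlib


def assign_variant(user_id, experiment, treatment_fraction=0.01):
    """Deterministically place a user in 'treatment' or 'control'.

    Hashing the experiment name together with the user id gives each
    user a stable pseudo-random position in [0, 1) for that experiment.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # first 32 bits, scaled to [0, 1)
    return "treatment" if bucket < treatment_fraction else "control"


variant = assign_variant("user-123", "like-bait-demotion")
```

Because the assignment depends only on the user and the experiment name, the same person sees the same version on every visit, which makes it possible to compare how the two groups behave over time.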

When users click on a link in their news feed, Cathcart says, Facebook looks very carefully at what happens next. “If you’re someone who, every time you see an article from the New York Times, you not only click on it, but go offsite and stay offsite for a while before you come back, we can probably infer that you in particular find articles from the New York Times more relevant” – even if you don’t actually hit “like” on them.
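The inference Cathcart describes can be sketched roughly as follows. Everything here is illustrative: the log format, the function name `likely_relevant_sources`, the made-up site `spammy.example` and the thresholds of three clicks and 60 seconds are assumptions, not figures from Facebook.

```python
from collections import defaultdict
from statistics import median


def likely_relevant_sources(click_log, min_clicks=3, min_seconds=60):
    """Return sources the user tends to stay away on after clicking.

    click_log is an iterable of (source, seconds_away) pairs, one per
    click. Both thresholds are arbitrary values chosen for illustration.
    """
    by_source = defaultdict(list)
    for source, seconds_away in click_log:
        by_source[source].append(seconds_away)
    return sorted(
        source
        for source, times in by_source.items()
        if len(times) >= min_clicks and median(times) >= min_seconds
    )


click_log = [
    ("nytimes.com", 240), ("nytimes.com", 180), ("nytimes.com", 300),
    ("spammy.example", 5), ("spammy.example", 8), ("spammy.example", 4),
]
relevant = likely_relevant_sources(click_log)
```

With this invented log, only the source the reader lingers on passes the test; the one abandoned within seconds does not, even though it was clicked just as often.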

At the same time, Facebook has begun carefully differentiating between the likes a post gets before users click on it and the ones it gets after they’ve clicked. A lot of people might be quick to hit the like button on a post based solely on a headline or teaser that panders to their political sensibilities. But if very few of them go on to like or share the article after they’ve read it, that might indicate that the story didn’t deliver.
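As a rough illustration of that distinction, the sketch below computes what share of a post's likes arrived after the click. The function name and the example counts are hypothetical, and the article does not say which formula, if any, Facebook actually uses.

```python
def post_click_like_share(likes_before_click, likes_after_click):
    """Fraction of a post's likes that came after users clicked through.

    Returns None when the post has no likes at all, since the share
    is undefined in that case.
    """
    total = likes_before_click + likes_after_click
    if total == 0:
        return None
    return likes_after_click / total


# Invented counts: many likes on the headline alone, few after reading.
share = post_click_like_share(likes_before_click=900, likes_after_click=100)
```

A low share, as in this example, would be one hint that a headline is collecting approval the article itself does not earn.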

Some have speculated that Facebook’s news feed changes were specifically targeting certain sites for demotion while elevating the ranking of others. That’s not the case, Cathcart insists. Facebook defines high-quality content not by any objective ranking system, but according to the tastes of its users.

If you love Upworthy and find the Times snooze-worthy, then Facebook’s goal is to show you more of the former and less of the latter.

“The perfect test for us,” Cathcart says, “would be if we sat you down and gave you all 1 500 stories and asked you to rearrange them from 1 to 1 500 in the order of what was most relevant for you. That would be the gold standard.” But that’s a little too much testing, even for Facebook.

For a lot of people, the knowledge that Facebook’s computers are deciding what stories to show them – and which ones to hide – is galling. Avid Twitter users swear by that platform’s more straightforward chronological timeline, which relies on users to carefully curate their own list of people to follow.

But there’s a reason that Facebook’s engagement metrics keep growing while Twitter’s are stagnant.

As much as we’d like to think we could do a better job than the algorithms, the fact is most of us don’t have time to sift through 1 500 posts on a daily basis.

So, even as we resent Facebook’s paternalism, we keep coming back to it. And just maybe, if Facebook keeps getting better at figuring out what we actually like as opposed to what we just Facebook-like, we’ll start to actually like Facebook itself a little more than we do today. – Slate/The Washington Post News Service
