How Cambridge Analytica Exposes Life in Facebookistan

The New York Times and The Observer of London made big waves Saturday with a front-page story adding more details to what is known about data-firm Cambridge Analytica’s role in the 2016 election, revealing with the help of whistleblower and former CA employee Christopher Wylie that the company “harvested private information from the Facebook profiles of more than 50 million users without their permission, making it one of the largest data leaks in the social network’s history.”

That framing, by reporters Matthew Rosenberg, Nicholas Confessore, and Carole Cadwalladr, led many observers to refer to what happened as a “breach” and had lawmakers in both the U.S. and the U.K. rushing to call for hearings on why Facebook hadn’t told its users that their information was being used. For example, Senator Amy Klobuchar (D-MN) said she wants Facebook CEO Mark Zuckerberg to come before the judiciary committee to explain “what Facebook knew about misusing data from 50 million Americans in order to target political advertising and manipulate voters.”

As usual with tech and politics, bad reporting combined with weak technical literacy among politicians is confusing matters (though of course I’m all for Zuck being asked questions by Congress). As Facebook VP and general counsel Paul Grewal explained in a company blog post late Friday, this was not a data leak or breach. In fact, as The Intercept’s Mattathias Schwartz reported a year ago, an academic researcher at Cambridge University named Aleksandr Kogan had developed a Facebook app that promised users a personality profile in exchange for their taking a quiz and allowing the app (like tens of thousands of other Facebook apps) to access private information from their profiles and those of their friends. To reach a large number of people, he paid American taskers on Mechanical Turk $1 or $2 each to download the app and take his survey, and from a seed group of nearly 300,000 was able to access personal data of what The Intercept said was 30 million American Facebook users. OK, so they were off by 20 million.

As Facebook lawyer Grewal was careful to note, the 270,000 people who downloaded Kogan’s app “gave their consent for Kogan to access information such as the city they set on their profile, or content they had liked, as well as more limited information about friends who had their privacy settings set to allow it.” Still, the Times/Observer story showed that while Kogan had Facebook’s OK to use this information for “academic purposes,” the revelation that it was indeed handed to Cambridge Analytica and used for election targeting purposes caused Facebook to ban CA, its parent company SCL, Kogan, and somewhat oddly, whistleblower Wylie, from the site. And Wired’s Issie Lapowsky usefully notes that engineers inside CA recall seeing a database called “Kogan-import” in their system. 

To be crystal clear, up through 2014 it was still completely legal under Facebook’s API rules for Kogan and by extension CA to get their hands on this kind of personal data. The 2012 Obama campaign built an app that amassed one million users, and its own data scientists bragged post-election (as in this story on our old site techPresident by Nick Judd) that it had been very effective in enabling the “targeted sharing” of content and calls to action to the friends of Obama supporters. As GOP data wonk (and NeverTrumper) Patrick Ruffini helpfully reminds us, the Obama team said back then that they could reach 98 percent of the U.S. Facebook population in 2012 off that one million user base.

Asked about this now, Rayid Ghani, who was chief data scientist for the Obama 2012 campaign, tells me that unlike Cambridge Analytica, “We did not buy or access any Facebook profile data that was collected for another purpose. We explicitly asked our supporters to give us permission (through the standard Facebook protocols) to access this data. This data was only used to ask for their help in contacting their Facebook friends (through Facebook sharing and tagging) for a variety of asks (registration, turnout, etc.).” Carol Davidsen, who was the director of integration and data analytics for Obama 2012, reminds us that when the campaign realized it lacked the phone numbers for many young people in swing states, it was able to dive deep into the friend lists of the one million people who had signed onto the campaign’s Facebook app. Roughly 85% of the missing numbers were found there, according to this story in Time. She tweets that “Facebook was surprised we were able to suck out the whole social graph, but they didn’t stop us once they realized that was what we were doing.” She adds, “They came to office in the days following election recruiting & were very candid that they allowed us to do things they wouldn’t have allowed someone else to do because they were on our side.”

Back in 2012, I was on a panel at SXSW with then-Obama digital director Teddy Goff, where I criticized how both major campaigns were digging deeply into their supporters’ social network profile information, a process I called “Facebookization.” You can see a screenshot of what the Obama app was collecting in a post I did about that panel. Later that year, my friend and colleague Zeynep Tufekci took to the pages of The New York Times to decry how the Obama data whizzes had begun to weaponize micro-targeting, but few people listened then. (Unfortunately, none of us early critics thought it useful or plausible back then to suggest that the biggest threat from data-driven targeting would be a foreign actor like Russia using it to mess with our democratic process.)

But now that the tables have turned and you can’t fail to sell a story with the words “Russia, Cambridge Analytica, Facebook and Trump” in its headline*, it’s more than a little annoying to see outlets like The Guardian hyping their coverage of this latest news with headlines like “I made Steve Bannon’s psychological warfare tool.” We still don’t know if CA’s claims about its targeting and messaging hyper-powers are much more than tech vendor hype, as our contributing editor Dave Karpf has pointed out previously in these pages. That said, it does appear Wylie can prove that Cambridge Analytica’s CEO Alexander Nix was lying or misinformed when he told British MPs that his company hadn’t harvested or used Facebook user data in its election work, and given the tougher privacy laws across the pond this is serious indeed.

Forgive me for being frustrated by the spectacle of lawmakers suddenly asking questions about what is essentially Facebook’s business model as if they were born yesterday, and of investigative reporters hyping “data breaches,” when these very same lawmakers and journalists run campaigns and work for corporations that collect gobs of user data with little awareness of it themselves. To take one contemporaneous example, here’s Facebook security chief Alex Stamos schooling The Guardian for its overheated coverage, pointing out that the paper’s own Android app also collects a lot of personal user data.

Still, as Justin Hendrix points out with his latest post on Just Security, it would be useful to know if the various Trump campaign officials who have testified to Congress about Cambridge Analytica’s role have been truthful, if there were campaign finance violations, and also if Facebook had a greater duty to protect its own users’ data than it exercised. On that last point, two former top staffers at the Federal Trade Commission told the Washington Post Sunday that Facebook may be in serious violation of a privacy consent decree it agreed to back in 2011 that they helped craft. The decree “required that users be notified and that they explicitly give their permission before data about them is shared beyond the privacy settings they have established,” making it a key question whether the Facebook privacy permissions at the time were so broad that the decree’s rules were routinely violated.

We also still don’t know, as Karpf has written, whether the 50 million user profiles obtained by CA were actually used to help the Trump campaign (one presumes the data was matched to the voter file and offered more useful insights than the usual match to consumer data); whether so-called psychographic targeting actually works any better than just generally targeting people by their known identity traits; and, probably most important, whether Cambridge Analytica played a much more serious behind-the-scenes role in helping get stolen DNC emails leaked or in trying to get hold of Hillary Clinton’s deleted emails.

That said, don’t hold your breath waiting for either Congress or Facebook to take meaningful steps to rein in the rampant use of people’s profile data for political or commercial targeting; or for mainstream media to do a better job of explaining the difference between a “data breach” and the regular everyday abuse of data people have blindly and willingly given up because they’re too lazy to read a so-called “privacy policy”; or for anyone to feel much sympathy for Facebook in the current moment given how it has lorded it over the rest of us, in media and beyond, for so long.

I’ll let Siva Vaidhyanathan of the University of Virginia, who has a new book coming out later this year on Facebook, get the last word: “If back in 2004 Mark Zuckerberg had sought my counsel I would have told him to ‘move slowly and fix things.’ That may be why he’s rich and I’m not and why I’m a professor and he’s not. But I’ve yet to break democracy. So there’s that.”

*Yes, I know I’m not innocent!