A World Without Wizards: On Facebook and Cambridge Analytica

If we want to confront these unreasonable times, we must stay clear-eyed about the state of the world and stop imagining omnicompetent villains behind the scenes.

If you had asked me about Facebook’s data policies two months ago, I would have said that (1) Facebook is effectively an information monopoly; (2) its data is almost completely unregulated; (3) this is a dangerous combination; and (4) at some point, it will lead to some very bad headlines.

If you had asked me about Cambridge Analytica (CA) two months ago, I would have said that they seemed like a pretty shady outfit, and were essentially the Theranos of political data. I have said as much several times before, arguing that this is just the latest iteration of an established pattern where we overestimate the power of digital microtargeting and treat marketing claims as established social forces reshaping society. CA’s claims of a major breakthrough in psychometric targeting seemed to me then, and seem to me now, to be complete vaporware: a bold pitch about what their technology can accomplish, with almost nothing to back it up. I also would have said that their ties to conservative billionaire Robert Mercer were evidence of political patronage schemes and potentially illegal coordination among Republican political operatives. But Mercer’s involvement isn’t evidence that CA has made some major advance in digital political communication. It’s evidence of CA building a client list regardless of whether its product works as advertised.

In short, I would have told you that it is much easier to build a political technology product you can sell than it is to build a product that actually works.

The recent revelations from CA whistleblower Chris Wylie haven’t changed my read of the situation much. It’s safe to say that we have arrived at Facebook’s very-bad-headline moment. Facebook spent years giving third-party developers ridiculous access to your data and the data of everyone in your friend networks. It did so back when no one had demonstrated that this data was valuable—and the dirty secret is it still isn’t clear just how valuable this data is.

Facebook cut off third-party friend network access in 2015, but all our old data is still floating around out there. To paraphrase Cornell Tech and Cornell Law School professor James Grimmelmann, the real data breach is what Facebook had been allowing all along. Digital privacy advocates have gone hoarse over the years, shouting about these threats from the rooftops. What we have here is the perfect set of circumstances to get the public to finally listen.

It was already heavily reported before this story broke that CA had a big pile of Facebook data. We didn’t know how huge the pile was. We didn’t know how the company got it. But getting big piles of Facebook data isn’t hard, particularly for well-resourced companies with few scruples! The underlying claim that put Cambridge Analytica in the public spotlight was that Facebook data was central to the Trump campaign’s surprise victory. Did Trump get elected because of a game-changing breakthrough in precision political propaganda, or was Trump’s data operation as haphazard as every other element of the Trump campaign?

And that’s where I remain skeptical. It’s noteworthy that, when reporters ask Wylie about this, his consistent reply has been some version of “it must have been effective, because CA spent millions on it.”

Theranos also spent millions trying to work out the technical challenges of its vaporware product. Spending on research and development is not evidence of success. The evidence for psychographics rests on a handful of laboratory experiments. I do not doubt that, in theory, one could increase persuasive impact by tailoring political messages to the recipient’s psychological profile. The hard part comes when you try to roll this out, at scale, in a real campaign.

Tufts professor Eitan Hersh summed this point up well on Twitter: “In [my book] Hacking the Electorate, I laid out the rationale (with data) about the severe limits of microtargeting in campaigns. Post 2016/Post Cambridge Analytica update: No update required. Every claim about psychographics etc made by or about the firm is BS.”

Hersh continues: “‘Let’s start with fb [Facebook] data, use it to predict personalities, then use that to predict political views, and then use that to figure out messages and messengers and just the right time of a campaign to make a lasting persuasive impact.’ …sounds like a failed PhD prospectus to me.”

The reporting on the Cambridge Analytica/Facebook scandal has danced around this issue by invoking the expert judgement of researchers whose careers are tethered to the success of psychometrics. Wylie himself is still in the psychometrics business. When he describes the overwhelming power of weaponized digital propaganda, he is not a dispassionate analyst; he’s also drumming up future business opportunities. Stanford University Professor Michal Kosinski is also constantly interviewed and profiled in articles about Cambridge Analytica’s psychometric breakthroughs. Kosinski has published several articles on the topic of psychometrics, and is indeed a leading expert on the topic. But his next grant, his next paper, and his tenure case all hinge on convincing peers, peer reviewers, and the public that psychometrics represents a significant breakthrough. One should expect him to offer an optimistic assessment of the current and future viability of psychometrics. Likewise, if you ask researchers at Cambridge University’s Psychometrics Centre whether psychometrics is valuable to political campaigns today, what answer do you think they’re going to give you?

David Graham wrote arguably the definitive piece on Cambridge Analytica in The Atlantic last month, titled “Not Even Cambridge Analytica Believed Its Hype.” He writes:

“If Cambridge Analytica’s psychometrics were as effective as they have claimed, why would they need to even discuss engaging in this type of skullduggery? Forget the morality of engaging honeytraps or sting operations; a company that truly believed it could abseil into voters’ heads with sophisticated data and manipulate them that way would feel no need to make claims about other, less scientific methods… New technological advances have, of course, changed campaigning, but it’s wise to be skeptical of anyone who argues that he and his company have alone discovered some new trick that no one else has—especially if that’s a vague and unproven scientific claim like the one CA makes.”

For many of the researchers and public intellectuals who have weighed in on this topic, the details of Cambridge Analytica’s claims are little more than a sideshow. Who cares how effective CA’s product actually was? Who cares how much harm the company was able to inflict with borderline-pilfered Facebook data? The public is finally concerned about Facebook’s surveillance capitalism! Let’s not squander this moment!

I would like to agree wholeheartedly with them, but I worry that in doing so we are also perpetuating the myth of the digital wizard.

In the research for my book Analytic Activism, I noticed that there are two competing stories that we (researchers, practitioners, journalists, and the public at large) often tell ourselves about “big data.” The first is a story of digital wizards—a new managerial class of data scientists who have now amassed near-omniscient insights into public behavior, and are using these insights to remold society to their whims. There is something comforting and appealing about the digital wizards story. It tells us that there are experts, somewhere, who are in control of things. You can hire those experts. You can challenge those experts. You can learn from those experts, perhaps even becoming an expert yourself. It is a story that fits well within a Silicon Valley pitch-deck or a journalistic story on the latest digital revolution reshaping society.

The second story we tell ourselves is about “build-measure-learn” cycles. It is a story in which data scientists and software engineers don’t have some grand plan or transcendent vision. What they have is an ability to try things out, measure performance, identify problems, fix them, and repeat. It is a story of messy workflows, kludgy and incomplete datasets, and endless trial-and-error. There are no wizards in this second story, no hidden geniuses enacting well-designed plans. There are just people—some very talented, others less so—figuring things out as they go.

The first story is comforting and appealing. But it is a lie. We live in a world without wizards. It is comforting to believe that we arrived at this strange, unlikely presidency (#darkesttimeline) because Donald Trump hired the right data wizards. That would mean Democrats (or other Republicans) could counter his advance in digital propaganda with advances of their own, or that we could regulate our way out of this psychometric arms race. That is a story with clear villains, with clear plans, and well-implemented strategies that the heroes might very well foil next time around. The truth is far muddier.

Psychometrics did not win the 2016 election. Donald Trump didn’t have a secret data engine humming underneath his jalopy of a presidential campaign. A host of unlikely events co-occurred. Racism, populism, and nationalism proved to have a deeper and wider reach than expected. Media elites and political elites failed to take the threat seriously. Facebook, Twitter, and YouTube were awash in misinformation, rewarding bad actors and our worst social impulses. The director of the FBI took unprecedented actions under the assumption that Clinton’s presidency was already assured.

The appeal of the digital wizardry narrative is that it can focus outrage in socially beneficial directions. It is indeed high time for us to take a deep look at Facebook’s power in society. And maybe we wouldn’t be doing that without a distinguishable villain and a clear social failure to sharpen our attention. (As the old saying goes, you should never let a good crisis go to waste.) But I think our biennial tradition of decrying the new age of digital propaganda does more harm than good.

Facebook ought to be regulated because it is an information monopoly. It ought to be regulated because it now has monolithic importance in shaping how citizens get their news, and how the news industry raises the money to pay its employees. The broader sector of data fiduciaries ought to be regulated too—not just companies like Google, but also the companies that make up the larger data-mining industry like Acxiom and Palantir. Acxiom has approximately 6,500 datapoints on every consumer living in the United States. Some of the biggest players in the “big data” industry don’t generate scandal headlines the way Facebook does.

I’m less optimistic that this U.S. government is going to manage the task of regulating them. Zeynep Tufekci recently proposed four legislative remedies to curb Facebook’s power as an unregulated information monopoly. Her ideas are simultaneously bold and eminently reasonable. Her main proposal essentially calls for establishing the equivalent of a Consumer Financial Protection Bureau (CFPB), but for data. Yes! Of course! And yet…

Regulating the online data and online advertising industry is an eminently reasonable idea, but we live in distinctly unreasonable times. This Congress is not going to establish another agency like the CFPB. This Congress is busy dismantling the CFPB and the rest of the regulatory state. This Congress and these regulators appointed by President Trump (along with the career regulators who have stuck around, dispirited, through the first year and a half of the Trump era) lack both the skill and the will to work through the complicated issues involved in serious policy design and implementation. It’s going to take a long time to repair the damage that has been done to existing regulatory agencies. There is little reason to hope this problem will be at the top of the government policy agenda post-2018.

If we want to confront these unreasonable times, I think it’s helpful to stay clear-eyed about the state of the world. That means paying attention to Facebook’s uncurbed power, and the growing potential for harms and abuses coming from an unchecked data industry. But it also means we need to stop imagining omnicompetent villains behind the scenes. We did not get here because Cambridge Analytica enacted a secret plan through weaponized online propaganda. We got here through a series of bumbling, uncoordinated mistakes. We can, perhaps, build-measure-learn our way out of this mess. But it’s going to be a hard, rough, mistake-filled path.