Viewer-Response App Worms its Way into New Zealand Debates

Can the audience-response tool known as the worm influence who "wins" a political debate—or an election?

The studio audience response to the Leaders’ Debate featuring Labour Leader Phil Goff and Prime Minister John Key is reflected in the “worm” shown onscreen. / Screenshot TV3




Case Study: The Reactor, an audience-response tool often referred to as the “worm” 
Country: New Zealand
Debate: TV3 Party Leaders’ Debate, Nov. 21, 2011

On the evening of Nov. 21, 2011, New Zealanders tuned in to watch the leaders of the two major parties face off in one of the final debates before the general election. At one podium stood the incumbent, John Key, leader of the center-right National Party. At the other stood his challenger, Phil Goff, leader of the center-left Labour Party. Between them was the journalist John Campbell, who posed questions to the candidates and refereed their responses. And along the bottom of the screen, reflecting viewers’ reactions to the debate, was the worm.

Viewers downloaded an audience-response app to their smartphones. The app offered a slider on a scale from 0 to 100. If they liked what the candidate on their TV screen was saying, they dragged the slider closer to 100. If they didn’t, they dragged the slider closer to 0. These responses were then transmitted to Roy Morgan Research, the Australian market research firm that developed the technology, and aggregated to draw the “Reactor” (commonly referred to as the worm): a line that crawled along the bottom of the broadcast. If most viewers were reacting positively, it went up. If most viewers were reacting negatively, it went down.
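The aggregation step can be pictured with a short sketch. Roy Morgan Research has not published how the Reactor combines responses, so the version below is only an illustration under a stated assumption: each point on the line is the plain average of all slider positions received during one time tick. The function name and the neutral default of 50 are hypothetical.

```python
from statistics import mean

NEUTRAL = 50.0  # assumed midpoint of the 0-100 slider


def worm_points(readings_by_tick):
    """Turn per-tick slider readings into one line value per tick.

    readings_by_tick is a list of lists; each inner list holds the
    0-100 slider positions received during one time tick.
    Assumption: the line is a simple average (not the vendor's
    documented method). Ticks with no readings fall back to neutral.
    """
    return [mean(tick) if tick else NEUTRAL for tick in readings_by_tick]


# Three ticks: a neutral start, a well-liked answer, a poorly received one.
ticks = [[50, 50, 50], [80, 70, 90], [20, 40, 30]]
line = worm_points(ticks)
print(line)
```

Under this assumption the line rises when most sliders sit above 50 and falls when most sit below it, which matches the behavior described above.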

The idea of gauging voters’ gut reactions and representing them as a moving line had been around for years. First used in Australia in 1993, the worm has since shown up in British and U.S. debates as well. But all previous uses, in New Zealand and elsewhere, had relied exclusively on a studio audience: A small sample of undecided voters watched the debate together and keyed their reactions into hand-held devices. What distinguished the 2011 debate was that anyone with a smartphone or tablet could join in.

A History of Influence and Inconsistency

This innovation offered far greater potential for audience participation—but it also brought new risks. No sooner had the plan been announced than critics attacked it. Their first concern was its vulnerability to manipulation. One advantage of using a smaller sample was control: Debate producers could select voters who at least claimed to be uncommitted, and thus more likely to react with an open mind. A smartphone audience, on the other hand, might include many partisans who only gave their candidate high marks.

The other issue raised by critics was access. In 2011, only about 5 percent of New Zealanders owned smartphones. And, as one left-wing blogger pointed out, these tended to be wealthier citizens, who largely voted for the center-right National Party rather than the center-left Labour Party.

Aside from these smartphone-specific concerns, there were more general criticisms of the worm itself. Ever since it first started slithering across the country’s TV screens in 1996, New Zealanders have been arguing about whether it helps or hurts their democracy. The worm’s defenders claim it offers a useful measure of voter sentiment, and a refreshing counterpoint to the pundits and politicians who usually dominate the airwaves around election time. Its critics argue that it represents, at best, a trivial distraction, and at worst, a damaging distortion of the democratic process.

Reactor app by Roy Morgan Research

Waikato University political scientist Ron Smith blamed the worm for rewarding “worm whisperer” politicos who prefer displays of emotion and crowd-pleasing platitudes to serious policy discussion. In 2005, the High Court of New Zealand issued an opinion stating that the worm may undermine the objectivity of election coverage.

Those fears are not limited to New Zealand. In Australia, analysts have found that the worm tends to tip in favor of progressive politicians and in response to positive language. Tell an inspirational story, and the worm will respond favorably. Go negative or use sarcasm, and you’re less likely to be rewarded.

“If you say ‘this is the greatest city or country in the world’, the worm goes through the roof. If you say ‘well, the problem we’ve got is our hospitals don’t work’, it goes down. But we’ve found it unerringly accurate over the 20 years we’ve done it,” Ray Martin, a newsman who moderated debates in Australia for Channel Nine News, told the BBC.

In 2007, Australia’s prime minister agreed to a televised debate with his Labor Party opponent only if none of the three networks covering the debate used the worm. Channel Nine News did anyway and had its live feed cut, sparking cries of political censorship.

While some researchers have argued that real-time measurement is a source of raw data, others have shown that the worm may impede viewers’ ability to form their own judgments. Three British researchers reported on an experiment with 150 university students in which they manipulated the worm and superimposed it on a live broadcast of the final of three UK election debates in 2010. They found that the manipulation influenced not only viewers’ judgments of who had “won” the debate, but also their choice of preferred prime minister.

“Apart from the concerns about unintentional bias, there is a real possibility that the worm could be used to systematically bias the outcome of the election,” said Jeffrey Bowers, a psychology professor at the University of Bristol who was involved with the study. “Given the small sample of undecided voters that generate the worm, just one or two persons could influence the worm by voting for one candidate no matter what. The system is cute, but open to abuse.”
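Bowers’s point about one or two determined panelists can be checked with simple arithmetic. The sketch below is illustrative only: it assumes the worm is a plain average of slider positions, that honest panelists sit at a neutral 50, and that the partisans pin their sliders at 100. The function name and default values are hypothetical, not drawn from the study.

```python
def mean_with_pinned(n_total, n_pinned, honest_score=50.0, pinned_score=100.0):
    """Average slider value when n_pinned of n_total participants
    hold their slider at pinned_score and everyone else sits at
    honest_score. Assumes the worm is a simple mean."""
    honest = n_total - n_pinned
    return (honest * honest_score + n_pinned * pinned_score) / n_total


# Two partisans in a 20-person studio panel versus in 5,000 app users.
small_panel = mean_with_pinned(20, 2)
large_audience = mean_with_pinned(5000, 2)
print(small_panel, large_audience)
```

With these assumed numbers, two partisans lift a 20-person panel from 50 to 55, while the same two people move a 5,000-person audience by only 0.02 points.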

[Above: Studio audience responses during the first of three leaders’ debates in the 2010 UK general election.]

In 2013, the British Parliament’s Select Committee on Communications held hearings on televised political debates and invited another of the study’s authors, Colin Davis, also a psychology professor at the University of Bristol, to testify. “Even without any deliberate bias, it’s very unlikely that the worm provides an accurate indication of the views of undecided voters, given that it is based on such a small sample,” said Davis, referring to the fewer than two dozen people selected to use the worm during debates hosted by ITV News and the BBC in 2010. (Responses were shown during post-debate analysis, not during the live debate.)

Also testifying was Alan Schroeder, a journalism professor at Northeastern University and author of “Presidential Debates: 50 Years of High-Risk TV.” “I think [the worm] is ridiculous,” he told the committee. “First of all, I think social media, particularly Twitter, has supplanted the worm. The real-time, real reaction of the audience is now measurable in ways that make the worm obsolete.”

[Another group of British researchers who see the limitations of the worm—and also Twitter—have created a debate response app that aims to provide more insight into voters’ attitudes. See the previous story in the Rethinking Debate series: “Making the UK’s Political Debates More Responsive to Public Needs.”]

But can the worm actually sway an election? The record in New Zealand is inconclusive. When the worm debuted in 1996, it spurned incumbent Prime Minister Jim Bolger and favored his challenger, Labour leader Helen Clark. Clark enjoyed a subsequent boost in the polls, while Bolger went ballistic, threatening to boycott any future debates that included the worm. “Some days the worm will eat you, some days you’ll eat the worm,” joked the Waikato Times.

The worm remained for the other debates, over Bolger’s objections. But it didn’t change the result: Bolger’s party won.

Yet in 2002, the worm was widely credited with catapulting a little-known politician into power. During a televised debate among the leaders of minor parties, a centrist named Peter Dunne sent the worm soaring by repeating popular phrases like “common sense” and “family.” In the aftermath, Dunne became a media sensation: “The man for whom the worm turned up trumps,” declared the New Zealand Herald. And when New Zealand went to the polls 12 days later, Dunne’s party had its best-ever showing, picking up eight seats in Parliament.

Did Dunne deserve his rapid ascent from minor politician to “potential kingmaker”? It depends on whether you believe that Dunne was a manipulative “worm whisperer,” or that he spoke to a genuine need in the electorate not represented by mainstream politicians. In any case, his rise only fanned the flames of controversy around the worm’s role in New Zealand politics.

Studio vs. Smartphone Worms

Given this highly charged history, when New Zealand became the first country to open up the worm to smartphones in 2011, the debate’s producers were cautious. One might expect them to play up this innovation’s world premiere. Instead, they consistently downplayed it—no doubt in response to concerns about its accuracy and fairness, and the worm’s polarizing legacy in New Zealand politics.

Reactor app screenshots on Google Play

There were in fact two worms in action during the debate. The first was the old-fashioned handset kind, controlled by the TV3 studio audience. These were 65 uncommitted voters: “a truly representative sample of New Zealand in terms of their age, gender, and ethnicity,” debate moderator John Campbell explained in his opening remarks. The second worm would be driven by viewers at home on their smartphones.

“While there will be people reacting with open minds to what each leader is saying, there will also be party supporters wildly responding to who’s on screen at the time,” said Campbell.

Campbell urged people to pay closer attention to the former. “The second worm is not scientific,” he warned. “It is simply a chance to have your say.”

The CEO of Roy Morgan Research, Michele Levine, echoed this message in an interview the day after the debate. “The studio audience is supposed to be the important one,” she said.

These carefully worded caveats suggested the smartphone-driven worm would be highly volatile, while the studio audience worm would offer a more measured response. In fact, the opposite was true. During the hour-long debate, the app worm remained relatively calm, staying close to the horizontal axis marked “Neutral.” 

It had moments of responsiveness: ticking downwards when National leader John Key discussed his plans to privatize public assets, for instance, and upwards when Labour leader Phil Goff argued those assets should remain state-owned. But overall, the app worm was far less excitable than the studio audience worm, which cut sharp peaks and valleys across the bottom of the screen all night.

Not only was the studio worm more reactive than the app worm, it was also considerably more one-sided. The app worm didn’t seem noticeably partisan, and it certainly didn’t reflect a National bias, as some Labour observers had feared. By contrast, the studio audience heavily favored Goff over Key. The Labour leader repeatedly drove the studio worm to the top of the graph, while Key put it into a tailspin.

Despite the moderator’s insistence that “none of us here at the studio will know what the worms are saying,” it seems likely that Goff, the underdog in the polls, hoped to reproduce Peter Dunne’s 2002 success by emulating his “worm whisperer” ways. Goff repeatedly struck an emotional note by telling the stories of individual New Zealanders who had suffered as a result of the financial crisis. His empathetic style and his focus on economic justice resonated strongly with the studio audience, leading most observers to conclude that Goff won the debate.

In fact, the result was so lopsided that conservatives immediately cried foul. In the hours after the debate ended, right-wing blogger David Farrar produced evidence that four members of the studio audience were covert Labour and Green Party supporters, sparking a scandal that swirled through New Zealand media for days.

Whatever the sources of Goff’s success, however, it didn’t do much for him at the polls. After the debate, Labour’s numbers remained flat at about 27 percent, vs. National’s 50 percent. And this proved to be a fairly good predictor of the final tally on election day: 27.48 percent for Labour and 47.31 percent for National.

It was ironic: After all the anxiety about the possibility of the app-driven worm being hijacked by partisans, it was the studio worm that triggered accusations of bias. This raised an interesting possibility: Perhaps the much-maligned at-home audience wasn’t a mob of party fanatics, as nearly every pundit in New Zealand claimed it would be. Perhaps its reactions were more typical of the electorate as a whole than those of the 65 citizens who composed the studio audience.

After all, how “representative” is a group of voters who are still undecided five days before the election? Leaving aside questions of Labour infiltration, such voters may react very differently than most of the electorate. They also reflect a geographical bias. As Auckland market researcher Jonathan Dodd pointed out in The National Business Review, these panels draw their members from the city where the debate is held. Voters from Auckland, where the 2011 worm debate took place, may have a different set of concerns than voters in other parts of the country.

Another major advantage of the smartphone worm is sample size. The 2011 debate had 65 voters in the studio audience, but worms in other countries have used even fewer—50 in the United Kingdom, and 30 in the United States. As science writer Simon Oxenham has observed, these “absurdly tiny sample sizes” provide a highly skewed picture of public opinion. Enabling audience participation through smartphones has the potential to make the worm a more accurate indicator of voter sentiment—but only if it engages a large, diverse, and geographically dispersed user base.
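The sample sizes mentioned here can be put on a common scale with the standard 95 percent margin of error for a proportion. This is a rough sketch, and it rests on an assumption that does not hold for a self-selected app audience: that participants are a simple random sample of voters. It therefore shows only how much noise sample size alone introduces. The figure of 5,000 app users is a hypothetical used for comparison, not a reported number.

```python
import math


def margin_of_error(n, z=1.96, p=0.5):
    """Approximate 95% margin of error for a proportion, assuming a
    simple random sample of size n (worst case at p = 0.5)."""
    return z * math.sqrt(p * (1 - p) / n)


# Panel sizes cited in the article, plus a hypothetical 5,000 app users.
for n in (30, 50, 65, 5000):
    print(f"n={n}: +/- {100 * margin_of_error(n):.1f} percentage points")
```

Under that assumption, a 30-person panel carries a margin of roughly 18 percentage points and a 65-person panel about 12, while 5,000 respondents would bring it down to about 1.4. A larger sample narrows random error only; it does nothing to correct for who chooses to take part.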

Will the Debate Worm Wiggle On?

Unfortunately, this didn’t happen in New Zealand in 2011. Roy Morgan Research never reported how many people used the “Reactor” app during the debate, but the available evidence suggests its use that night was limited. More New Zealanders use Android smartphones than iPhones; on the Google Play store, the Reactor app shows 5,000 to 10,000 downloads, and that figure covers the last five years, not just the night of the debate.

The country’s low rates of smartphone penetration certainly played a role, but so did widespread voter apathy. Voter turnout in 2011 was 74.21 percent—the country’s lowest since 1887. And even in an especially listless election season, the debate with the worm was the least-watched of the three major debates between Key and Goff: Only 276,000 New Zealanders tuned in, compared with around half a million for the others.

In 2011, New Zealand introduced an important innovation into the world of political debates—but left its potential unfulfilled. And in the years since, no one has repeated the experiment. The most recent elections in New Zealand in 2014 were worm-free, and the worm’s other television appearances around the world, including the UK in 2015, have relied only on studio audiences.

Meanwhile, smartphone penetration has grown rapidly. Device costs have plummeted, while processing power and bandwidth have increased. The conditions for an app-driven worm are much better today than they were four years ago, both in New Zealand and elsewhere. Yet so far, no debate producer has taken it on.

Of course, old concerns remain. A smartphone worm can enlarge the sample size and make it more representative of the electorate. But it does little to quiet critics’ fears that real-time audience feedback undermines the quality of the debate by pressuring candidates to play for the worm with pandering rhetoric. Several thousand people are likely to be just as susceptible to “worm whisperers” as a few dozen.

Nor will a smartphone worm provide a better prediction of voter preferences on election day. Just because a worm likes a candidate doesn’t mean he or she will win. Real-time reactions don’t necessarily translate into votes, as New Zealand’s recent history makes clear.

Ben Tarnoff is a writer. His books include “The Bohemians” and “A Counterfeiter’s Paradise.” 

With additional reporting by Christine Cupaiuolo.

