Over the long Memorial Day weekend, a Twitter storm blew in about bots, those little automatic programs that talk to us in the digital dimension as if they were human.

What first caught the attention of Darius Kazemi was the headline on an article from NPR, “Researchers: Nearly Half of Accounts Tweeting About Coronavirus Are Likely Bots” — which Hillary Clinton retweeted to her 27.9 million followers — and a similar headline from CNN.

Mr. Kazemi thought, “That seems like a lot.” An independent researcher and internet artist in Portland, Ore., and a 2018 Mozilla Fellow, Mr. Kazemi has spent considerable time studying the nature and behavior of bots. Stereotypically, bots run amok on social media, at Russia’s behest. Some would argue that there is a vast and often troublesome population of bots out there: In one recent paper — “What Types of Covid-19 Conspiracies Are Populated by Twitter Bots?” — the author noted that some bots were hijacking Covid-19 hashtags with disinformation and conspiracy hashtags, such as #greatawakening and #qanon.

But Mr. Kazemi thinks the bot plot against America is exaggerated.

There are major unknowns: How pervasive are nefarious bots, really? What is their real effect? Don’t they mostly tweet at each other? And, fundamentally, what is a bot? (For instance, sometimes it is difficult to tell a bot from a troll, which is an antagonistic human just spoiling for a fight, or a cyborg, which is a human-run account that intermittently deploys a bot.)

Mr. Kazemi also makes bots; he has been called “a deeply subversive, bot-making John Cage.” (His bot “Two Headlines” crawled Google News, picked two headlines at random and mashed up keywords on Twitter, for example: “ABBA crosses Korean border for summit.”) He defines a bot as “a computer that attempts to talk to humans through technology that was designed for humans to talk to humans.”

Skeptical of the “nearly half” claim, Mr. Kazemi found the source of the article, a news release from Carnegie Mellon University about the research of Kathleen Carley, director of the C.M.U. Center for Computational Analysis of Social and Organizational Systems; since January, Dr. Carley had collected more than 200 million tweets discussing the coronavirus or Covid-19. “We’re seeing up to two times as much bot activity as we’d predicted based on previous natural disasters, crises and elections,” she said in the release.

Mr. Kazemi had hoped to find a research paper, with data and code; no luck. “That was disheartening,” he said. Yoel Roth, Twitter’s head of site integrity, tweeted that the company had “seen no evidence to support the claim that ‘nearly half of the accounts Tweeting about #COVID19 are likely bots.’” He included a thread from the Twitter Communications team labeled “Bot or not?” that walked through the taxonomic nuances.

Dr. Carley said in an interview that she was reluctant to provide data before publication because she didn’t want to be scooped; she also didn’t want to violate Twitter’s terms of service. (The terms allow distribution of tweet and user I.D.s for peer-review or research validation, but the details can get complicated.)


“The last time we sent out a bot paper with the data at the same time, someone else stole our data and published our paper before we did,” Dr. Carley said. “Stuff will come out when it gets accepted for publication.” She added that she decided to share preliminary findings in response to queries from journalists and colleagues: “It seemed important that people knew about Covid-19. We thought we were doing a service.”

Scientific preprints are proliferating during the coronavirus pandemic, with researchers rushing to release timely results. And news outlets can be overzealous in jumping on results without a critical lens, much less analyzing the data. But the dearth of data was a red flag for Mr. Kazemi. He dug in with Twitter threads: Unless we posit that there are more bots than people out there on social media, he wrote, “there needs to be extremely good data to make a claim that half of all conversation about Covid-19 is from bots. The burden of proof is huge and not met.”

Others weighed in on Twitter as well. Kate Starbird, director of the Emergent Capacities of Mass Participation Laboratory at the University of Washington, asked: “Are automation & manipulation still a problem here? Yes. Should Twitter do better? Absolutely. But we researchers need to be precise in how we talk about different behaviors, including how we label ‘bots.’”

Brendan Nyhan, a professor of government at Dartmouth College, said: “Argh. What matters is the number of tweets people *see*. Bots can post infinity tweets into the ether. *Measure exposure not tweets.*”
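Dr. Nyhan's distinction can be made concrete with a small sketch. The accounts and numbers below are invented for illustration and do not come from any of the studies discussed here; the `Account` fields and both function names are hypothetical. The point is only that the share of tweets posted by bots and the share of views those tweets receive can differ widely.

```python
from dataclasses import dataclass


@dataclass
class Account:
    name: str
    is_bot: bool
    tweets: int       # tweets posted on the topic (made-up figure)
    impressions: int  # times those tweets were seen (made-up figure)


def bot_share_of_tweets(accounts):
    """Fraction of all tweets that were posted by bot accounts."""
    total = sum(a.tweets for a in accounts)
    return sum(a.tweets for a in accounts if a.is_bot) / total


def bot_share_of_exposure(accounts):
    """Fraction of all impressions that landed on bot tweets."""
    total = sum(a.impressions for a in accounts)
    return sum(a.impressions for a in accounts if a.is_bot) / total


# Hypothetical data: two prolific bots that almost nobody sees,
# and two human-run accounts with large audiences.
accounts = [
    Account("news_outlet", False, 20, 900_000),
    Account("ordinary_user", False, 30, 90_000),
    Account("spam_bot_1", True, 600, 6_000),
    Account("spam_bot_2", True, 350, 4_000),
]

print(f"bot share of tweets:   {bot_share_of_tweets(accounts):.0%}")
print(f"bot share of exposure: {bot_share_of_exposure(accounts):.0%}")
```

In this toy data set the bots post 95 percent of the tweets but account for 1 percent of what people see, which is the gap between counting tweets and measuring exposure.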

Alex Stamos, director of the Stanford Internet Observatory, called it “L’Affair COVID Bots,” and noted, “Disinformation about disinformation is still disinformation, and is harmful to the overall fight.”

In early June, a similar story emerged about bot prevalence in the Twitter discourse around the protests over the killing of George Floyd. An article in Digital Trends reported that bots were spreading conspiracy theories and disinformation around the protests and the Black Lives Matter hashtag. The story cited Carnegie Mellon research indicating that 30 to 49 percent of accounts tweeting about the protests were bots.

These claims again raised skepticism and concern, from Mr. Kazemi and others.

Joan Donovan, research director of Harvard’s Shorenstein Center on Media, Politics and Public Policy, said that academics, when they release novel and shocking findings — whether publishing in a journal or by news release — have a responsibility to provide the evidence. “Dropping a statistic into the world without any explanation of what kind of content is attached is particularly troubling, especially related to the Black Lives Matter hashtag,” she said.

Dr. Carley, elaborating in a phone interview, said that she had a few ongoing social media projects, including studies on Covid-19 and the election. She uses a bot-detection tool developed at C.M.U. called Bot-hunter.

“I have said to everyone who has asked me, bots in and of themselves are not nefarious,” Dr. Carley said. “Bots are just software. They are used for good things, and they are used for bad things.”

She noted that of all the Black Lives Matter tweets collected so far in her research (bot and not), 90.6 percent were in support of the movement, 5.6 percent were not supportive, and the balance were neutral. The subset of bot tweets, she said, “did not appreciably affect those ratios” — bots were expressing overwhelming support for the protests, and often they were simply retweeting news, or rebroadcasting messages from the World Health Organization or Centers for Disease Control and Prevention.

How to find a real bot

Motivated by the headlines, Mr. Kazemi began a bot audit in the intervening days, manually inspecting data sets of suspected bots and verifying their existence in the wild. He focused on data used to train the machine learning algorithm that drives Botometer, a bot-detection tool by the Network Science Institute and the Center for Complex Networks and Systems Research at Indiana University, which “checks the activity of a Twitter account and gives it a score based on how likely the account is to be a bot.” A score of 0 is most humanlike; a score of 5 is most bot-like.

Other researchers do similar work. Manlio De Domenico, a physicist at the Bruno Kessler Institute in Trento, Italy, created the “Covid19 Infodemics Observatory,” which surveys about 4.5 million tweets daily. During the peer-review process for a paper, “Assessing the risks of ‘infodemics’ in response to Covid-19 epidemics,” his lab validated 1,000 user accounts. (The analysis took 12 people two weeks to conduct.)

Jonas Kaiser, of Harvard’s Berkman Klein Center for Internet & Society, and Adrian Rauchfleisch, of National Taiwan University, audited Botometer for their preprint paper, “The False Positive Problem of Automatic Bot Detection in Social Science Research.” Dr. Kaiser noted that algorithms are only as good as their training sets and generally perform worse when applied to unknown data.

“We found that the tool that is generally understood to be the ‘gold standard’ of the field is unreliable with its detection of bots, and it gets worse when tracking the bot classifications over time as well as for other languages,” Dr. Kaiser said.
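Why false positives matter so much for headline figures can be shown with simple base-rate arithmetic. The sketch below is not the method used in the preprint or by any detection tool; the prevalence, detection rate and false positive rate are assumed values chosen for illustration, and the function names are hypothetical.

```python
def flagged_share(prevalence, true_positive_rate, false_positive_rate):
    """Share of all accounts a detector labels as bots.

    prevalence: actual fraction of accounts that are bots
    true_positive_rate: fraction of real bots the detector catches
    false_positive_rate: fraction of humans wrongly labeled as bots
    """
    caught_bots = prevalence * true_positive_rate
    mislabeled_humans = (1 - prevalence) * false_positive_rate
    return caught_bots + mislabeled_humans


def precision(prevalence, true_positive_rate, false_positive_rate):
    """Fraction of flagged accounts that really are bots."""
    caught_bots = prevalence * true_positive_rate
    return caught_bots / flagged_share(
        prevalence, true_positive_rate, false_positive_rate
    )


# Assumed values for illustration only: 5% of accounts are bots,
# the detector catches 90% of them and mislabels 30% of humans.
share = flagged_share(0.05, 0.90, 0.30)
prec = precision(0.05, 0.90, 0.30)

print(f"accounts flagged as bots: {share:.0%}")
print(f"flagged accounts that are bots: {prec:.0%}")
```

Under these assumptions a detector would flag about a third of all accounts as bots even though only 5 percent are, and roughly six of every seven flagged accounts would be human.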

Michael Kreil, a data journalist in Berlin, has been auditing bots since shortly after the 2016 U.S. election. Late last year he gave a talk titled “The Army That Never Existed.” The précis: “Social bots have influenced elections. Does it sound plausible? Yes. Is it scientifically founded? Not at all.”

Defining the bot is a tricky problem; technically, it could be any automated account, like a news aggregator, or amplification software, like Hootsuite. Mr. Kazemi found many bots tweeting about Covid-19, including neighborhood health clinics using marketing software to post daily pandemic P.S.A.s about washing your hands.

He also found that humans were often mistaken for bots. Consider the “grandpa effect,” as he called it: people who were mistaken for bots because they used social media in “uncool or gauche” ways, he said. Users fond of hitting the share button on news articles also produced false positives. This led Mr. Kazemi to wonder whether Botometer should be renamed “Normiemeter.” He tweeted: “Can you imagine the headlines? ‘50% of accounts tweeting about Covid are normies.’”

There was also normal fandom behavior, such as the progressive K-pop fans who overwhelm social media algorithms to get topics trending — they rallied around the Black Lives Matter movement. There were burner accounts of people engaging with porn and following lots of accounts, with few or zero followers. And there was a black South-African woman who liked to respond with walls of congratulatory emojis whenever she saw other black women succeeding in their careers.

One morning on Twitter, Mr. Kazemi put out a call for bot sightings, and he asked people what made them think they had spotted a bot. About half the respondents cited the Twitter handles with multi-digit suffixes, like @Darius98302127. But as Mr. Kazemi himself recently learned, new users (since at least late 2017) are not initially given the option of choosing a username; they are automatically assigned a numerically original handle, which many don’t bother to change. For the other respondents, the term “bot” was a slur — shorthand for, “I don’t agree, and I think this position that the other person holds is so outrageous that it couldn’t possibly be held in good faith by a human.”

Do bots matter?

The problem of what is or what is not a bot may be too slippery to solve — in part because bots are continually evolving. As Mr. Kazemi noted, “It’s a bit like when Supreme Court Justice Potter Stewart famously said of pornography, ‘I know it when I see it’” — which, Mr. Kazemi added, is not an ideal strategy.

The more important and perhaps even more difficult issue is how to measure the impact of bots on the collective discourse. Do bots change our beliefs and behaviors?

“We want to understand what type of susceptible populations engage with them and what types of narratives resonate,” said Emilio Ferrara, a computer scientist at the University of Southern California and the author of the “Covid-19 Conspiracies” bots paper. The holy grail of bot research, he said, is to understand whether bots matter.

“Many people would agree that, yeah, maybe there are tons of bots,” he said. “But if nobody cares about them — maybe they get suspended right away and not a large share of the audience sees their content — it’s less problematic.”

Sarah Jackson, an associate professor at the Annenberg School for Communication at the University of Pennsylvania, said that it was more important to focus on where the bots are in networks and with whom they interact. Dr. Jackson is a co-author, with Moya Bailey and Brooke Foucault Welles of Northeastern University, of the book “#HashtagActivism: Networks of Race and Gender Justice.” Studying dozens of #BlackLivesMatter networks, the authors found that spam and delegitimizing bots were almost always on the periphery, interacting with very few real people.

“So, even if there are a lot of bots in a network, it is misleading to suggest they are leading the conversation or influencing real people who are tweeting in those same networks,” Dr. Jackson said.

But bots have also been adopted by organizations and activists in social movements as effective vehicles for catalyzing change. Dr. Jackson pointed out that bot-detection algorithms flag what might be considered atypical human behavior: People don’t typically tweet 24 hours a day, or 1,000 times an hour, or create new accounts only to delete them once they amass a following. “But these are all normal and expected behaviors for people documenting protest activities,” she said.

And as Mr. Kazemi observed in one of his threads describing another class of false positives: “You know who uses Twitter in a way that the vast majority of people who hold Ph.D.s do not? Disenfranchised populations.”

Meanwhile, the self-identifying “Galaxy Brain Bot” — his favorite bot of 2020 — scores a mere 1.8 on Botometer.