How Metadata From Encrypted Messages Can Keep Everyone Safer

The future is encrypted. Real-time, encrypted chat apps like Signal and WhatsApp, and messaging apps like Telegram, WeChat, and Messenger—used by two out of five people worldwide—help safeguard privacy and facilitate our rights to organize, speak freely, and keep close contact with our communities.

They are intentionally built for convenience and speed, for person-to-person communication as well as large group connections. Yet it is these same conditions that have fueled abusive and illegal behavior, disinformation and hate speech, and hoaxes and scams, all to the detriment of the vast majority of their users. As early as 2018, investigative reports explored the role that these very features played in dozens of deaths in India and Indonesia as well as in elections in Nigeria and Brazil. The ease with which users can forward messages without verifying their accuracy means disinformation can spread quickly, secretly, and at significant scale. Some apps allow extremely large groups, up to 200,000 members, or have played host to organized encrypted propaganda machinery, breaking away from the original vision of emulating a “living room.” And some platforms have proposed profit-driven policy changes that allow business users to leverage customer data in new and invasive ways, ultimately eroding privacy.

In response to the harms that these apps have enabled, prominent governments have urged platforms to implement so-called backdoors or employ client-side automated scans of messages. But such solutions erode everyone’s basic liberties and put many users at greater risk, as many have pointed out. These invasive measures, like other traditional moderation solutions that depend on access to content, are rarely effective for combating online abuse, as shown in recent research by Stanford University’s Riana Pfefferkorn.

Product design changes, not backdoors, are key to reconciling the competing uses and misuses of encrypted messaging. While the content of individual messages can be harmful, it is the scale and virality with which they spread that present the real challenge, turning sets of harmful messages into a groundswell of debilitating societal forces. Already, researchers and advocates have analyzed how changes like forwarding limits, better labeling, and reducing group sizes could dramatically reduce the spread and severity of problematic content, organized propaganda, and criminal behavior. However, such work relies on workarounds such as tiplines and public groups. Without good datasets from platforms, audits of the real-world effectiveness of such changes are hampered.

The platforms could do a lot more. For such important product changes to become more effective, platforms need to share the “metadata of the metadata” with researchers. This comprises aggregated datasets showing how many users a platform has, where and when accounts are created, how information travels, which types and formats of messages spread fastest, which messages are commonly reported, and how (and when) users are booted off. To be clear, this is not what is typically referred to as “metadata,” a term that normally covers information about a specific individual and can be deeply personal to users, such as one’s name, email address, mobile number, close contacts, and even payment information. It is important to protect the privacy of this type of personal metadata, which is why the United Nations Office of the High Commissioner for Human Rights rightly considers a user’s metadata to be covered by the right to privacy when applied to the online space.

Luckily, we do not need this level or type of data to start seriously addressing harms. Instead, companies must first be forthcoming to researchers and regulators about the nature and extent of the metadata they do collect, with whom they share such data, and how they analyze it to influence product design and revenue model choices. We know for certain that many private messaging platforms collect troves of information that yield tremendous insights, useful both for designing and trialing new product features and for enticing investment and advertisers.

The aggregated, anonymized data they collect can, without compromising encryption and privacy, be used by platforms and researchers alike to shed light on important patterns. Such aggregated metadata could lead to game-changing trust and safety improvements through better features and design choices.

As it currently stands, platforms have not shown the willingness to voluntarily share in a way that invites scrutiny and builds trust with researchers and civil society. Most companies operating these messaging services don’t even share basic information about market size or new account creation. For example, though Facebook/WhatsApp did share internal results showing that forwarding limits and labeling significantly tamped down the virality of misinformation, at that time they declined to share more nuanced internal analysis suggesting that the proportion of misinformation rose sharply when there was more than one reshare. Publicly sharing such analysis earlier on would have improved WhatsApp’s track record of transparency and effective solutions and, at the same time, encouraged other players to implement similar design features.

Similar efforts are possible in other areas. For example, in addition to adding friction by labeling messages or reducing virality by restricting forwards, we must evaluate which types of harms are more effectively addressed through mechanisms that depend on access to content versus those that do not, and whether user reporting can be adopted at scale and to what effect. These are all feature and design changes that companies would be able to predict, pilot, and assess only via the metadata they collect, information that is currently visible to them alone.

Distributing power that is exclusively held today by a few influential technology companies to a wider group of stakeholders, including nonprofits, researchers, regulators, and investors, is the only way society will be able to scrutinize the problems at a deeper level, leading to more viable solutions. And by requiring transparency and prioritizing better design features, we can establish guardrails and best practices that help make all platforms more trustworthy.

We don’t have to choose between privacy and safety. Companies see safety measures, transparency, and friction as conflicting with growth, but that is a false dilemma. If these companies had the will, they could find a way to make their platforms safer and more trustworthy—and it starts with sharing critical information with external stakeholders.

