Darwin and Conflict

Counterspeech:
Can and should we automate it?

Freedom of Speech Typewriter. Photo Credit: Markus Winkler on Unsplash

As harmful language continues to taint online communication, researchers and practitioners are developing new and more effective ways of combatting it. The documents revealed by Facebook whistleblower Frances Haugen in October 2021 have only reinforced the persistent impression that the current approaches of Facebook and other social media companies to online hate speech are insufficient and continue to fail their users.

To date, social media companies still follow a reactive approach to handling harmful content. Once reported, content is moderated and, if it violates the company's guidelines, taken down; the user may also be blocked. This approach has repeatedly been the subject of public controversy, for several reasons: definitions of hate speech and the accompanying guidelines vary from company to company; deleting content amounts to censorship and thus threatens freedom of speech; and, what is more, human moderators are often employed by subcontractors and can work under unethical and unsafe conditions.

While some countries, such as Germany and South Africa, have enacted new laws and bills to combat hate speech and hold companies responsible, others have issued White Papers (e.g., the UK's Online Harms White Paper) or joined transnational campaigns (e.g., the European Commission's Code of Conduct). Meanwhile, lawyers and free speech advocates continue to emphasise the danger that suppressing any unwanted speech poses to our democratic values, and call for Counterspeech as the only permissible option (Strossen 2018; Stern 2020).

More recently, the Dangerous Speech Project has defined Counterspeech as “any direct response to hateful or harmful speech which seeks to undermine it.” Counterspeech responses can range from spontaneous, organic responses to organised counter-messaging campaigns (e.g., #jagärhär or #iamhere).

Benesch et al. (2016) distinguish eight Counterspeech strategies that have been shown to be successful: presenting facts to correct misstatements or misconceptions, pointing out hypocrisy or contradictions, warning of offline or online consequences, affiliation, denouncing hateful or dangerous speech, visual communication, humour, and using an empathetic tone. In contrast, a hostile or aggressive tone and attempts at silencing have been found to cause a backfire effect and contribute to a further escalation of hateful rhetoric.

One of the key questions researchers have been seeking to answer in recent years is: Can Counterspeech change the behaviour of hateful speakers?

Studies have shown that changing someone's behaviour is a challenging task and is most likely to be achieved in one-to-one or many-to-one conversations (Benesch et al. 2016; Saltman & Russell 2014; Briggs & Feve 2013; Saltman, Kooti & Vockery 2021). One of the most prominent examples is Megan Phelps-Roper. A former member of the Westboro Baptist Church, she eventually left after discussions on Twitter led her to question her family's hateful ideology. Truly changing someone's mind or behaviour, however, is an intense and laborious task and only occasionally successful.

A more important motivation behind Counterspeech seems to be reaching cyber-bystanders and setting off a kind of contagion effect. Research shows that users are less likely to use hateful speech after observing previous hate speech being “moderately censored” (Álvarez-Benjumea & Winter 2018). Further studies also suggest that organised movements tend to be more effective than striking out on one's own (Garland et al. 2020). In the end, as rewarding as being an active counter speaker may be, it can equally be lonely and dispiriting work (Benesch & Manion 2019).

This raises the question: can technology, specifically natural language processing (NLP) systems, be developed and employed to reduce the burden on the individual and perform Counterspeech online?

An automated Counterspeech system essentially consists of two parts: one component needs to be able to detect hate speech, while another is responsible for composing an appropriate Counterspeech response. While a plethora of studies exist on the detection of hate speech, research on automatic detection, let alone generation, of Counterspeech is still in its infancy.
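To make the two components concrete, here is a minimal sketch in Python, assuming the Hugging Face transformers library. The model names are placeholders rather than references to actual published models, and the classifier's label set would depend entirely on the model chosen.

```python
# A minimal sketch of the two-component architecture described above.
# Model names are placeholders, not real published models; any hate
# speech classifier and conditional text generator could fill these roles.
from transformers import pipeline

# Component 1: detection -- a text classifier flagging hateful posts.
detector = pipeline("text-classification",
                    model="your-org/hate-speech-classifier")  # placeholder

# Component 2: generation -- a sequence-to-sequence model fine-tuned on
# hate/counterspeech pairs (the kind collected by Chung et al. 2019).
generator = pipeline("text2text-generation",
                     model="your-org/counterspeech-generator")  # placeholder

def respond(post: str) -> str | None:
    """Return a suggested Counterspeech reply, or None if the post is benign."""
    verdict = detector(post)[0]
    if verdict["label"] != "hate":  # label set depends on the classifier used
        return None
    suggestion = generator(post, max_length=64)[0]["generated_text"]
    return suggestion  # shown to a human reviewer, never posted automatically
```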

This lag is largely due to a number of challenges, including the difficulty and subjectivity of automated identification (Kennedy et al. 2017), the wide range of potential communicative strategies (Wright et al. 2017), and the need for manual coding and annotation (Mathew et al. 2019). Furthermore, the number of available datasets is still limited. Compared to the already challenging task of automated hate speech detection, understanding and recognising Counterspeech requires successfully processing longer conversations and interactions between users.

Finally, one chief question remains, be it for human-human or human-computer conversations: what makes a good counter response?

The answer is highly situational and depends on a number of social and contextual factors. Still, one notable study and Counterspeech dataset by Chung et al. (2019) is multilingual (English, French, Italian) and consists of over 14,000 hate/counterspeech pairs. The counter narratives, as the authors call them, were originally provided by NGO staff. Studies like this are of tremendous importance: the more high-quality datasets are available, the better systems can be trained, leading to more accurate results.
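For illustration only, a training pair in such a dataset might be structured as below. The field names, filename, and example texts are assumptions for this sketch, not the actual schema of the Chung et al. (2019) release.

```python
import json

# Hypothetical hate/counterspeech pairs in the spirit of Chung et al.
# (2019); field names and texts are illustrative assumptions only.
pairs = [
    {
        "hate_speech": "Muslims do not belong in Europe.",
        "counter_speech": (
            "Muslim communities have been part of Europe for centuries; "
            "diversity is woven into its history."
        ),
        "strategy": "presenting facts",  # one of the Benesch et al. types
        "language": "en",
    },
]

# Serialise for a training pipeline, e.g. fine-tuning a
# sequence-to-sequence generator on hate -> counterspeech mappings.
with open("counterspeech_pairs.json", "w", encoding="utf-8") as f:
    json.dump(pairs, f, ensure_ascii=False, indent=2)
```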

Fully automated Counterspeech generation is not (yet) possible, and the linguistic complexities of the task are likely to remain an obstacle for artificial intelligence. However, we can use technology to assist human counter speakers, for instance through an app, a browser extension, or an online interface. Whether it is a hate speech detector that suggests a suitable response, or a system that offers example replies based on a taxonomy of Counterspeech types (presenting facts, pointing out hypocrisy, warning of consequences), a (semi-)automated system that provides prompts could be key to helping individuals overcome the initial hurdle of thinking of a response. Further work is needed to improve the quality of training datasets, annotation, and evaluation so that both humans and machines can better understand and, subsequently, generate effective Counterspeech.
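As a closing illustration, here is a minimal sketch of the prompting idea: once a post has been flagged, the tool surfaces template openers organised by the Benesch et al. (2016) strategy taxonomy and leaves the wording, and the decision to reply at all, to the user. The templates and function name are invented for this sketch; a real tool would curate them with far more care.

```python
# Template openers keyed by Counterspeech strategy (Benesch et al. 2016).
# All wording here is illustrative; {braced} slots are for the user to fill.
STRATEGY_TEMPLATES = {
    "presenting facts": "The evidence actually points the other way: {fact}",
    "pointing out hypocrisy": "You claim {claim}, yet earlier you said {contradiction}.",
    "warning of consequences": "Posts like this can get accounts suspended.",
    "empathy": "I hear your frustration, but consider how this reads to the person targeted.",
}

def suggest_counterspeech(strategies=None) -> list[str]:
    """Return template prompts for the user to adapt; nothing is auto-posted."""
    chosen = strategies if strategies is not None else STRATEGY_TEMPLATES
    return [STRATEGY_TEMPLATES[s] for s in chosen if s in STRATEGY_TEMPLATES]

# Example: a browser extension flags a post and surfaces two prompt styles,
# leaving the final wording and the decision to reply to the human.
for prompt in suggest_counterspeech(["presenting facts", "empathy"]):
    print(prompt)
```

Keeping the human in the loop like this sidesteps the open problem of fully automated generation while still lowering the barrier to responding.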

Hate Speech Illustration. Photo Credit: Mika Baumeister on Unsplash
Counter Speech Button. Photo Credit: Adobe Stock
Counter Speech Warning Signs. Photo Credit: Mika Baumeister on Unsplash

If you enjoyed this article, you may also be interested in:

Filippo Grandi. Darwin College Lecture Series 2018

Refugees and Migration: Drawing on more than 30 years of experience in international affairs, UN High Commissioner for Refugees Filippo Grandi reflects on how, in a world of modern nation states, shared prosperity, and boundless capabilities, refugees and migrants on the move today can find themselves exposed to kidnapping for ransom, imprisonment, and torture, and on the role of international cooperation and the modern system of international refugee protection.

Lyse Doucet. Darwin Lecture Series 2017

Reporting from Extreme Environments: Lyse Doucet is the BBC's award-winning Chief International Correspondent, who spends much of her time covering the stories in our news headlines, including the devastating wars in Syria and Iraq as well as Afghanistan. She often focuses on the human costs of conflict.

David Runciman. Darwin Lecture Series 2017

Dealing with Extremism: Many extremist ideologies rely heavily on conspiracy theories to explain how the world works and where power lies. This lecture explores our understanding of conspiracy theories – where they come from, how they work, who believes in them – and what they can tell us about dealing with extremism.

'Conflict' Darwin College 2017 Lecture Series book curated by Martin Jones and Andy Fabian

Conflict, sadly, is part of our everyday life; experienced at home, in the workplace, on our TV screens. But is it an inevitable part of the fabric of our existence? In this volume, eight experts examine conflict at many levels, from the workings of genes to the evolution of galaxies.