The growth of online hate speech is a disturbing trend in countries around the world, with serious psychological consequences and the potential to contribute to real-world violence. Citizen-generated counter speech may help discourage hateful online rhetoric, but it has been difficult to measure and study. Until recently, studies were limited to small, hand-labeled efforts.
A new paper published in the proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) offers a framework for studying the dynamics of online hate speech and counter speech. The paper presents the first large-scale classification of millions of such interactions on Twitter. The authors developed a learning algorithm to assess data from a unique situation on German Twitter, and the findings suggest that organized movements to counter hate speech on social media are more effective than individuals striking out on their own.
The authors will present their paper, “Countering hate on social media: Large-scale classifications of hate and counter speech,” at the Workshop on Online Abuse and Harms, held in conjunction with EMNLP 2020.
“I’ve seen this big shift in civil discourse in the last two or three years towards being much more hateful and much more polarized,” says Joshua Garland, a mathematician and Applied Complexity Fellow at the Santa Fe Institute. “So, for me, an interesting question was: what’s an appropriate response when you’re being cyber-bullied or when you’re receiving hate speech online? Do you respond? Do you try to get your friends to help protect you? Do you just block the person?”
To examine such questions scientifically, researchers first need a wealth of real data on both hate speech and counter speech, and the ability to differentiate between the two. Garland and collaborator Keyan Ghazi-Zahedi of the Max Planck Institute in Germany found that data in a five-year interaction that played out on German Twitter: as an alt-right group took to the platform with hate speech, an organized movement arose to counter it.
“The beauty of these two groups is they were self-labeling,” explains Mirta Galesic, the group’s social scientist and a professor of human social dynamics at SFI. She notes that researchers who study counter speech typically have to employ many students to hand-code tens of thousands of posts. Garland and Ghazi-Zahedi, however, were able to feed the self-labeled posts into a machine-learning algorithm to automate large swaths of the classification. The team also relied on 20-30 human coders to check the machine’s classifications and to build intuition about what qualifies as hate speech and counter speech.
The result is a dataset of unprecedented size that allows the researchers to analyze not only isolated instances of hate and counter speech, but also long-running interactions between the two.
The group collected a dataset of millions of tweets posted by members of the two groups, using these self-identified tweets to train their classification algorithm to recognize hate and counter speech. They then applied the algorithm to study the dynamics of some 200,000 conversations that took place between 2013 and 2018. The authors plan to publish a follow-up paper examining the dynamics revealed by their algorithm.
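The approach described above — using group membership as a proxy label to train a text classifier, then applying it to unlabeled conversations — can be sketched roughly as follows. This is a minimal illustrative example built with scikit-learn on invented toy data; the tweets, labels, features, and model here are all hypothetical and are not the authors' actual pipeline.

```python
# Illustrative sketch: train a hate/counter-speech classifier from
# "self-labeled" tweets (labeled by the author's group membership),
# then apply it to new, unlabeled replies.
# All example data below is invented for demonstration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy tweets, labeled by which self-identified group posted them.
tweets = [
    "you people should leave our country",
    "they are ruining everything, get them out",
    "nobody wants your kind here",
    "hate has no place in our community",
    "we stand with refugees and welcome them",
    "spreading fear helps no one, be kind",
]
labels = ["hate", "hate", "hate", "counter", "counter", "counter"]

# TF-IDF features plus logistic regression: a simple, common
# baseline for text classification (hypothetical choice here).
clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
clf.fit(tweets, labels)

# Classify unlabeled replies from a new conversation.
new_replies = [
    "get out of our country",
    "everyone deserves respect and welcome",
]
print(clf.predict(new_replies))
```

In practice, a study at this scale would use far richer features or a neural model, and human coders (as the article describes) to validate the machine's labels, but the train-on-proxy-labels, predict-on-new-conversations structure is the same.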
“Now we can resolve a massive data set from 2016 to 2018 to see how the proportion of hate and counter speech changed over time, who gets more likes, who is retweeted, and how they replied to each other,” Galesic says.
The researchers are now assessing the tactics of the two groups and pursuing broader questions, such as whether certain counter-speech strategies are more effective than others.
“What I’m hoping is that we can come up with a rigorous social theory that tells people how to counter hate in a productive way that’s non-polarizing,” Garland says, “and bring the Internet back to civil discourse.”