Algorithm That Identifies Misogynistic Content On Twitter

NewsGram Desk

A team of researchers has developed a sophisticated algorithm that cuts through the noise of millions of tweets to identify misogynistic content on Twitter. Online abuse targeting women, including threats of harm or sexual violence, has proliferated across all social media platforms.

Now, researchers from Queensland University of Technology (QUT) have developed a statistical model to help drive it off Twitter. The team mined a dataset of 1 million tweets, then refined the set by searching for tweets containing one of three abusive keywords: whore, slut, and rape.
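
The article does not include the team's code, but this filtering step might look something like the minimal pandas sketch below; the file name and the 'text' column are hypothetical stand-ins for the collected data.

```python
import pandas as pd

# Hypothetical dump of the ~1 million collected tweets, one per row,
# with the tweet text in a 'text' column.
tweets = pd.read_csv("tweets.csv")

# The three abusive keywords the team filtered on.
pattern = "whore|slut|rape"

# Keep only tweets containing at least one keyword, case-insensitively.
candidates = tweets[tweets["text"].str.contains(pattern, case=False, na=False)]
print(f"{len(candidates)} candidate tweets kept of {len(tweets)}")
```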

The team's model identified misogynistic content with 75 per cent accuracy, outperforming other methods that investigate similar aspects of social media language.

"At the moment, the onus is on the user to report abuse they receive. We hope our machine-learning solution can be adopted by social media platforms to automatically identify and report this content to protect women and other user groups online," said Associate Professor Richi Nayak.

The key challenge in detecting misogynistic tweets is understanding a tweet's context; the complex, noisy nature of tweets makes this difficult.

On top of that, teaching a machine to understand natural language is one of the more complicated ends of data science: language changes and evolves constantly, and much of its meaning depends on context and tone.

"So, we developed a text mining system where the algorithm learns the language as it goes, first by developing a base-level understanding then augmenting that knowledge with both tweet-specific and abusive language," she noted.

The team implemented a deep learning algorithm called 'Long Short-Term Memory with Transfer Learning', which means the machine can look back at its previous understanding of terminology and adjust the model as it goes, developing its contextual and semantic understanding over time.
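
The article does not detail the architecture, so the following is only a minimal sketch of that idea, assuming PyTorch: an LSTM classifier whose embedding layer is initialised from pretrained word vectors (the "base-level understanding") and left trainable so fine-tuning can adapt it to tweet-specific and abusive language. All sizes and names are illustrative.

```python
import torch
import torch.nn as nn

VOCAB_SIZE = 20_000   # hypothetical vocabulary size
EMBED_DIM = 100       # matches common pretrained vectors, e.g. GloVe-100d

# Placeholder for pretrained word vectors supplying the base-level
# understanding; in practice these would be loaded from disk (e.g. GloVe).
pretrained = torch.randn(VOCAB_SIZE, EMBED_DIM)

class MisogynyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        # Start from pretrained embeddings and leave them trainable
        # (freeze=False), so fine-tuning can shift them toward
        # tweet-specific and abusive usage: the transfer-learning step.
        self.embed = nn.Embedding.from_pretrained(pretrained, freeze=False)
        # The LSTM carries context across the whole tweet, so a phrase is
        # judged in sequence rather than word by word.
        self.lstm = nn.LSTM(EMBED_DIM, 64, batch_first=True)
        self.out = nn.Linear(64, 1)

    def forward(self, token_ids):
        embedded = self.embed(token_ids)             # (batch, seq, embed)
        _, (hidden, _) = self.lstm(embedded)         # final hidden state
        return torch.sigmoid(self.out(hidden[-1]))  # P(misogynistic)

model = MisogynyClassifier()
```

Leaving the embeddings trainable is what makes this a transfer-learning setup rather than a fixed lookup: the base representation keeps shifting as the model sees domain-specific examples.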

"Take the phrase 'get back to the kitchen' as an example – devoid of context of structural inequality, a machine's literal interpretation could miss the misogynistic meaning," Nayak said.

"But seen with the understanding of what constitutes abusive or misogynistic language, it can be identified as a misogynistic tweet". Other methods based on word distribution or occurrence patterns identify abusive or misogynistic terminology, but the presence of a word by itself doesn't necessarily correlate with intent, said the paper, published in the journal Springer Nature.

"Once we had refined the 1 million twitter tweets to 5,000, those tweets were then categorised as misogynistic or not based on context and intent, and were input to the machine learning classifier, which used these labelled samples to begin to build its classification model," Nayak informed.

The team hoped the research could translate into platform-level policy that would see Twitter, for example, remove any tweets identified by the algorithm as misogynistic.

"This modelling could also be expanded upon and used in other contexts in the future, such as identifying racism, homophobia, or abuse toward people with disabilities," Nayak said. (IANS)
