What counts as offensive is subjective: something one person finds harmless can upset another, depending on their background or experiences. Online, this is even more noticeable because people from all over the world interact instantaneously. Debates about political correctness and so-called cancel culture reflect our attempts to balance free speech with responsibility. Being aware and empathetic, "woke" in the original sense of the word, does not mean avoiding all offence, but it does mean recognising how words and actions might affect different communities and people differently depending on context. Greater social intelligence can improve conversations, reduce unnecessary conflict, and build stronger ties between us.
Research in the International Journal of Computational Science and Engineering has looked at how we might automate the detection of offensive content on social media, presenting a method capable of working across more than sixty languages without requiring extensive pre-labelled datasets. The research aims to help platforms manage posts that are truly harmful or represent harassment or abuse, and so improve trust and safety for all users.
The work builds on a multilingual system that represents text using concepts drawn from Wikipedia articles, allowing posts to be categorised by meaning rather than by language alone. The technique, known as randomized explicit semantic analysis, creates a vector of weighted concepts for each message, so that a single annotated dataset in one language can support classification across dozens of others.
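To make the idea concrete, here is a minimal sketch of plain explicit semantic analysis in Python. The three "concept" documents stand in for what would, in practice, be millions of Wikipedia articles, and the paper's randomized variant adds further steps not shown here; the concept texts and the scikit-learn pipeline are illustrative assumptions, not the authors' implementation.

```python
# Minimal explicit-semantic-analysis sketch: each message becomes a vector
# of weights over "concepts". Real ESA derives concepts from Wikipedia
# articles; these three toy documents are illustrative stand-ins.
from sklearn.feature_extraction.text import TfidfVectorizer

concepts = {
    "sport":    "football match goal team player league score",
    "politics": "election vote government policy parliament minister",
    "abuse":    "insult threat harass slur attack hate",
}

# Fit TF-IDF over the concept corpus so concepts and messages share
# one vocabulary space.
vectorizer = TfidfVectorizer()
concept_matrix = vectorizer.fit_transform(concepts.values())  # (n_concepts, n_terms)

def concept_vector(message: str) -> dict:
    """Project a message onto the concept space: one weight per concept."""
    msg_vec = vectorizer.transform([message])               # (1, n_terms)
    weights = (concept_matrix @ msg_vec.T).toarray().ravel()
    return dict(zip(concepts.keys(), weights))

print(concept_vector("the team scored a goal in the match"))
```

Because Wikipedia articles exist in aligned versions across many languages, the same concept dimensions can be produced for text in any of those languages, which is what allows one labelled dataset to serve classification in dozens of others.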
To improve accuracy, the researchers introduced a hybrid meta-heuristic algorithm, a type of trial-and-error search, that combines a statistical approach known as an adaptive Markov chain Monte Carlo tree search with an optimisation method called the enhanced Aquila optimiser, an algorithm inspired by the hunting strategies of eagles. The combination searches out the most effective configurations for categorising content. In tests on publicly available datasets of offensive social media posts, it consistently matched or even surpassed current methods.
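The HAMMC tree search and Aquila optimiser themselves are too involved for a short example, but the general shape of such a search, propose a configuration, score it on held-out data, keep it if it improves, can be sketched with a simple greedy random search. The classifier, the parameter being tuned, and the synthetic data below are assumptions for illustration, not the paper's method.

```python
# Generic meta-heuristic sketch: stochastic search over classifier
# configurations, scored by cross-validated accuracy. This stands in for
# the paper's far more elaborate HAMMC tree search + Aquila optimiser.
import random
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

def score(config):
    """Fitness of one configuration: mean 5-fold accuracy."""
    model = LogisticRegression(C=config["C"], max_iter=1000)
    return cross_val_score(model, X, y, cv=5).mean()

def perturb(config):
    """Propose a neighbouring configuration (the trial-and-error step)."""
    return {"C": max(1e-4, config["C"] * random.uniform(0.5, 2.0))}

random.seed(0)
best = {"C": 1.0}
best_score = score(best)
for _ in range(30):                    # fixed budget of trials
    candidate = perturb(best)
    s = score(candidate)
    if s > best_score:                 # greedy accept: keep only improvements
        best, best_score = candidate, s

print(f"best C = {best['C']:.3f}, accuracy = {best_score:.3f}")
```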
The approach also draws on content-based signals, such as specific words and phrases, behavioural cues, such as posting patterns, and metadata, such as account information and timestamps, to categorise content more effectively. With such a system in place, social media platforms might be able to refine their moderation systems and focus resources more effectively on tackling content that is broadly deemed abusive or likely to lead to greater conflict between users.
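A rough sketch of how such signals might be fused in a single model, using scikit-learn's ColumnTransformer, is shown below; the column names, behavioural features, and toy posts are hypothetical and not taken from the paper.

```python
# Sketch of fusing content-based and behavioural signals in one classifier.
# Column names and example rows are hypothetical; the paper's exact feature
# set is not reproduced here.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

posts = pd.DataFrame({
    "text": ["you are brilliant", "everyone hates you, leave",
             "nice photo", "go away loser"],
    "posts_per_hour": [0.5, 12.0, 0.8, 9.0],   # behavioural cue: posting rate
    "account_age_days": [900, 3, 1200, 7],     # metadata: account age
    "offensive": [0, 1, 0, 1],
})

features = ColumnTransformer([
    ("content", TfidfVectorizer(), "text"),    # words and phrases
    ("behaviour", "passthrough", ["posts_per_hour", "account_age_days"]),
])

clf = Pipeline([
    ("features", features),
    ("model", LogisticRegression(max_iter=1000)),
])
cols = ["text", "posts_per_hour", "account_age_days"]
clf.fit(posts[cols], posts["offensive"])
print(clf.predict(posts[cols]))
```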
Aarthi, B. and Chelliah, B.J. (2025) ‘Multilingual language classification model for offensive comments categorisation in social media using HAMMC tree search with enhanced optimisation technique’, Int. J. Computational Science and Engineering, Vol. 28, No. 5, pp.498–514.