Twitter’s ‘Safety Mode’ to Clamp Down on Abusive Tweets; Accounts Will be Auto-blocked for 7 Days

    Twitter is stepping up its monitoring of abuse and harassment.

    Twitter is testing a feature called Safety Mode to help put an end to abuse and trolling, both of which have become challenges for the platform. Once enabled, Safety Mode automatically flags accounts tweeting “spammy or abusive replies”. Flagged accounts are then blocked for seven days. The company said that accounts blocked by mistake can be unblocked.

    The mode is initially launching in beta for a small number of users on iOS and Android before being made available to everyone. Other recently introduced Twitter features (hiding replies, letting users limit who can reply to their posts, and warning users before they tweet a potentially harmful reply) also focus on safety.

    “When the feature is turned on in your Settings, our systems will assess the likelihood of a negative engagement by considering both the Tweet’s content and the relationship between the Tweet author and replier,” reads the blog post by Jarrod Doherty, Twitter’s senior product manager.

    Even though Twitter is trying to fight abuse with the new features, online harassment is spreading like wildfire on the platform, as seen recently in the flood of racist abuse and slurs directed at Black players on England’s football team following their loss in the Euro 2020 final.

    Katy Minshall, head of Twitter UK Public Policy, said: “While we have made strides in giving people greater control over their safety experience on Twitter, there is always more to be done. We’re introducing Safety Mode, a feature that allows you to automatically reduce disruptive interactions on Twitter, which in turn improves the health of the public conversation.”

    Safety Mode can be switched on in the app settings, after which the system evaluates each tweet’s content as well as the relationship between the author and the replier. Accounts that the user follows or frequently interacts with will not be blocked automatically.
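
    The rule described above can be sketched in a few lines of Python. This is only an illustration, not Twitter's implementation: the abuse score, the 0.8 threshold, and every name in the snippet are assumptions made for the example. Only the seven-day block and the exemption for followed or frequently contacted accounts come from the article.

```python
from dataclasses import dataclass

BLOCK_DAYS = 7          # from the article: flagged accounts are blocked for seven days
ABUSE_THRESHOLD = 0.8   # hypothetical cutoff, not a value published by Twitter


@dataclass
class Reply:
    replier: str        # handle of the account that wrote the reply
    abuse_score: float  # hypothetical model output in [0, 1]; higher means more likely abusive


def should_autoblock(reply, follows, frequent_contacts, threshold=ABUSE_THRESHOLD):
    """Return True if the replier would be auto-blocked under the sketched rule."""
    # Accounts the user follows or frequently interacts with are never auto-blocked.
    if reply.replier in follows or reply.replier in frequent_contacts:
        return False
    # Everyone else is blocked only when the content looks abusive enough.
    return reply.abuse_score >= threshold


follows = {"friend"}
frequent_contacts = {"colleague"}

print(should_autoblock(Reply("stranger", 0.95), follows, frequent_contacts))
print(should_autoblock(Reply("friend", 0.95), follows, frequent_contacts))
```

    In this sketch the relationship check runs first, so a harsh reply from a followed account is never blocked automatically, which matches the behaviour the article describes.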

    Like other social media platforms, Twitter uses a blend of automated and human moderation. Although it has never formally disclosed how many human moderators it employs, a 2020 report by NYU Stern, New York University’s business school, suggests that Twitter had roughly 1,500 moderators monitoring the platform’s 199 million daily users worldwide.

    A recent study on hate speech—produced by Facts Against Hate on behalf of the Finnish government—found Twitter to be “the worst of the tech giants” when it came to hate speech. According to study author Dr Mari-Sanna Paukkeri, the solution lies in using artificial intelligence systems that have been trained by humans to detect hate speech on the platform. 

    “There are so many different ways to say bad things, and it is rocket science to build tools that can spot these,” Dr Paukkeri said, adding that highlighting only certain words or phrases, an approach used by many social networks, is not enough.


    Sana Naaz



