alt:
"Bluesky
Toxicity detection experiments
Addressing toxicity is one of the biggest challenges on social media. On Bluesky, two categories made up 50% of user reports in the past quarter: content that is rude, and accounts that are fake, scams, or spam. Rude content especially can drive people away from forming connections, posting, or engaging for fear of attacks and dogpiles.
In our first experiment, we are attempting to detect toxicity in replies, since user reports indicate that is where people experience the most harm. We'll detect rude replies and surface them to mods, then eventually reduce their visibility in the app.
Repeated rude labels on content will lead to account-level labels and suspensions. This will be a building block for detecting group harassment and dogpiling of accounts.
Automating spam and fake account removals
Harm on social media can happen quickly."