Facebook failed to stop test ads from threatening midterm election workers
The social network's automatic moderation didn't spot some obvious red flags.
Meta's election integrity efforts on Facebook may not have been as robust as claimed. Researchers at New York University's Cybersecurity for Democracy and the watchdog Global Witness have revealed that Facebook's automatic moderation system approved 15 out of 20 test ads threatening election workers ahead of last month's US midterms. The experiments were based on real threats and used "clear" language that was potentially easy to catch. In some cases, the social network even approved ads after only superficial changes were made: the research team just had to remove profanity and fix spelling to get past initial rejections.
The investigators also tested TikTok and YouTube. Both services stopped all threats and banned the test accounts. In an earlier experiment before Brazil's election, Facebook and YouTube allowed all election misinformation sent during an initial pass, although Facebook rejected up to 50 percent in follow-up submissions.
In a statement to Engadget, a Meta spokesperson said the ads were a "small sample" that didn't represent what users saw on platforms like Facebook. The company maintained that its ability to counter election threats "exceeds" that of rivals, but backed the claim only by pointing to quotes that illustrated the amount of resources committed to stopping violent threats, not the effectiveness of those resources.
The ads wouldn't have done damage, as the experimenters had the power to pull them before they went live. Still, the incident highlights the limitations of Meta's partial dependence on AI moderation to fight misinformation and hate speech. While the system helps Meta's human moderators cope with large amounts of content, it also risks greenlighting ads that might not be caught until they're visible to the public. That could not only let threats flourish, but also invite fines from the UK and other countries that plan to penalize companies that don't quickly remove extremist content.