Yahoo researchers built a powerful new online abuse detector
All it took was a couple thousand offensive comments and some machine learning.
A team of researchers at Yahoo Labs has plumbed the depths of the company's massive comment sections to come up with something that might actually be useful for detecting, and eventually curbing, rampant online abuse. Using a first-of-its-kind data set built from offensive article comments flagged by Yahoo editors, the research team developed an algorithm that, according to Technology Review, is the best automated abuse filter built to date.
Most current abuse filters rely on a combination of blacklisted terms, common expressions and syntax clues to catch hate speech online, but the Yahoo team went a step further and applied machine learning to their massive repository of flagged comments. Using a technique called "word embedding," which represents words as vectors rather than simply scoring them as positive or negative, the Yahoo system can recognize an offensive string of words even when the individual words are inoffensive on their own. According to their findings, the system was able to correctly identify abusive language from the same data set about 90 percent of the time. While that figure is impressive, the ever-changing nature of hate speech means no system -- not even a human one -- will ever truly be able to know what's offensive 100 percent of the time.
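In broad strokes, that kind of pipeline can be sketched in a few lines of code: each comment is mapped to a vector (here, a simple average of its word vectors) and a classifier is trained on the editor-supplied labels. The toy vocabulary, vectors, comments and labels below are invented purely for illustration; Yahoo's actual system uses embeddings learned from a huge corpus and a far more sophisticated model.

```python
# Minimal sketch of embedding-based abuse detection (illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy word vectors; real systems learn these from a large text corpus.
EMBEDDINGS = {
    "you":   np.array([0.1, 0.3]),
    "are":   np.array([0.0, 0.2]),
    "great": np.array([0.9, 0.1]),
    "an":    np.array([0.1, 0.1]),
    "idiot": np.array([-0.8, 0.7]),
    "total": np.array([-0.2, 0.4]),
}

def comment_vector(comment: str) -> np.ndarray:
    """Represent a whole comment as the average of its word vectors."""
    vectors = [EMBEDDINGS[w] for w in comment.lower().split() if w in EMBEDDINGS]
    return np.mean(vectors, axis=0) if vectors else np.zeros(2)

# Labeled training comments: 1 = flagged as abusive, 0 = acceptable.
comments = ["you are great", "you are an idiot", "you are a total idiot", "great"]
labels = [0, 1, 1, 0]

# Train a classifier on comment vectors rather than on a term blacklist,
# so it scores word combinations instead of individual words.
X = np.array([comment_vector(c) for c in comments])
clf = LogisticRegression().fit(X, labels)

print(clf.predict([comment_vector("you are an idiot")]))  # likely [1]
```

Because the classifier operates on the vector for the whole comment, it can learn that certain combinations of otherwise-unremarkable words signal abuse, which is exactly the gap that simple blacklist filters miss.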
As Alex Krasodomski-Jones, an online abuse researcher with the UK's Centre for Analysis of Social Media, told Technology Review, "Given 10 tweets, a group of humans will rarely all agree on which ones should be classed as abusive, so you can imagine how difficult it would be for a computer."