“We essentially gathered hateful tweets and used language processing to find the other terms that were associated with such messages… We learned these terms and used them as the bridge to new terms—as long as we have those words, we have a link to anything they can come up with.” This defeats attempts to conceal racist slurs behind coded language: instead of just matching keywords, it targets the surrounding vocabulary, the cultural matrix from which the hate emerges. Even if the specific slurs racists use change to evade automated comment moderation, the other terms they use to identify themselves and their communities likely won’t.
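The quoted “bridge” idea can be sketched with simple co-occurrence counting: terms that repeatedly appear alongside known hateful terms in flagged messages become candidate indicators themselves. This is a minimal illustration, not the authors’ actual method; the function name, the tokenized toy corpus (with neutral placeholders like `SLUR_A` standing in for real slurs and codes), and the threshold are all invented for the example.

```python
from collections import Counter

def expand_seed_terms(messages, labels, seeds, min_count=2):
    """Find terms that co-occur with known seed terms in flagged messages.

    messages: list of token lists; labels: 1 if the message was flagged.
    Returns terms outside the seed set that appear alongside a seed term
    at least `min_count` times -- candidate "bridge" terms.
    """
    co = Counter()
    for tokens, label in zip(messages, labels):
        if label and seeds.intersection(tokens):
            for term in set(tokens) - seeds:
                co[term] += 1
    return {term for term, n in co.items() if n >= min_count}

# Toy corpus; placeholder tokens stand in for real slurs and coded terms.
msgs = [
    ["SLUR_A", "CODE_X", "filler"],
    ["SLUR_A", "CODE_X", "other"],
    ["nice", "weather", "today"],
]
labels = [1, 1, 0]

bridges = expand_seed_terms(msgs, labels, {"SLUR_A"})
# "CODE_X" co-occurs with the seed twice, so it is flagged as a bridge term.
```

In a real system the seeds would be learned from a large labeled corpus and the associations weighted statistically (e.g., by pointwise mutual information) rather than by a raw count threshold, but the bridging logic is the same: once a coded term is linked to the seed vocabulary, messages using only the code can still be caught.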
There are a few things I thought were worth noting:
- The developers of this algorithm used tweets to identify the hateful language, which says something about the general quality of discourse on Twitter.
- The algorithm isn’t simply substituting one set of keywords for another; it identifies the context of the sentence to determine whether the sentiment is hateful. The specific words almost don’t matter. This is a significant step in natural language processing.
- The post appeared in 2017, so it’s a year old, and I haven’t checked what progress (if any) has been made since then.