Researchers working for Wikimedia’s Wikipedia Detox project, which focuses on reducing the impact of harassment and attacks on the Wikipedia editor community, have published a dataset of more than 100,000 comments from English-language Wikipedia pages, each annotated to indicate whether it contains a personal attack. The researchers collected the data to help develop methods that combine crowdsourced analysis and machine learning to automatically detect personal attacks on the site.