QZ AI Studio Demo: Automatically Labeling Hate Incident Tips

Many news organizations receive tips from their readers, alerting journalists to incidents or trends. But newsrooms can be opaque, and tips can end up in a giant inbox where the right reporter never reads them.

We hypothesize that machine learning can help by learning from examples of categorized tips.

[Where else does AI fit into reporting and newsgathering? Read our first post, How you're feeling when machine learning might help.]

As an experiment, we used tips from ProPublica's Documenting Hate project. Tips submitted to that project included a free-text description of a hate incident and, optionally, a closed-class categorization of why the victim of the hate incident was targeted, like race or religion.

We trained a model to predict that categorization from the free-text description, so it can fill in the category when the submitter left it blank.
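To make the idea concrete, here is a minimal sketch of learning categories from labeled free-text tips. It is not the model used in this demo: it swaps in a small naive Bayes classifier written in plain Python, and the example tips, the labels, and the function names (`tokenize`, `train`, `predict`) are all invented for illustration.

```python
import math
from collections import Counter, defaultdict

PUNCT = ".,!?;:\"'()"

def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]

def train(examples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for w in tokenize(text):
            word_counts[label][w] += 1
            vocab.add(w)
    return label_counts, word_counts, vocab

def predict(model, text):
    """Return the label with the highest naive Bayes log-score."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label, n in label_counts.items():
        score = math.log(n / total)  # prior for this label
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in tokenize(text):
            if w in vocab:  # ignore words never seen in training
                # add-one smoothing so unseen (word, label) pairs are not zero
                score += math.log((word_counts[label][w] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)

# Invented examples; real tips would come from the categorized submissions.
labeled_tips = [
    ("Someone spray-painted a slur about my religion on the mosque door", "religion"),
    ("A man yelled at worshippers leaving the synagogue", "religion"),
    ("A stranger shouted racial insults at me on the bus", "race"),
    ("My neighbor was harassed because of the color of her skin", "race"),
]
model = train(labeled_tips)

# A tip whose category was left blank by the submitter.
guess = predict(model, "Vandals damaged the church and mocked our religion")
```

With only four training examples this is a toy, but the shape is the same at any scale: categorized tips go in, and uncategorized tips get a predicted label.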

This model probably won't be directly useful to other journalists, but we hope to make the methodology clearer and easier to use -- there's nothing about it that's specific to hate crimes. It should hopefully work just as well categorizing general tips into beats, so the sports tips can be sent to the sports reporter, education tips to the schools reporter, etc.

Read more about how we did it -- or download the Jupyter Notebook.


Predicted classifications:

We can also ask the model what most influenced its decision to classify the tip as a particular type of hate incident or not.
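One generic way to get this kind of explanation is to remove each word in turn and see how much the model's score drops. The sketch below illustrates that leave-one-word-out idea; it is an assumption about how such an explanation could be computed, not a description of what this demo does. `toy_religion_score` is a hypothetical stand-in for a trained model, with made-up cue words and weights.

```python
def word_influence(score_fn, text):
    """Rank words by how much the score drops when each one is removed."""
    words = text.split()
    base = score_fn(text)
    influence = []
    for i, word in enumerate(words):
        without = " ".join(words[:i] + words[i + 1:])
        influence.append((word, base - score_fn(without)))
    # Largest drop first: those words mattered most to the score.
    return sorted(influence, key=lambda pair: pair[1], reverse=True)

def toy_religion_score(text):
    """Hypothetical scorer: sums invented weights for a few cue words."""
    cues = {"mosque": 0.5, "synagogue": 0.5, "church": 0.4, "religion": 0.3}
    total = sum(cues.get(w.lower().strip(".,"), 0.0) for w in text.split())
    return min(1.0, total)

ranked = word_influence(toy_religion_score, "Someone vandalized the mosque overnight")
top_word, top_drop = ranked[0]
```

Any function that maps text to a score can be passed as `score_fn`, so the same loop would work with a real classifier's predicted probability for a given category.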