Fans Create Database of Over 200,000 Jeopardy Questions

by Travis Korte August 14, 2014


Jeopardy host Alex Trebek poses with a winner.

Reddit users have created a machine-readable data set of over 200,000 Jeopardy questions. The data, which the data set’s creators scraped from the fan-created question repository J!-Archive, contains each question’s answer, along with its category, dollar value, air date, and other details. One analysis using the data set showed how diverse Jeopardy’s question categories are: the 100 most commonly used categories account for only 11 percent of all questions asked. The creator of that analysis noted that this extreme variation “has given me a lot of sympathy for IBM’s Jeopardy!-playing robot Watson.”
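For readers who want to try a similar category-coverage calculation, here is a minimal sketch in Python. It is not the method used in the analysis cited above. It assumes each question is a record with a "category" field; that field name, the function name top_category_share, and the tiny sample are all hypothetical and made up for illustration.

```python
from collections import Counter


def top_category_share(records, n=100, key="category"):
    """Return the fraction of records whose category is among the n most common.

    `key` is an assumed field name, not confirmed against the real data set.
    """
    counts = Counter(r[key] for r in records)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # most_common(n) yields (category, count) pairs, largest counts first.
    covered = sum(count for _, count in counts.most_common(n))
    return covered / total


# Tiny made-up sample, not real Jeopardy data.
sample = [
    {"category": "HISTORY"},
    {"category": "HISTORY"},
    {"category": "SCIENCE"},
    {"category": "POTPOURRI"},
]

share = top_category_share(sample, n=1)
print(f"{share:.0%}")  # prints "50%"
```

Run against the full data set with n=100, a low share would reflect the long tail of rarely repeated categories that the analysis describes.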

Get the data.

Photo: Queen’s University

Travis Korte

Travis Korte is a research analyst at the Center for Data Innovation specializing in data science applications and open data. He has a background in journalism, computer science and statistics. Prior to joining the Center for Data Innovation, he launched the Science vertical of The Huffington Post and served as its Associate Editor, covering a wide range of science and technology topics. He has worked on data science projects with HuffPost and other organizations. Before this, he graduated with highest honors from the University of California, Berkeley, having studied critical theory and completed coursework in computer science and economics. His research interests are in computational social science and using data to engage with complex social systems. You can follow him on Twitter @traviskorte.
