Improving Translation Models

by Morgan Stevens, May 24, 2023

Researchers at Google, Stanford University, and Queen Mary University of London have created a dataset to improve translation models. The dataset contains 20 hours of recorded audio of English speakers in India, Nigeria, and the United States playing more than 3,600 image- or word-guessing games, along with transcriptions of the conversations totaling 200,000 words. Researchers can use the dataset to train translation models to understand different dialects of English.

Get the data.

Image credit: Flickr user Jackson Lanier

Morgan Stevens

Morgan Stevens is a Research Assistant at the Center for Data Innovation. She holds a J.D. from the Sandra Day O'Connor College of Law at Arizona State University and a B.A. in Economics and Government from the University of Texas at Austin.
