Helping AI Systems Understand Different Languages

by Michael McLaughlin
Question marks next to light bulbs.

Google has released a dataset of more than 200,000 question-answer pairs in 11 languages to advance the development of AI systems that can understand the different ways languages express meaning. For example, while English often adds an “s” to a word to signify plurality, Arabic can use an entirely different word to indicate that there is more than one of something. To reduce machine learning systems’ reliance on word matching when answering a question, the researchers collected questions by having individuals read Wikipedia articles and then ask a question that the text did not answer. The answers come from separate Wikipedia articles.

Get the data.

Image: mohamed_hassan
