Google has released a dataset of more than 200,000 question-answer pairs in 11 languages to advance the development of AI systems that can handle the different ways languages express meaning. For example, while English usually adds an “s” to a word to mark the plural, Arabic often changes the internal form of the word itself. To reduce machine learning systems’ reliance on word matching when answering questions, the researchers collected questions by having people read Wikipedia articles and then write a question that the text did not answer. The answers come from separate Wikipedia articles.