by Michael McLaughlin
AI and Translation

Amazon has released a dataset of nearly 400,000 names in English, Hebrew, Russian, Arabic, and Japanese, collected from Wikipedia articles, to help AI translate names more accurately between alphabets. Differences between alphabets, such as different characters and pronunciations, can affect how well AI performs these translations. For example, Amazon found its AI did better at English-to-Russian translations than at Arabic-to-English ones because the Latin alphabet is more similar to the Cyrillic alphabet than to the Arabic alphabet. This data could help personal assistants retrieve information across languages.

Get the data.

