Training Vision-Language Models

by Morgan Stevens
A street crossing in Guangzhou, China

Researchers at Sun Yat-sen University in Guangzhou, China, and Huawei Noah’s Ark Lab, an international AI research organization associated with Huawei Technologies, have created a dataset of 100 million images paired with text descriptions in Chinese and English. The images depict common scenes, such as a soccer game and vaccine screenings. Researchers can use the dataset to train vision-language models in Chinese.

Image credit: Flickr user Mathias Apitz (München)

