Training Vision-Language Models

by Morgan Stevens, March 31, 2022

Researchers at Sun Yat-sen University in Guangzhou, China, and Huawei Noah’s Ark Lab, an international AI research organization associated with Huawei Technologies, have created a dataset of 100 million images paired with text descriptions in Chinese and English. The images depict common scenes, such as a soccer game and vaccine screenings. The dataset can be used to train vision-language models in Chinese.

Get the data.

Image credit: Flickr user Mathias Apitz (München)

Morgan Stevens

Morgan Stevens is a Research Assistant at the Center for Data Innovation. She holds a J.D. from the Sandra Day O'Connor College of Law at Arizona State University and a B.A. in Economics and Government from the University of Texas at Austin.
