Chinese technology company Tencent has released Tencent ML-Images, a dataset containing 18 million images across 11,000 categories. The dataset combines images from previously released datasets, removing images labeled with abstract categories like “event” or “summer” and placing other images into more fine-grained categories, for example by separating images of dogs by breed. On average, there are nearly 1,450 images per category. Tencent ML-Images also has an average of eight labels per image. Many image datasets assign each image only a single label, which often cannot describe every important object in the image and so wastes visual information that could be used to train models.