Mozilla has published the latest dataset from its Common Voice project, which aims to spur the development of voice-enabled technologies. The dataset consists of nearly 1,400 hours of recordings from 42,000 individuals speaking a total of 18 different languages. In addition, the dataset includes labels such as the age, sex, and accent of contributors who opted in to provide the metadata.