Google, working with African research institutions, has released WAXAL, an open speech dataset covering 21 Sub-Saharan African languages, including Hausa, Igbo, and Swahili. The dataset contains over 11,000 hours of speech from nearly 2 million recordings. That total includes 1,250 hours of transcribed everyday speech and more than 20 hours of clear, studio-quality voice recordings intended for building text-to-speech systems.
