Researchers from Knowledge, Information Technology, and the Arabic Book (KITAB), a project that builds digital tools for analyzing Arabic writing, have released a dataset of more than 4,000 Arabic texts to help construct the first machine-readable corpus of premodern Islamicate texts. The texts include works by nearly 2,000 authors and together contain more than a billion words. The dataset can be used to develop algorithms that identify relationships between ideas within Arabic texts.
Image: Wellcome Images