
Translating Classical Japanese Literature

by Michael McLaughlin
Kuzushiji

Researchers from the Japanese government, academia, research institutes, and Google have published three datasets of Japanese script to help preserve Japanese cultural knowledge. The datasets contain nearly 500,000 images of characters written in Kuzushiji, the classical Japanese cursive script, which most native Japanese speakers cannot read because the writing style is no longer part of the official school curriculum. The researchers labeled each image with its modern equivalent, one of 4,000 characters. Millions of classical Japanese books are written in Kuzushiji, and these datasets could promote the development of machine learning algorithms that translate Kuzushiji into the modern Japanese writing system.

Get the data.

Image: mxbi
