Training Small Language Models

by Morgan Stevens, May 19, 2023

Researchers at Microsoft have created a dataset of short stories to better train small language models, which require less computing power and fewer resources than larger models. The team used GPT-3.5 and GPT-4 to generate short, simple stories that use only words a three- to four-year-old child could understand. They then used the dataset to train small language models to produce stories comparable to those produced by larger language models.

Get the data.

Image credit: Flickr user David Masters

Morgan Stevens

Morgan Stevens is a Research Assistant at the Center for Data Innovation. She holds a J.D. from the Sandra Day O'Connor College of Law at Arizona State University and a B.A. in Economics and Government from the University of Texas at Austin.
