Researchers at the University of California, Berkeley, the Georgia Institute of Technology, and Harvard University have created a dataset of prompt injection attacks, which are third party attempts to maliciously manipulate the results of a large language model. It contains over 126,000 prompt injection attacks as well as over 46,000 defenses against attacks. Researchers can use the dataset to strengthen large language models’ defenses.
Image credit: Flickr user Christiaan Colen