Researchers at Stanford University and the University of California, Berkeley have built a dataset called RoboReward for training AI models that grade how well robots perform tasks. The dataset contains videos of robotic arms attempting simple manipulation tasks, such as opening drawers or picking up and placing objects, along with written descriptions of the intended task and human-assigned performance scores. The researchers use the dataset to train vision-language reward models that predict those scores from video of a robot attempting a task. As a result, humans no longer need to watch and label every training attempt; robots can instead use the models' automated feedback to improve their behavior over repeated trials.
