Researchers from Stanford University have created a data set describing how often different terms and concepts co-occur in 20 million clinical notes and patient narratives spanning 19 years. Using this data set, scientists can estimate the probability that a patient with a certain condition will take a certain drug or use a certain device. The data set’s creators hope it will help researchers in a wide range of applications, including predicting the outcomes of medical treatments and analyzing patterns of comorbidity, in which a patient has two or more conditions at once.
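
To make the probability estimate concrete, here is a minimal sketch of how a conditional probability could be computed from co-occurrence counts. It is not the researchers' method or the data set's actual format: the function name, the dictionary layout, and every count below are made up for illustration, and the sketch assumes the counts refer to the same unit (for example, patients) for both single concepts and pairs.

```python
def conditional_probability(pair_counts, concept_counts, condition, other):
    """Estimate P(other | condition) as count(condition, other) / count(condition).

    Returns None when the condition never appears, since the ratio is undefined.
    """
    total = concept_counts.get(condition, 0)
    if total == 0:
        return None
    # Co-occurrence is symmetric, so pairs are keyed by an unordered frozenset.
    pair = frozenset((condition, other))
    return pair_counts.get(pair, 0) / total


# Hypothetical counts, invented purely for this example.
concept_counts = {"type 2 diabetes": 1000, "metformin": 700}
pair_counts = {frozenset(("type 2 diabetes", "metformin")): 600}

p = conditional_probability(
    pair_counts, concept_counts, "type 2 diabetes", "metformin"
)
print(p)
```

With these invented numbers, 600 of the 1,000 patients with the condition also have the drug mentioned, so the estimate is 600 / 1000. The same ratio applies to a device or to a second condition, which is how co-occurrence counts could also be used to look at comorbidity.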