The Center for Data Innovation spoke with Ky Harlin, vice president of growth and data science at Condé Nast, a mass media company based in New York City responsible for publications such as Vanity Fair, Epicurious, and Wired. Harlin discussed the role of data science in generating viral content and how data scientists can be valuable assets to media companies.
This interview has been lightly edited.
Joshua New: Prior to your current role, you were director of data science at BuzzFeed, a website known for its viral content. How is a data scientist involved in helping content go viral?
Ky Harlin: I do not think anyone can really make something viral, as certain content simply lacks the shareability needed to really spread. However, data scientists can aid in the creation and distribution of content, in both cases aiming to increase the chances of a hit. For example, data about early interactions with a piece of content can help determine how it is promoted. In other words, knowing if an article is resonating with people and knowing who those people are helps determine how much to show it to users and where it will reach the right audience. Another example is analyzing social media commentary to figure out what to write about. Data scientists can mine these conversations to find topics relevant to readers at a given time, as well as who and where those readers are. These kinds of things, as well as many others, are currently happening at Condé Nast, BuzzFeed, and almost all digital media companies.
A common thread through these applications is the idea of a feedback mechanism. By incorporating information about past events (be they one year ago or one millisecond ago) into expectations of the future, you can narrow your efforts to the things most likely to be impactful. The role of the data scientist is, at least in part, helping to create these feedback mechanisms.
New: And prior to BuzzFeed, you worked in medical research, developing analytics methods and tools for nuclear imaging. What does this have in common with your work with media companies?
Harlin: Basically, both jobs involved a similar combination of math and coding. In the imaging work, we utilized mathematical techniques and data processing technologies to build software that helped researchers conduct imaging studies. In data science for media, we use many similar techniques and technologies to help build software to connect users with content as efficiently and effectively as possible. At the same time, the two jobs are very different for the obvious reason that each requires specific domain expertise. This sounds simple, but it’s a huge part of being a data scientist that I think is often undervalued. Math and coding skills alone are not enough to be effective. A deep understanding of the company and the field is essential to knowing how to approach and solve problems in a way that will be impactful.
New: You’ve talked about how analyzing the types of content people view can reveal some interesting connections, such as how people interested in Jennifer Lawrence are also interested in penguins. Can you describe how you identify these kinds of relationships, and what the benefits of this kind of insight might be?
Harlin: The methodology is actually pretty simple at its core. Consider making a list of the articles a user viewed in a given time period, and then doing this for every user. By cross-referencing these lists, you can identify overlaps in the audiences of people who viewed a given set of stories. Then, by evaluating the most significant overlaps, these (sometimes strange) connections begin to emerge. At a minimum, such insights help content creators better understand their audience. Moreover, they can be used to determine which stories to recommend to a given user. This is the same basic approach employed by companies like Netflix or Amazon to make recommendations.
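Editor's illustration: the cross-referencing Harlin describes can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the method used at Condé Nast or BuzzFeed. The function name `audience_overlaps`, the `min_shared` threshold, the sample view log, and the choice of "lift" (observed overlap relative to what independent audiences would produce) as the measure of a significant overlap are all assumptions made for this example; the interview does not name a specific statistic.

```python
from collections import defaultdict
from itertools import combinations


def audience_overlaps(views, min_shared=2):
    """Rank article pairs by how strongly their audiences overlap.

    views: iterable of (user_id, article_id) pairs (hypothetical log format).
    Returns a list of (article_a, article_b, shared_users, lift) tuples,
    sorted from highest lift to lowest.
    """
    audiences = defaultdict(set)  # article -> set of users who viewed it
    users = set()
    for user, article in views:
        audiences[article].add(user)
        users.add(user)

    results = []
    for a, b in combinations(sorted(audiences), 2):
        shared = len(audiences[a] & audiences[b])
        if shared < min_shared:
            continue  # ignore overlaps too small to be meaningful
        # Lift above 1 means the two audiences overlap more than
        # they would if viewing the two articles were independent.
        lift = shared * len(users) / (len(audiences[a]) * len(audiences[b]))
        results.append((a, b, shared, lift))

    results.sort(key=lambda row: row[3], reverse=True)
    return results


# Made-up sample data, for illustration only.
sample_views = [
    ("u1", "jennifer-lawrence"), ("u1", "penguins"),
    ("u2", "jennifer-lawrence"), ("u2", "penguins"),
    ("u3", "jennifer-lawrence"), ("u3", "politics"),
    ("u4", "politics"), ("u4", "sports"),
    ("u5", "politics"), ("u5", "sports"),
    ("u6", "penguins"),
]

for a, b, shared, lift in audience_overlaps(sample_views):
    print(f"{a} + {b}: {shared} shared readers, lift {lift:.2f}")
```

Pairs that survive the threshold with a lift well above 1 are candidates for the "sometimes strange" connections mentioned above, and the same ranked list could feed a simple "readers of this story also viewed" recommendation.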
New: You were named one of 21 New Media Innovators by New York Magazine. Others on this list are primarily journalists, editors, or founders of new platforms. Why do you think your efforts, or data science in general, are so valuable to media companies?
Harlin: Digital content has one huge advantage over other formats: you can collect data about how people interact with it. However, it requires a certain set of skills to make that data useful. Among other things, it’s often immense and messy, which increases the difficulty of properly analyzing it. A data scientist has the right combination of skills to overcome these obstacles and unlock impactful uses of this data. Media companies realize this, and thus the job is now an essential part of many of their efforts.
New: Condé Nast is not just an online media company, as it oversees a laundry list of physical magazine publications. How does this affect your work?
Harlin: In general, we take the approach of creating content that works for the platform where it lives. We’re not terribly interested in just creating copies of the same material across different platforms. What works in a print magazine might not work in a digital edition, and vice versa. In this way, the print and digital efforts are fairly separate. The entire process—from deciding what to write about to how to present it—is approached with a different medium and audience in mind. That said, we do highly value the free exchange of research and data between the two groups. Without that, we would not have as complete an understanding of how people are consuming content today. Moreover, a lot of print subscribers also use our digital properties, which allows us to enhance their digital experience. So while print is not a main focus of my work, it definitely does affect my job.