When British data scientist Clive Humby coined the popular maxim “data is the new oil” in 2006, people interpreted it to mean data had become the most valuable commodity in the modern economy. But while it is certainly true that data has become invaluable, the oil analogy is fundamentally flawed, and thinking of it that way risks breathing life into ill-conceived ideas for limiting how businesses collect and use data—which would limit its economic benefits.
To be sure, there are certain similarities between data and oil. Both clearly are crucial inputs for the economy, powering all kinds of commerce, both directly and indirectly. And data, like oil, must be processed to create something of value. As Association of National Advertisers vice president Michael Palmer puts it, “Data is just like crude. It’s valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals, etc. to create a valuable entity that drives profitable activity; so must data be broken down, analyzed for it to have value.” But beyond those sorts of vague generalities, the likeness between data and oil strains credulity.
The first and perhaps most important difference is that oil, like other tangible goods, is rivalrous. When one party uses a barrel of oil, it is no longer available for anyone else. Data, on the other hand, is non-rivalrous: Multiple companies can collect, share, and use the same data simultaneously. That goes for consumers, too: When consumers “pay with data” to access a website, they still have the same amount of data after the transaction as before. Moreover, as big data strategist Paul Sonderegger of Oracle describes it, data is non-fungible; one piece of data cannot necessarily be substituted for another the way barrels of oil can.
Given these obvious differences, it would be foolish to use the “data is oil” analogy to extrapolate potential concerns about the collection and use of data. Yet some policymakers fall into that very trap. For example, many policy discussions about privacy rest on the false assumption that because personal data is valuable in the same way oil and other commodities are valuable, strong privacy protections would help ensure individuals can capitalize on their data the same way a landowner benefits from owning a plot of land with oil under it.
Similarly, some policymakers see tech companies that have collected large stores of data as the new oil cartels, and they mistakenly fear “Big Data” is becoming the new “Big Oil.” But consumers benefit from the network effects and economies of scale that arise with these large companies. Unlike the oil industry, which, like most industries, has an upward-sloping supply curve where marginal costs increase with higher production, data-producing firms generally have marginal costs near zero, which drives down consumer prices as the firms get larger. Moreover, the amount of data a company has rarely insulates it from competition and rarely serves as a serious barrier to entry. That is because how a company uses data, rather than its mere possession of it, is what most often determines a company’s competitiveness. For example, MySpace had a several-year head start on Facebook when it came to collecting user data, but Facebook quickly surpassed MySpace in popularity because it created a better product.
In cases where companies restrict data access solely for the purpose of reducing competition, policymakers can and should intervene. For example, some airlines restrict third-party access to their flight availability and pricing data. Doing so prevents services such as TripAdvisor or Hipmunk from allowing consumers to easily compare fares across multiple airlines. There is nothing inherently anti-competitive about having proprietary data; however, where there is no legitimate business justification to restrict access to data and doing so reduces competition and market transparency, a regulatory requirement to make data available would not inhibit business operations. On the other hand, if data were like oil, a requirement for an oil-rich company to share its oil with a third party would be absurd.
In fact, policymakers should recognize that sharing data is not a zero-sum game and that businesses and consumers choose to share data because doing so is mutually beneficial. And with cloud computing, sharing is increasingly cheap and easy. For example, over the past three years, most major pharmaceutical companies have begun sharing historical clinical trial data with outside researchers, including competitors, rather than hoarding this information for competitive advantage. Researchers can use this data to accelerate drug development, better understand diseases, and design more efficient clinical trials.
Policymakers should take an interest in understanding how data is transforming the economy, but looking to oil as a historical example is not productive. Data and oil have fundamentally different economic properties, and misapplying the analogy could undercut our ability to use data for the benefit of all. The “data is the new oil” analogy was a useful way to explain the value of data, but as a reference for policymakers, this analogy is too crude.
Image: John Hill.