The Center for Data Innovation spoke with César A. Hidalgo, an MIT Media Lab professor who helped lead a team that was contracted by the Brazilian government to create a visualization tool for the nation’s economic data. The tool, DataViva, launched formally in the Brazilian state of Minas Gerais today.
Travis Korte: For readers who may be unfamiliar, can you briefly introduce DataViva, and discuss what the DataViva team is hoping to change about the current state of economic data in Brazil?
César A. Hidalgo: DataViva is a data visualization engine that is opening up data for the entire formal sector of Brazil’s economy through more than 100 million interactive visualizations.
TK: DataViva’s eight apps lend themselves to a broad range of analysis. Did you take inspiration from other open data portals around the world, and, if so, which ones?
CH: We have always been visualization enthusiasts, so we do draw inspiration implicitly and explicitly from other people’s work (Mike Bostock certainly being one of them). [Ed. note: Bostock, the graphics editor of the New York Times, created the popular JavaScript visualization library d3, upon which DataViva’s front end is based.] The visualizations in DataViva, however, were customized and adapted to the specific data that DataViva is delivering. The visualizations are split into three main categories: descriptive apps (treemap, stacked, geomap), which can help you understand the present level of development of an industry or a location; predictive apps (networks, ring, scatter), which show the possibilities implied by what is currently there; and prescriptive apps (occugrid), which can help you understand which occupations are missing for someone wanting to develop an industry in a given location (e.g., manufacturing aircraft in Santa Rita do Sapucai).
TK: Brazil has been somewhat proactive about open data in the past, but a recent Open Knowledge Foundation (OKFN) analysis found that federal geospatial offerings in that country leave a lot to be desired. Did your team experience this when building the tool, and how did you contend with it?
CH: In my experience, working with data from the Brazilian government has been a breeze. I have no comments on the OKFN analysis, since I am not aware of the methodology they used to draw this conclusion. What I can say is that, now that DataViva is out, Brazil has the best open data site on the planet, hands-down. If this is not the case, I want the URL of the counterexample.
TK: You’ve open-sourced the code behind DataViva. Do you foresee other countries using the tool (maybe even with non-economic data) in the future, or was this project strictly Brazil-specific?
CH: By open-sourcing the code we expect people from many different domains of life to use these apps to satisfy their visualization needs. Some of these people might be governments looking to visualize health data; others might be people looking to visualize the statistics of their favorite sports teams. Either way, the code will be useful to them. That’s the beauty of open source and the creation of versatile visualization tools.
TK: A lot of governments want to see proof of the economic value of open data before they undertake big releases. Have you heard about any Brazilian groups who might want to use DataViva in a business context? Or, if not, what commercial applications do you see benefitting from the tool?
CH: The cost of DataViva was 0.5 cents of a Brazilian real, or 0.2 cents of a dollar, per visualization. People can compare that cost with that of the PowerPoints they buy from the same old consultants we all know about. In my experience, we have found DataViva to reduce the time it takes to create a presentation regarding a location or industry from months to days. This makes the tool extremely attractive for people screening investment locations. I have shown the tool to people in large multinational companies, which I will not mention, that have activities in Brazil or are looking to invest. In all cases, they have been blown away by how easy it is for them to explore the industrial structure of locations, the occupations available, and the salaries that these occupations command. Going forward, I believe that tools like DataViva will become an essential component of investment decisions.