This week’s list of data news highlights covers November 16-22 and includes articles on a system to algorithmically generate novel recipes and a wearable computing prototype that tracks users’ sitting habits.
1. Algorithmic Chef Creates Novel Recipes
A team of IBM researchers has developed an algorithm to produce novel food recipes. Users define a key ingredient, a type of dish and a regional cuisine style, and the program outputs dishes it determines are the most likely to be tasty. Trained on a dataset of millions of existing recipes, as well as information on the molecular compatibility of different ingredients, the algorithm ranks its outputs based on novelty and chemical content, which acts as a proxy for palatability.
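The ranking step described above can be illustrated with a minimal sketch. It is not IBM's implementation: the `Candidate` class, the equal weighting, and every score below are hypothetical, chosen only to show how a novelty score and a chemical-compatibility score might be combined into a single ranking.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    novelty: float        # hypothetical score in [0, 1]; higher = more unusual
    compatibility: float  # hypothetical score in [0, 1]; stand-in for palatability


def rank_candidates(candidates, novelty_weight=0.5):
    """Sort candidates by a weighted blend of novelty and compatibility."""
    def score(c):
        return novelty_weight * c.novelty + (1 - novelty_weight) * c.compatibility
    return sorted(candidates, key=score, reverse=True)


# Made-up example dishes and scores, for illustration only.
candidates = [
    Candidate("strawberry-basil tart", novelty=0.8, compatibility=0.7),
    Candidate("plain apple pie", novelty=0.1, compatibility=0.9),
    Candidate("anchovy ice cream", novelty=0.9, compatibility=0.2),
]

ranked = rank_candidates(candidates)
for c in ranked:
    print(c.name)
```

With these invented numbers, a dish that is both fairly novel and fairly compatible outranks one that is only novel or only familiar, which is the trade-off the article describes.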
2. Diagnosis Controversy Points to Poor Underlying Data
An online calculator that doctors use to make cholesterol medication prescription decisions may be overestimating people’s risk of heart attacks and strokes, resulting in more people taking certain types of drugs. The drugs, known as statins, have been effective in treating high-risk patients, such as those who have already had a heart attack, but prescribing the drugs for purely preventative purposes has been somewhat controversial. Some of the calculator’s inaccuracies may stem from a lack of long-term data on patients who use statins, in part due to the lack of interoperable electronic health record systems in the United States.
3. Algorithms to Predict Atrocities
The U.S. Agency for International Development (USAID) and the nonprofit Humanity United awarded the top prize in their recent “Tech Challenge for Atrocity Prevention” to an algorithm that takes in geographic, sociopolitical and historical data and predicts when and where human rights violations might occur. The model beat out nearly 618 others submitted for the challenge. USAID and Humanity United hope the winning algorithms will be used by human rights organizations and governments to identify high-risk regions.
4. DATA Act Passes House
H.R. 2061, the Digital Accountability and Transparency Act (DATA Act), passed the U.S. House of Representatives this week. The Act, sponsored by Rep. Darrell Issa (R-CA), would standardize and publish a variety of U.S. government financial data. A companion bill awaits consideration in the Senate.
5. Cloud Accelerates Science Data Processing
Cloud computing startup Cycle Computing recently helped a University of Southern California researcher analyze over 200,000 chemical compounds to find candidates for new solar energy materials. On a single modern processor, the task would have taken hundreds of years; massively parallel computing finished the job in only 18 hours. Cycle Computing operates around 17,000 cloud-based processors, and its CEO, Jason Stowe, predicts that such infrastructure will greatly increase the pace of science and innovation in the future.
6. Wearable Tech Helps Users Rise
Rise, a wearable sensor that tracks when its wearer sits down, promises to help health-conscious users change their working habits. Rise will prompt frequent sitters to take a stretch break and walk around, and will allow users to track their progress and compete with coworkers. The product, which has been prototyped and awaits funding through an Indiegogo campaign, is expected to retail for around $40.
7. Visualizing Interstate Migration
A visualization of state-to-state migration based on data from the American Community Survey was released this week. The visualization communicates the complexities of interstate migration using wider links to represent larger migrant populations and colors to represent U.S. regions. It also shows that a large number of individuals migrate from New York to Florida (retirees, creator Chris Walker speculates) and that more people left California in 2012 than entered (possibly due to the state’s increasing cost of living).
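The two ideas behind the visualization, link width scaled to migrant counts and a state losing more people than it gains, can be sketched in a few lines. This is an illustration only: the flow numbers are made up and do not come from the American Community Survey, and the function names are hypothetical rather than taken from the visualization's code.

```python
from collections import defaultdict

# Hypothetical flows: (origin, destination) -> number of migrants.
# These counts are invented for illustration, not real survey data.
flows = {
    ("New York", "Florida"): 600,
    ("Florida", "New York"): 250,
    ("California", "Texas"): 500,
    ("Texas", "California"): 300,
}


def net_migration(flows):
    """Return arrivals minus departures for every state in the flows."""
    net = defaultdict(int)
    for (origin, dest), count in flows.items():
        net[origin] -= count  # people leaving the origin state
        net[dest] += count    # people arriving in the destination state
    return dict(net)


def link_widths(flows, max_width=10.0):
    """Scale each link's width linearly so the largest flow gets max_width."""
    largest = max(flows.values())
    return {pair: max_width * count / largest for pair, count in flows.items()}


net = net_migration(flows)
widths = link_widths(flows)
print(net)
print(widths)
```

A negative net value marks a state that more people left than entered, which is the pattern the article reports for California in 2012.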
8. Art Show Explores Big Data
The “Big Data Art 2013” show in Munich, Germany, challenged artists to create works around the theme of data and analysis. The show, which includes electronic displays, paintings and music, engages with aesthetic as well as political issues related to the world’s increased use of data.
9. Network Mines Facebook Data To Boost Ratings
The Bravo TV network is using Facebook data to glean insights from fan discussions about its shows. Last month, the network began using one of Facebook’s application programming interfaces (APIs) to gather granular demographic information about the audience discussing its “Real Housewives” series. Network representatives have expressed hopes that Facebook will eventually offer sentiment analysis scores through the API.
10. A Step Forward for Brazilian Open Data
The Brazilian state of Minas Gerais has released an online data visualization tool called DataViva. The tool is focused on economic data, including information on workers, industries and exports. The DataViva team hopes the tool, which was developed in collaboration with professors from Harvard and MIT, will contribute both to policy discussions and academic research.