10 Bits: The Data News Hotlist
This week’s list of data news highlights covers January 31 – February 6, 2015 and includes articles about using Twitter posts to predict the stock market and a whole city being transformed into an Internet of Things experiment.
Indiana is using data analytics to help combat its high infant mortality rate. Indiana Governor Mike Pence has launched an initiative to improve services with a new data analytics infrastructure called the Management and Performance Hub, which allows for centralized data sharing and analysis by the state government. The project examined 17 datasets from five different state agencies and four public sources containing information such as financial status, criminal history, and infant health to identify risk factors for infant mortality. It found that although younger mothers are typically likely to have better outcomes, they are considerably less likely to receive Medicaid and thus less likely to receive appropriate prenatal care.
The Smithsonian American Art Museum (SAAM), leading a group of 14 institutions across the country, has received funding to create a shared online database to improve research about American art. SAAM, which was among the first museums to make its entire collection available online through, received grant funding to expand its system to other museums around the country. Other institutions involved in the project include the National Portrait Gallery, the Princeton University Art Museum, and the Archives of American Art.
The Centers for Medicare and Medicaid Services (CMS) will now annually publish records of the payments made to physicians through Medicaid, which amount to tens of billions of dollars per year. This announcement comes after CMS released a year’s worth of this data last April to help combat fraudulent and wasteful payments made to physicians. The data is expected to help investigators, consumer groups, health care providers, and journalists improve efficiencies and identify bad actors abusing the system.
4D Healthware, a Chicago-based startup and member of MATTER, the city’s health technology incubator, uses predictive analytics to connect patients to personalized, preventative treatments for diseases. By packaging patient medical data and factors like where patients live and their body fat percentage into an accessible online platform, 4D Healthware hopes to deliver individualized, actionable recommendations to patients to improve their health. Conceived by former IBM executive Star Cunningham, the startup is currently recruiting healthcare practitioners to help develop the service.
On February 3, the city of Bristol, England, approved a collaboration between the city and the University of Bristol to create an experimental citywide network of sensors that technology companies and research groups can use to develop and test new programs devoted to improving quality of life. The project, called Bristol Is Open, will install sensors around the city to collect data on things like air quality, traffic patterns, and temperature and will make this data available to interested parties who think they can put it to use.
The Federal Election Commission (FEC) has published a downloadable dataset of all the expenditures made by political action committees, political parties, and candidate committees trying to influence an election. The data includes staff salaries, ad buys, and other operating expenditures that have traditionally been difficult for the public to obtain in bulk. The FEC released the data at the request of researchers and other interested parties who have long sought access to it.
A startup called Clarifai has developed a program that uses deep learning, a type of machine learning that attempts to interpret abstract data, to quickly recognize 10,000 types of objects and scenes in videos. The software is capable of analyzing a video faster than a human could even watch it: a three-and-a-half-minute clip can be processed in just 10 seconds. Clarifai hopes to apply this technology to do things like match online ads to content or develop new methods of organizing and editing videos.
Investors are starting to incorporate traditionally hard-to-quantify “soft data”, like social media data, into their assessments of the stock market. By combining this information with hard data like share prices and financial assessments of companies, investors are hoping to gain new insights into how the stock market will act. Social media signals were difficult to quantify until recently, when technologies like sentiment analysis began to allow analysts to predict stock price trends based on what people write on Twitter.
Artificial intelligence researcher and computer scientist Randal Olson used machine learning to develop the optimal search strategy for the Where’s Waldo? book series. By analyzing the 68 locations of Waldo across seven Where’s Waldo? books, Olson identified location trends and developed an algorithm to indicate the quickest search path to finding Waldo on a page. One interesting finding from the analysis is that Waldo almost never appears in the top left corner or along the edges of pages and never appears at the bottom right of a page.
Under the federal government budget request submitted to Congress by President Obama, the Office of the National Coordinator for Health Information Technology (ONC) would receive $92 million in funding—a $32 million increase from the past two years. The main focus of this increased budget is the improvement of health IT interoperability to increase quality of care. Specifically, the budget would improve electronic health records, invest in developing new interoperability standards, and provide IT support for Obama’s recently proposed precision medicine initiative. The budget also aims to expand Medicare data sharing to increase transparency and improve quality of care in the Medicare program.
Image: Flickr user William Murphy.