This week’s list of data news highlights covers April 15-21, 2017, and includes articles about a new tool for analyzing government spending and a new method for treating addiction with smartphone data.
1. Finding Mass Graves with Machine Learning
A team of researchers from the Ibero-American University in Mexico City, the Mexican nonprofit Data Cívica, and the San Francisco-based Human Rights Data Analysis Group is using machine learning to predict likely locations of mass graves in Mexico, which are the result of the high rates of drug-related violence in the country. The researchers scrape news reports for mentions of hidden graves to populate a database, analyze geographic and demographic data to build profiles for each of Mexico’s 2,457 counties, and use machine learning to predict which counties are most likely to have mass graves. Using historical data from 2013, their machine learning model was able to predict counties where mass graves were found in 2014 with 100 percent accuracy.
2. Making Government Financial Data More Accessible
Former Microsoft chief executive Steve Ballmer has launched a freely accessible database called USAFacts that aggregates 30 years of data about local, state, and federal government finances and presents it in an easy-to-interpret format. USAFacts combines spending data from over 70 different government agencies, including data that was already open, as well as scraped data from PDFs and other non-machine-readable formats, and generates straightforward, interactive data visualizations that allow users to understand data in context, access related data, and view changes in government finances over time.
3. Analyzing Smartphone Data to Fight Addiction
Chicago-based startup Triggr Health has developed a smartphone app to help people battle addiction by analyzing users’ smartphone usage data to predict relapses and prompt interventions. The Triggr Health app gathers data such as a phone’s sleep history, location, and texting patterns and combines this with data users provide about their drug use history. Then, using machine learning, the app identifies patterns indicating a user is at high risk of relapsing and enables Triggr Health staff to contact the user’s care team, such as staff at a rehabilitation facility. The app could make drug addiction treatment much more affordable, as a 30-day inpatient treatment program can cost as much as $17,000.
4. Diagnosing Cystic Fibrosis from Sweat
Researchers at Stanford University and the University of California, Berkeley, have developed a wearable sensor that can analyze a wearer’s sweat to diagnose cystic fibrosis. People with cystic fibrosis have altered levels of sodium and chloride in their sweat, and traditional diagnostic methods are time-consuming and require patients to visit specialized facilities. The sensor adheres to a patient’s skin, analyzes the molecular content of their sweat, and transmits this data via Bluetooth to a smartphone, allowing patients to be diagnosed without having to visit specialized facilities and sit still for hours. The sensor can also measure glucose levels in sweat, which could help monitor patients with or at risk of diabetes.
5. Turning Gamers into Citizen Scientists
Massively Multiplayer Online Science (MMOS), an organization that pairs video game makers with scientists, has partnered with the developers of the online space video game EVE Online to encourage players to sort through satellite imagery to help discover exoplanets. Inside the game, players will be given access to 167,000 light curve images collected by the European Space Agency and encouraged to flag anything interesting. Once an object is flagged five times, researchers at the University of Geneva will investigate it further to determine whether or not the object is a known exoplanet. Though algorithms can analyze satellite imagery to an extent, light curve images are complicated and require human analysis to interpret.
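The five-flag rule described above can be illustrated with a short sketch. This is not MMOS's or EVE Online's actual code: the function name, the `(player_id, light_curve_id)` event format, and the choice to count each player only once per light curve are all assumptions made for illustration. Only the threshold of five comes from the article.

```python
from collections import Counter

FLAG_THRESHOLD = 5  # from the article: five flags trigger a follow-up


def curves_to_review(flags, threshold=FLAG_THRESHOLD):
    """Return IDs of light curves flagged by at least `threshold` distinct players.

    `flags` is an iterable of (player_id, light_curve_id) pairs (hypothetical format).
    """
    # Assumption: repeated flags from the same player on the same curve count once.
    unique_flags = set(flags)
    counts = Counter(curve_id for _player, curve_id in unique_flags)
    return sorted(curve_id for curve_id, n in counts.items() if n >= threshold)


# Made-up flag events, purely for illustration.
flags = [(f"player{i}", "curve-A") for i in range(5)]   # five distinct players
flags += [(f"player{i}", "curve-B") for i in range(3)]  # only three players
flags += [("player0", "curve-C")] * 6                   # one player, six flags

print(curves_to_review(flags))  # ['curve-A']
```

Requiring several independent flags before a researcher looks at an object is a simple way to filter out stray clicks while still surfacing anything that multiple players found interesting.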
6. Making Self-Driving Technology Open Source
Baidu has launched a new platform, Project Apollo, that will make the software it has developed for its self-driving car project freely available as open source. By making self-driving algorithms, such as those involved in route planning and detecting obstacles, publicly available, Baidu hopes to accelerate the development of the autonomous vehicle industry as a whole.
7. Teaching Machines to Learn like Brains
The U.S. Defense Advanced Research Projects Agency (DARPA) has launched a four-year program called Lifelong Learning Machines (L2M) to develop AI systems capable of continuously improving with experience and applying what they know to new situations in the same way that biological systems can. While machine learning systems can learn from new data, it is difficult, if not impossible, for them to generalize what they learn and use it to solve different kinds of problems.
8. Collecting a Huge Amount of Health Data to Predict Diseases
Alphabet’s life sciences subsidiary Verily has launched an initiative called Project Baseline that will collect massive amounts of health data from 10,000 volunteers over four years to identify predictors of heart disease and cancer. Participants will contribute data from a wearable device that tracks their pulse and movements, a sensor to place under their mattresses that tracks sleep patterns, and a wide variety of medical tests, such as heart scans, genome sequencing, and analysis of bodily fluids. Researchers at Verily and Duke University will analyze this data to identify disease predictors and then, after two years, make this data available to other researchers.
9. Building Better Sensors for Self-Driving Cars
Silicon Valley-based Velodyne has developed a solid-state LIDAR sensor for self-driving cars that can be produced more easily and for less money than traditional LIDAR systems, potentially lowering the high costs involved in developing self-driving systems. Normally, LIDAR systems involve an array of sensors and lasers that constantly rotate to map the nearby environment in 360 degrees. However, these systems are difficult to mass produce because they require extensive calibration and cost upwards of $10,000. Velodyne’s new solid-state LIDAR sensors, which are stationary, only have a 120-degree horizontal field of view, but are just a few square inches in size, more rugged than normal LIDAR systems, and can be produced for under $1,000.
10. Boosting Hawaii’s Geospatial Data
Hawaii’s Office of Planning and Office of Enterprise Technology Services have partnered with Esri to develop a portal that will visualize the state’s open geospatial data. Hawaii already publishes this information, but the datasets alone are challenging for the average person to interpret. The new portal provides maps and data visualizations, along with the raw data, to make it more accessible, and users can sort through the data based on different categories, such as infrastructure and historic maps.
Image: CCP Games.