10 Bits: the Data News Hotlist
This week’s list of data news highlights covers March 31 – April 6, 2018, and includes articles about how Australia is using AI to help prevent shark attacks and a predictive analytics tool that can track bird migration.
The Estonian government has launched an initiative to provide 100,000 residents with free genetic sequencing and personalized information about their health risks. Estonia will pool participants’ genetic data in a national database to facilitate research and send information about participants’ genomes, such as genetic predisposition to certain diseases, to their doctors, who can explain to participants what this information means. Estonia will eventually offer this service to all of its 1.3 million residents.
Australia is experimenting with a variety of shark surveillance systems that use AI to identify and track sharks near beaches to prevent shark attacks. One system relies on drones that patrol dozens of beaches, using cameras and AI to identify swimming hazards and to differentiate between swimmers, sharks, boats, and other objects in real time. Another system, called Clever Buoy, uses underwater sonar arrays and an algorithm to identify and track marine animals. Clever Buoy can distinguish between animal species and automatically text lifeguards with GPS coordinates when it detects a shark.
Data scientist Kenji Doi has developed a machine learning system that can analyze a picture of a bowl of ramen and identify which of 41 different ramen shops made it with 95 percent accuracy. Doi created a dataset of 48,000 pictures of ramen bowls from 41 different locations of Ramen Jiro, a popular ramen franchise in Japan that serves a standardized menu, and trained machine learning models on it with a tool from Google called AutoML. Because the restaurants often had the same kinds of bowls and tables, Doi believes his system was able to identify each dish’s restaurant of origin based on subtle cues such as the arrangement of toppings or the cut of meat used.
Researchers at Massachusetts General Hospital, the Massachusetts Institute of Technology, and the University of Michigan have developed a machine learning system that can analyze electronic health records to predict the risk that patients will develop the bacterial infection C. difficile. C. difficile kills 30,000 people in the United States every year and is resistant to many antibiotics, making early detection crucial for treatment. The researchers trained their system on medical records from 257,000 patients at two different hospitals to develop specialized models that predict the risk of infection based on risk factors unique to each hospital.
Astronomers at the Massachusetts Institute of Technology have developed a machine learning system that can analyze telescopic images from NASA and identify debris disks, which are clouds of debris that orbit stars and are good indicators that an exoplanet is present. NASA has made large amounts of telescopic data publicly available in an effort to crowdsource the analysis of debris disks to identify new planets, but it would be impractical to rely on humans to sift through such large amounts of data. The researchers trained their system on images that humans had already identified as containing debris disks and then applied it to new images, discovering 367 previously unidentified celestial objects.
Stanford University researchers have used an analytics technique called word embeddings, which maps relationships between words, to track how gender and racial stereotypes have changed in the United States over the last 100 years. The researchers used a machine learning algorithm to analyze U.S. books, newspapers, and other documents from throughout the last century and correlate linguistic changes to demographic shifts. By applying this algorithm to 200 million words, collected from documents published between 1910 and the present, the researchers found that words related to competence, such as “resourceful,” were slowly losing their masculine connotation, while words describing physical appearance remained strongly associated with femininity.
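As a rough illustration of how word embeddings can measure this kind of association, the sketch below scores a word by how much closer its vector sits to a male reference word than to a female one. It is only a minimal sketch: the three-dimensional vectors, the decade-labeled entries, and the `gender_association` helper are invented for illustration and are not the Stanford researchers' data or method, which the article does not describe in detail.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def gender_association(word_vec, male_vec, female_vec):
    """Positive when the word sits closer to the male reference vector,
    negative when it sits closer to the female one (hypothetical helper)."""
    return cosine(word_vec, male_vec) - cosine(word_vec, female_vec)


# Toy vectors, made up for this example. Real embeddings have hundreds of
# dimensions and are learned separately from the text of each time period.
toy_embeddings = {
    "he": [1.0, 0.0, 0.2],
    "she": [0.0, 1.0, 0.2],
    "resourceful_1910": [0.9, 0.1, 0.5],
    "resourceful_1990": [0.5, 0.4, 0.5],
}

score_1910 = gender_association(
    toy_embeddings["resourceful_1910"], toy_embeddings["he"], toy_embeddings["she"]
)
score_1990 = gender_association(
    toy_embeddings["resourceful_1990"], toy_embeddings["he"], toy_embeddings["she"]
)

# With these toy numbers the score shrinks toward zero over time, which is
# the shape of the "slowly losing its masculine connotation" finding.
print(f"1910: {score_1910:.3f}  1990: {score_1990:.3f}")
```

Comparing the same score across embeddings trained on different decades is one simple way to turn "changing connotation" into a number that can be tracked over time.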
Researchers at the Massachusetts Institute of Technology have developed a headset that can sense neuromuscular activity in a wearer’s jaw and face and use machine learning to translate these signals into words without a wearer having to vocalize them. Internal verbalization, also known as subvocalization, is when a person makes subtle movements in their facial muscles as if they were speaking but without actually talking. The researchers trained their system to match vibration patterns from subvocalization to words, giving it a transcription accuracy of 92 percent.
Researchers at the University of Oxford and Cornell University have developed a predictive analytics tool called BirdCast that can forecast migratory patterns of birds up to three days in advance, which could allow wind turbine operators to take precautionary measures to avoid killing birds. The researchers built a model using historical weather radar data, which can detect the density of birds in a particular area, from 143 sites across the United States for every night from 1995 to 2017, allowing them to create the first-ever estimate of the number of birds migrating in the country at any one time. Then, the researchers incorporated weather data into the model, allowing it to learn how differences in air temperature, wind, and other factors influence migration patterns.
Researchers at Plymouth University have developed an AI system that analyzes atmospheric data about planets and classifies them based on how closely they resemble Mars, Venus, Saturn’s moon Titan, early Earth, or present-day Earth, which are the bodies in the solar system most likely to be habitable. This system could help guide future space exploration efforts by identifying planets with a high likelihood of supporting life.
A startup called Seqster has developed a health data platform that allows people with Alzheimer’s disease and their families to link their electronic health records, genomic data, and data from wearable devices. The platform allows users’ doctors to better monitor symptoms and compiles comprehensive data about the disease’s progression that patients can share with researchers.
Image: Tommy Hansen.