10 Bits: The Data News Hotlist

This week’s list of data news highlights covers February 7-13, 2015 and includes articles about the first Senate hearing on the Internet of Things and how a genomics database is being used to identify the source of foodborne illnesses.

1. Senate Holds First Hearing on the Internet of Things

The Senate Commerce, Science, and Transportation Committee held the first-ever hearing on the Internet of Things. Senators and witnesses discussed how lawmakers can build a regulatory environment for the Internet of Things that does not restrict innovation in this emerging field, which all agreed offers enormous economic benefits. Committee Chairman Sen. John Thune (R-SD) kicked off the hearing by arguing, “Let’s not stifle the Internet of Things before we and consumers have a chance to understand its real promise and implications.”

2. FDA Takes Lighter Approach to Health Data Devices

The Food and Drug Administration (FDA) released final rules about how it will regulate mobile apps and medical device data systems (hardware or software that transfers, stores, or displays data from medical devices). The FDA will not enforce compliance requirements such as premarket review and postmarket reporting for such technologies so long as they do not modify data or control the function of a connected medical device.

3. U.S. Government to Release Indexes of Federal Data

The U.S. government will publish large indexes of federal agency data that has been previously unavailable to the public. These indexes, known as Enterprise Data Inventories, describe the data collected and stored by the government. President Obama’s 2013 open data executive order required federal agencies to build and maintain these indexes, but did not require they be made publicly available. Publishing this information will allow the public to get a better understanding of what data the government has collected but not published.

4. Fighting Pediatric Cancer with Better Data

The Dragon Master Foundation, a U.S.-based charity devoted to developing the technology needed to fight cancer, has partnered with five hospitals to accelerate research about pediatric cancer with data analytics. Though genomic data is increasingly used in cancer research, there has been a lack of progress in using this data, as well as demographic data, for certain types of childhood cancers. The partnership will focus on gathering more biological samples for study and will bolster the Dragon Master Foundation’s efforts to build a database of at least 50,000 human genomes for researchers working towards a cure for these diseases.

5. Helping Computers Understand Emotions from Pictures

A research team from the University of Rochester and Adobe Research has developed a process to train a “deep convolutional neural network,” a type of machine learning algorithm, to interpret sentiment from images. The researchers are attempting to make sentiment analysis of images easier for computers, as people often express themselves online with images rather than only text, which is easier to analyze. The researchers used images from the photo-sharing website Flickr that were already labeled with sentiment tags to develop the algorithm. They hope it could eventually be used to perform large-scale sentiment analysis of social media data, which frequently includes mixed media.

6. Big Data Analytics in Virtual Reality

Internet of Things data analytics company Space-Time Insight is piloting a virtual reality project to make better sense of data. Using the virtual reality headset Oculus Rift, the company’s pilot lets users visualize data coming from a device, such as a faulty transformer, as 3D models to help them address problems. Space-Time Insight hopes its pilot will develop virtual reality as a platform that lets users interact more easily with offsite connected devices and improve decision making.

7. Tracking Foodborne Disease with Genomics

A collaboration between the Food and Drug Administration and federal and state public health laboratories has developed GenomeTrakr, a database of bacterial pathogen genomes, to help identify the sources of foodborne disease outbreaks. GenomeTrakr relies on technology that can sequence the complete genome of an organism at one time and can identify unique features of pathogens much more accurately than previous methods.

8. Department of Energy Pilots Data Infrastructure Projects

The Department of Energy and researchers from the Lawrence Berkeley National Laboratory have developed a series of pilot projects to demonstrate the benefits of data infrastructure designed to facilitate specialized research. For example, one of the projects built a “data pipeline” to more easily share data between supercomputing sites. The goal of the pilots is to make better use of existing tools and develop new ones as needed for scientists to perform real-time analysis of their data.

9. Using Open Data to Explore California’s Parks

Data visualization company Stamen Design has developed CaliParks, an app that relies on open data to let users discover nearby parks, to help support California’s park system, which is plagued by budget troubles. The app pulls data from government sources as well as images from Flickr and Instagram to let users find parks by location or interests, such as fishing or hiking. CaliParks’ creators expect the app to increase the popularity of the park system by connecting the social nature of parks, such as picture taking and group activities, with data about the options available to the public.

10. Involving More Women in Data Science

MassMutual Financial Group has created a Women in Data Science program at Smith and Mount Holyoke Colleges—women’s colleges in Massachusetts—to encourage women to enter the field. Funding from MassMutual will be used to hire faculty and develop a four-year data science curriculum that students at both colleges can participate in. Though the schools already offer some data science courses, classes like statistics are oversubscribed.

Image: Flickr user BagoGames.
