10 Bits: The Data News Hot List
This week’s list of data news highlights covers July 26-August 1 and includes articles about a new House bill to advance clinical data registries and genomic data analysis efforts from Google and startup Human Longevity.
Genomics startup Human Longevity wants to develop computational tools to turn genomic data into insights, and it has hired former Google Translate head Franz Och as its chief data scientist. The startup was launched by Craig Venter, best known for his pioneering work sequencing the human genome, and will mine data from tens of thousands of genomes annually to develop personalized treatments, early diagnosis tools, and other products to combat diseases associated with aging. Venter has stated that the company may sell these insights and data to pharmaceutical companies.
IBM has partnered with the United Services Automobile Association (USAA) to create the IBM Watson Engagement Advisor, which will use IBM’s artificial intelligence platform to help people transition from military to civilian life. The system, which analyzes USAA’s thousands of documents related to financial and health care benefits, can answer complex questions about how individual users can best take advantage of different benefits. The Engagement Advisor is currently available to USAA members on USAA’s website. Users are being asked to give feedback on Watson’s performance, and that feedback will be incorporated into the system’s machine learning algorithms to improve it.
Last year, Bay Area-based data analytics company Palantir donated data it had collected from social networks and other sources to the Carter Center, a human rights nonprofit, to help the center find relationships and track funding among major players in the ongoing Syrian conflict. YouTube videos, which make up the majority of the data, allow analysts to track events and document nearly 6,000 armed groups with around 100,000 fighters. The nonprofit has also created visualizations depicting the hierarchy of different opposition groups. The center hopes to use the data to identify central players in the conflict and verify Syrian military leaders’ claims about their power and influence.
This week, the House Energy and Commerce Committee voted to advance a bill to the House floor requiring the Department of Health and Human Services (HHS) to publish recommendations for developing clinical data registries. The bill directs HHS to assess how clinical data registries can be used to evaluate the impact of different care models and ultimately improve patient health outcomes. It would also require HHS to consult with various condition-specific medical groups to determine how best to use clinical data registries to inform physicians of new treatments and best practices.
Google is enrolling volunteers in a research project to collect genetic and molecular data in hopes of establishing a granular picture of how a healthy human body should function. The project, which will start with 175 subjects and hopes eventually to expand to thousands more, will use the company’s data mining capacity to find biomarkers that researchers could eventually use to create tests for early disease detection. Once Google de-identifies the data, it plans to release the information to other researchers.
In recent years, banks have increasingly turned to data analytics to combat money laundering and other financial crime, but many have found that even this is not enough to stop ever more sophisticated criminals. In response, some banks have deployed advanced predictive models and machine learning to flag potential criminal activity where information is sparse. One type of model banks are using is called a Bayesian Belief Network, which adaptively refines its predictions of risk based not only on potentially fraudulent activity but also on contextual factors, such as account type and customer demographics.
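As a rough illustration of the idea behind such a model, the Python sketch below starts from a risk estimate that depends on a contextual factor (account type) and refines it with Bayes' rule each time a transaction is observed. It is a minimal two-node simplification, not any bank's actual system: the account types, the probabilities, and the function names are all hypothetical, and it assumes observations are independent given whether the account is illicit.

```python
# Illustrative sketch only: every probability below is a made-up assumption,
# not a figure from any bank or from the article.

# Contextual factor: prior probability that an account is involved in
# illicit activity, conditioned on a hypothetical account type.
PRIOR_BY_ACCOUNT_TYPE = {"personal": 0.01, "shell_company": 0.10}

# Likelihood of observing an unusual transfer under each hypothesis.
P_UNUSUAL_GIVEN_ILLICIT = 0.70
P_UNUSUAL_GIVEN_LEGIT = 0.05


def update_risk(prior: float, unusual: bool) -> float:
    """Return P(illicit | observation) from P(illicit) via Bayes' rule."""
    like_illicit = P_UNUSUAL_GIVEN_ILLICIT if unusual else 1 - P_UNUSUAL_GIVEN_ILLICIT
    like_legit = P_UNUSUAL_GIVEN_LEGIT if unusual else 1 - P_UNUSUAL_GIVEN_LEGIT
    evidence = like_illicit * prior + like_legit * (1 - prior)
    return like_illicit * prior / evidence


def score_account(account_type: str, observations: list[bool]) -> float:
    """Start from the context-dependent prior and refine it per observation."""
    risk = PRIOR_BY_ACCOUNT_TYPE[account_type]
    for unusual in observations:
        risk = update_risk(risk, unusual)
    return risk


if __name__ == "__main__":
    for account_type in PRIOR_BY_ACCOUNT_TYPE:
        risk = score_account(account_type, [True, False, True])
        print(account_type, round(risk, 3))
```

With these assumed numbers, the same sequence of transactions yields a higher risk score for the account type with the higher prior, which is the sense in which context shapes the prediction. A production network would have many more nodes and would learn its probabilities from data.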
The Graduate Center at the City University of New York (CUNY) is launching a Center for Digital Scholarship and Data Visualization in Midtown Manhattan. The Graduate Center has been at the forefront of image data visualization for the past several years, with projects examining trends in selfie photos across the world and visual differences in Instagram photos in various major cities. CUNY plans to partner with other local institutions, such as the Museum of Modern Art and the New York Public Library, to access new data streams.
The National Football League will begin tagging players’ equipment with RFID chips this season to give coaches and fans real-time statistics on players’ position, speed, and distance travelled. The data will be captured from all 32 teams and loaded into a league-wide database. The league has not announced what sorts of analytics it hopes to conduct using the data beyond simply making it available.
WIFIRE, a collaborative project from the University of California San Diego and the University of Maryland, uses remote sensing to forecast how quickly wildfires might spread. The project, which combines simulation, prediction, and visualization, mines satellite imagery using machine learning algorithms to identify the most dangerous wildfires before they get worse. WIFIRE’s creators hope that, once fully developed, it will help firefighters make better decisions in real time.
A Kickstarter campaign to build a connected sleep tracking kit, known as Sense, has raised $1.5 million as of August 1. The kit comes with sensors to monitor bedroom conditions, a pillow clip, and an accompanying mobile phone app that aims to wake the user up at the optimal point in their sleep cycle. Sense was created by Hello, a San Francisco-based company launched by 22-year-old Thiel Fellowship recipient James Proud.