10 Bits: the Data News Hotlist
This week’s list of data news highlights covers May 20-26, 2017, and includes articles about Airbnb’s new approach to overcome the data science skills gap and a new bill to improve how the U.S. federal government manages geospatial data.
Hedge funds that use algorithms to trade stocks, known as quantitative hedge funds or “quants,” traded 27 percent of all stocks in the United States in 2016, which is a substantial increase from the 14 percent they traded in 2013. Over the past five years, quants have gained an average of 5.1 percent per year, compared to the 4.3 percent average increase of traditional hedge funds over the same period.
Airbnb has launched an internal employee training program called Data University to help overcome a shortage of trained data scientists and make its workforce more data literate. Data University offers courses ranging from basic data science subjects, such as introduction to statistics, to advanced classes like machine learning, to encourage its entire workforce, not just engineers, to participate. Since Airbnb launched Data University in the second half of 2016, the number of weekly users of its internal data science tools has increased by 15 percent.
DeepMind’s Go-playing AI system AlphaGo has beaten Ke Jie, the world’s highest-ranked Go player, by winning the first two games in a three-game series. Given the complexity of Go, computer systems have often failed to beat highly ranked human players, yet over the past year, AlphaGo has consistently outperformed the world’s best Go players.
Researchers at the University of California, Berkeley, have developed a method for teaching a machine learning algorithm to work even when there is not a strong feedback signal, similar to how humans analyze things out of curiosity rather than for a practical purpose. Reinforcement learning—a type of machine learning—relies on providing systems with positive feedback for achieving certain goals, which gradually allows these algorithms to figure out how to solve problems, but is only effective with large amounts of training and strong feedback signals.
The Barcelona Supercomputing Center has partnered with Nvidia to launch a project called Alya Red, which will develop a real-time computational model of the human heart. Developing high-resolution models of the heart is very resource-intensive, as the electrical impulses that control the heart’s function take a large amount of computing power and time to simulate. Researchers working on Alya Red will use rendering technology from Nvidia to develop models that can be adjusted in real time, which could improve medical procedures, such as implanting a pacemaker.
Researchers at the University of California, Berkeley, have developed a system for training a robotic arm to grasp objects by using 3D modeling and machine learning. Rather than have the robotic arm attempt to grasp objects and learn from its mistakes, the researchers trained a neural network on a database of 3D models of shapes to have it determine how to grasp different kinds of objects without real-world practice. In real-world tests, when the system was 50 percent confident it could grasp an object, it was successful 98 percent of the time. And when it was not confident, it would poke the object to figure out how to grasp it, making it effective 99 percent of the time.
Visual search company Blippar has developed an AI system that can analyze live video to identify the make and model of any car made by a U.S. car company since 2000. The system allows users to take a smartphone video of a car moving less than 15 miles per hour to get information about the car, including reviews and a 360-degree model of the car, in near-real time with 97.7 percent accuracy.
A bipartisan group of U.S. senators has introduced the Geospatial Data Act to coordinate the federal government’s efforts to collect and manage geospatial data. Many federal agencies collect geospatial data, but despite efforts to promote interagency data sharing, agencies often collect data redundantly or pay to procure data another agency already owns. The Geospatial Data Act would codify requirements from the Office of Management and Budget for agencies to adopt common standards for geospatial data to promote sharing and reduce redundant collection, and direct the Federal Geographic Data Committee to promote cost-effective data collection and exchange.
U.S. Citizenship and Immigration Services (USCIS) worked with Verizon and AI firm Next IT to develop a chatbot named Emma capable of speaking both English and Spanish to help guide users to useful information on the USCIS website. When people access the USCIS website, Emma will start live chatting with them to ask about their needs and guide them to relevant information.
Image: Buster Benson.