10 Bits: The Data News Hot List
This week’s list of data news highlights covers July 19-25 and includes articles about the City of Chicago’s efforts to use data to combat its rat problem and a facial recognition algorithm that its developers say outperforms humans for the first time.
The City of Chicago is using analytics to combat its rat problem, having identified 31 variables that can predict where rats will gather. The city hopes to create models that will forecast rat activity, allowing cleanup crews to anticipate problems before residents begin issuing complaints. The city’s chief information officer said she hopes the system will be useful in neighborhoods with high crime rates, where residents tend to underreport rodent problems.
Interactive job board company Glassdoor is adding new tools to its national jobs database to map where different kinds of jobs are most prevalent and recommend areas with the greatest opportunity for those willing to move. Glassdoor’s updated web interface will also allow users to simultaneously search for jobs for their spouses and display areas with the greatest overlap in opportunity between the two job types. Users will also be able to look up the work trajectories of other people in a given field to see what other jobs or industries similar users have transitioned into. The new features are part of a multi-stakeholder effort led by Vice President Joe Biden to curb unemployment in the United States.
Simpsons World, a planned online archive of all 552 episodes of the venerable TV show, will allow fans to explore the series as a database, sorting episodes by themes, topics, locations, and characters, or finding all clips that contain a given set of characteristics. For example, users could find all the scenes from the show’s first 10 seasons in which Homer Simpson and his friend Apu had conversations in places other than the Kwik-E-Mart, Apu’s workplace. The archive will also offer a graph of every episode’s air date and popularity.
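The kind of query described above can be illustrated with a small filter over scene records. This is only a sketch: the `Scene` fields, the `find_scenes` helper, and the sample rows are hypothetical and do not reflect how Simpsons World actually stores or exposes its data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scene:
    # Hypothetical record layout, invented for illustration.
    season: int
    episode: int
    location: str
    characters: frozenset


def find_scenes(scenes, required, max_season, excluded_location):
    """Return scenes up to max_season that include every required
    character and do not take place at excluded_location."""
    required = frozenset(required)
    return [
        s for s in scenes
        if s.season <= max_season
        and required <= s.characters          # all required characters present
        and s.location != excluded_location
    ]


# Made-up sample rows, not real episode data.
SAMPLE_SCENES = [
    Scene(3, 1, "Kwik-E-Mart", frozenset({"Homer", "Apu"})),
    Scene(5, 2, "Simpson house", frozenset({"Homer", "Apu", "Marge"})),
    Scene(12, 4, "Moe's Tavern", frozenset({"Homer", "Apu"})),
    Scene(7, 9, "Moe's Tavern", frozenset({"Homer", "Moe"})),
]

matches = find_scenes(SAMPLE_SCENES, {"Homer", "Apu"}, 10, "Kwik-E-Mart")
```

With the sample rows above, only the season 5 scene satisfies all three conditions.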
Researchers at the Chinese University of Hong Kong have developed a facial recognition algorithm they say can outperform humans for the first time. Humans can correctly determine whether two images show the same person about 97.5 percent of the time, while the algorithm achieves an accuracy of about 98.5 percent. The algorithm works by identifying where the eyes, nose, and corners of the mouth are positioned in each photo and using machine learning to determine whether the features’ positions in the two photos are similar enough to be called a match.
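The general idea of comparing landmark positions can be sketched in a few lines. This is not the researchers’ method: it swaps the machine learning step for a fixed distance threshold, and the function names, the five-point landmark layout, and the threshold value are all assumptions made for illustration.

```python
import math


def normalize(landmarks):
    """Center the points on their centroid and scale them to unit
    root-mean-square size, so position and image scale do not matter."""
    n = len(landmarks)
    cx = sum(x for x, _ in landmarks) / n
    cy = sum(y for _, y in landmarks) / n
    centered = [(x - cx, y - cy) for x, y in landmarks]
    scale = math.sqrt(sum(x * x + y * y for x, y in centered) / n)
    return [(x / scale, y / scale) for x, y in centered]


def landmark_distance(a, b):
    """Mean Euclidean distance between corresponding normalized points."""
    na, nb = normalize(a), normalize(b)
    return sum(math.dist(p, q) for p, q in zip(na, nb)) / len(na)


def same_person(a, b, threshold=0.1):
    # Hypothetical threshold; a real system would learn the decision
    # rule from labeled pairs of photos rather than hard-code it.
    return landmark_distance(a, b) < threshold
```

Each face here is a list of five (x, y) points in a fixed order: left eye, right eye, nose, left mouth corner, right mouth corner. Two photos of the same layout at different sizes or offsets normalize to the same shape, while a differently proportioned face yields a larger distance.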
Yelp has released Yelp Trends, a data visualization tool that tracks how often a given word or phrase shows up in the site’s 57 million reviews, broken down by city. For example, while New York was the first city to see a trend for the word “cronut,” Los Angeles was the first to embrace the “food truck” craze. The data goes back to 2006 and the site comes with a number of example trends to spur users’ imagination.
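A phrase-frequency breakdown of this kind can be approximated with a simple counter. The sketch below is not Yelp’s implementation: the review record fields (`city`, `year`, `text`), both helper functions, and the sample reviews are made up for illustration.

```python
from collections import Counter


def phrase_counts(reviews, phrase):
    """Count reviews mentioning phrase, keyed by (city, year)."""
    needle = phrase.lower()
    counts = Counter()
    for review in reviews:
        if needle in review["text"].lower():   # case-insensitive substring match
            counts[(review["city"], review["year"])] += 1
    return counts


def first_city(counts):
    """Return the city with the earliest mention, or None if there are none.
    Ties on year are broken alphabetically by city name."""
    if not counts:
        return None
    return min((year, city) for (city, year) in counts)[1]


# Made-up sample reviews, not real Yelp data.
SAMPLE_REVIEWS = [
    {"city": "New York", "year": 2013, "text": "Waited an hour for a cronut."},
    {"city": "New York", "year": 2013, "text": "The Cronut lives up to the hype."},
    {"city": "Los Angeles", "year": 2014, "text": "Finally tried a cronut here."},
    {"city": "Los Angeles", "year": 2013, "text": "Great tacos from a food truck."},
]

cronut = phrase_counts(SAMPLE_REVIEWS, "cronut")
```

Raw counts favor cities with more reviews overall, so a fairer comparison would probably divide each count by the total number of reviews for that city and year.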
A variety of data-driven services have cropped up to help prospective homebuyers narrow down their options to a few locations that fit their specifications. These websites pull data from the U.S. Census, the Department of Education, and other sources to recommend neighborhoods that have a given set of characteristics. The chief executive of one real estate data services company notes that these services can be less biased than traditional real estate firms, which may be incentivized to make recommendations for less desirable neighborhoods.
San Francisco has released a three-year strategic plan for its open data, with goals ranging from improving data quality to giving individuals access to their own data. The city also plans to improve data sharing between departments through a shared set of rules that classify data as public, confidential, or something in between. The plan follows the city’s appointment of its first chief data officer this spring.
Home improvement stores such as Home Depot and Lowe’s have begun selling Internet of Things products to enable do-it-yourself customers to build their own smart homes. The devices, which include connected window blinds, water heaters, light bulbs, and other products, are connected through proprietary apps, such as Home Depot’s Wink and Lowe’s Iris, but are also available a la carte, to ensure that customers can thoroughly customize their connected homes. For example, customers can use the home automation technology as part of a home security system to receive alerts when there might be an intruder in the house.
Music data startup Senzari released a service called MusicGraph.ai this week, which gives users access to machine learning and analytics technologies for analyzing the company’s billion-item database of music information. The product, which will be marketed to record labels and streaming music services such as Pandora, integrates social data from Facebook and Last.fm to help customers discover which songs appeal to which demographics or how lyrical content can influence a song’s chart performance. The company hopes its cloud-based service will make large-scale music analytics cheaper and more accessible to smaller outfits.
This week, Washington, D.C. Mayor Vincent Gray issued an executive order on open data and announced that the city will soon launch an online portal for submitting Freedom of Information Act requests. The portal will be powered by FOIAXpress, the same request processing software used by the General Services Administration and the departments of Justice and Homeland Security, and will include 50 city agencies when it launches. The executive order requires each agency to proactively make its data available in open, machine-readable formats.
Photo: Flickr user Peter Kemmer