This week’s list of data news highlights covers February 15-21 and includes articles on Wikipedia’s vandalism detection bots and an IBM initiative to deploy predictive analytics in the mining industry.
1. Wikipedia’s Anti-Troll Machines
Wikipedia uses various automated systems to detect and remove vandalism on its pages. The systems, which have grown from simple, rule-based text parsers, now include sophisticated machine learning algorithms trained on tens of thousands of past edits. Wikipedia editors also use human-aided software to detect more subtle vandalism, with programs that automatically organize recent edits but rely on a real-life editor to make the final call.
2. England Postpones Major Health Data Sharing Initiative
England’s National Health Service (NHS) has postponed the launch of its information sharing program, care.data, for six months, citing complaints from the medical community and the public that the NHS has not provided enough information about the program, its goals, and its benefits. The program, which could help government officials track and improve care quality in hospitals, optimize treatment according to patient medical histories and fuel medical research, would have begun collecting medical records this April.
3. Study Maps and Categorizes Twitter Communities
A study released this week from Pew Research attempts to map and categorize large multi-user interactions on Twitter. The study’s authors found six fundamental types of Twitter community dynamics, from the polarized crowds that arise in political discussions to the hub-and-spoke structures that arise when many otherwise unconnected individuals follow a single news organization. The researchers also offer a number of provocative questions arising from their findings, including whether social media itself has helped make political discourse more polarized.
4. NBA to Test Wearable Sensors in Development League Games
The NBA will soon begin testing wearable devices that track player heart rate, position and other data during games in its Development League. Many NBA teams already use the devices in practice, but they are not yet allowed in NBA games. The league will run the tests in Development League contests with an eye toward eventually changing the rules in the NBA itself.
5. IBM Deploys Predictive Analytics in the Mining Industry
This week, IBM announced a new initiative to deploy predictive analytics in the mining industry, working with equipment companies to forecast failures and recommend repairs before problems escalate and become more expensive. The company estimated that it could save a $30 billion mining company several billion dollars per year through improved productivity and other cost reductions.
6. Artificial Intelligence as a Service
Expect Labs, a content recommendation software company, is offering its core artificial intelligence capabilities to the public through an application programming interface (API). The software includes speech recognition and text analytics capabilities to recognize topics and recommend news articles, search results, and other relevant content. The API, which will allow developers to access the company’s software to create their own products, operates on a monthly subscription basis and includes a free option.
7. Connecticut to Launch Open Data Portal
Connecticut Governor Dannel Malloy signed an executive order this week to create an open data portal for his state and begin coordinating state agencies around open data releases. Data.ct.gov, which will be implemented by data platform provider Socrata, will serve as the central repository for the state’s public data sets and will be managed by a newly appointed chief data officer.
8. Biometric “Car Keys” Will Identify Users by Heartbeat
Biometric recognition startup Bionym plans to release a wristband later this year that will let an individual unlock a computer or car with nothing but his or her own unique heartbeat. As with fingerprints, no two people have exactly the same heartbeat, so device locking mechanisms can be programmed to accept only specific users. The wristband’s creators hope that the technology can also be applied in other contexts, from personalized recommendations at restaurants to individualized environmental settings in a smart home.
9. California Bill to Restrict Educational Data Reuse
California State Senator Darrell Steinberg introduced a bill this week that would make it illegal for educational technology firms offering services for K-12 students to use the students’ personal information for any purpose other than the one that was originally intended. The bill also specifies a variety of circumstances, including instances when the service is no longer used for its original purpose, under which firms would have to delete a student’s data entirely.
10. Open Data Institute Expands
The UK-based Open Data Institute (ODI), a nonprofit that aims to expand open data use in the private sector, has announced five new international partner organizations that will serve as “nodes,” or hubs for coordinating local open data efforts. The new nodes, which are located in Osaka, Seoul, Sheffield, Philadelphia and Hawaii, join the 13 original international nodes announced in October 2013.
Photo: Flickr user NA.dir