On May 19, 2016, hundreds of members of Congress booed, yelled, and chanted “shame” on the House floor after a handful of representatives changed their votes from “yea” to “nay” once the voting time had expired on an amendment barring federal contractors from discriminating against LGBT workers. The switches caused the amendment to fail despite initially achieving the 213 “yea” votes it needed to pass. While lawmakers are allowed to change their votes, these members did not follow the normal procedures. As a result, even though all votes were recorded with an electronic voting system, there is no official record of which members of Congress changed their votes that day, causing many in Congress to lament the lack of accountability in the voting process. Speaking at the Center for Data Innovation’s recent event, “What Happens When Laws Become Open Data?,” Steve Dwyer, senior policy advisor and digital director for House Democratic Whip Steny Hoyer (D-MD), said this controversy could easily have been avoided simply by publishing live vote counts as open data. At the event, Dwyer and a panel of legislative data experts explored the economic and social impact of open legislative data, and identified a number of ways that it could help address some of the American public’s biggest frustrations with Congress by improving accountability, introducing transparency into the legislative process, and making Congress more productive.
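To make Dwyer's point concrete, here is a minimal sketch of how published live vote data could expose vote changes. It assumes a purely hypothetical feed that provides snapshots of each member's recorded vote at different times. No such official feed is described at the event, and the member names, the snapshot format, and the `changed_votes` helper are all invented for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot format: member name -> recorded vote ("yea", "nay", ...).
Snapshot = Dict[str, str]


def changed_votes(before: Snapshot, after: Snapshot) -> List[Tuple[str, str, str]]:
    """Return (member, old_vote, new_vote) for every member whose vote differs."""
    changes = []
    # Look at every member that appears in either snapshot.
    for member in sorted(before.keys() | after.keys()):
        old = before.get(member, "not voting")
        new = after.get(member, "not voting")
        if old != new:
            changes.append((member, old, new))
    return changes


# Invented example data, not real vote records.
at_expiry = {"Member A": "yea", "Member B": "yea", "Member C": "nay"}
final = {"Member A": "nay", "Member B": "yea", "Member C": "nay"}

for member, old, new in changed_votes(at_expiry, final):
    print(f"{member}: {old} -> {new}")
```

With snapshots like these available to the public, anyone could compare the tally at the moment voting time expired against the final tally and see which members switched.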
Legislative data is information about legislative activities, including bills and their status, lawmaker votes, committee meetings, public communications by members of Congress, lobbying information, and the products of legislative support agencies such as the Congressional Research Service. This data is often not available to the public in a usable form (i.e., the data is delayed, not provided from an official source, or not in a machine-readable format), if it is available at all. Fortunately, in recent years a number of actions by ambitious civic hackers and Congress have substantially increased the amount of timely and machine-readable legislative data freely available online. For example, as Daniel Schuman, policy director of the nonprofit Demand Progress, pointed out, thanks to the pioneering efforts of civic hacker Josh Tauberer, members of the public have been able to track legislative developments through GovTrack, an online information portal about Congressional activities. Because of his and other similar efforts, the Library of Congress will retire the dated congressional information portal THOMAS on July 5, 2016, and formally transition to Congress.gov, which draws heavy inspiration from GovTrack.us to make legislative data more accessible to the public.
Despite this and other promising developments, such as the Government Publishing Office making bill status information available in machine-readable formats in February 2016 and recent House and Senate bills to make Congressional Research Service reports publicly available, there are still many opportunities to improve open legislative data practices. Jessica Seale, digital director for Senator John Cornyn (R-TX), described how the Senate procedure for publishing pending amendments often entails manually scanning the text, sometimes with the bill name obscured by a sticky note, and publishing the scans online as PDFs. Because these amendments are published in non-machine-readable formats, it can be exceedingly difficult for the public and legislative staff alike to access and understand this data. Tim Hwang, chief executive officer of government data analytics firm FiscalNote, noted that advancements in computer vision technologies and scraping and extraction algorithms have enabled workarounds for this challenge, but developing these technologies requires significant continued investment. Will Matthews, head of global data at Bloomberg Government, agreed, pointing out that although this approach allows Bloomberg to develop powerful insights into the legislative process and track the influence outside parties have on policy, it only benefits organizations that can afford to pay for these services and not the public as a whole.
In addition to promoting accountability and transparency, better open legislative data policies could help Congress fight its reputation for being notoriously unproductive by allowing congressional staff to spend less time tracking down, managing, and analyzing bad and difficult-to-use data, and instead focus their efforts on more important tasks. Dwyer described how, when he helped develop an intranet to facilitate data sharing among thousands of House Democratic staff 10 years ago, the Library of Congress was initially incapable of providing them with legislative data, causing them instead to turn to less timely and less reliable data provided by civic hackers and scraping algorithms. Seale elaborated, describing how the intranets for both parties in the House and Senate still rely primarily on scraping algorithms, and how even basic information about upcoming votes is not made available to members of Congress in machine-readable formats, which makes tools that pull this data problematic. Though these intranets are not public, these data usability challenges result from the same data management practices that affect all legislative data, and improving these practices to make legislative data more open would benefit both public and private applications of this data. For example, after the Congress.gov transition, Dwyer says the House Democratic intranet will be able to pull this data in machine-readable formats directly from the official source at the Library of Congress.
Going forward, panelists agreed that Congress, as well as state governments, should strive to make legislative data more accessible and usable, particularly by improving the early stages of the legislative data lifecycle. For example, Hwang and Seale noted that in the Senate as well as in some state legislatures, lawmakers can submit handwritten notes and bills rather than use machine-readable electronic formats. Dwyer pointed out that hearing transcripts often take months to be made available because of how long it takes to transcribe these meetings, despite the increasing availability of transcription software that could publish machine-readable transcripts of hearings in near real time. Additionally, Congress should focus on identifying new sources of legislative data that are not readily available to the public in useful formats, but have no good reason not to be, such as Congressional Research Service reports.