3 June 2013

How accessible are online legislative data archives to political scientists?

Despite the technical advances in Internet archival systems, websites commonly lack comprehensive and systematic data collection and retrieval processes. David Leal, University of Texas at Austin, discusses these issues in his recently published study in Policy and Internet, “Assessing the Online Legislative Resources of the American States,” co-authored with Taofang Huang, B.J. Lee, and Jill Strube. The study addresses the potential and problems that can be encountered when using U.S. state online legislative resources for political science research, and presents a tool for evaluating the feasibility of research projects.

A view inside the House chamber of the Utah State Legislature. Image by deltaMike.

Public demands for transparency in the political process have long been a central feature of American democracy, and recent technological improvements have considerably facilitated the ability of state governments to respond to such public pressures. With online legislative archives, state legislatures can make available a large number of public documents. In addition to meeting the demands of interest groups, activists, and the public at large, these websites enable researchers to conduct single-state studies, cross-state comparisons, and longitudinal analysis.

While online legislative archives are, in theory, rich sources of information that save researchers valuable time as they gather data across the states, in practice, government agencies are rarely completely transparent, often do not provide clear instructions for accessing the information they store, seldom use standardized norms, and can overlook user needs. These obstacles to state politics research are longstanding: Malcolm Jewell noted almost three decades ago the need for “a much more comprehensive and systematic collection and analysis of comparative state political data.” While the growing availability of online legislative resources helps to address the first problem, collection, the limitations of search and retrieval functions remind us that the second, analysis, remains a challenge.

The fifty state legislative websites are quite different; few of them are intuitive or adequately transparent, and there is no standardized or systematic process to retrieve data. For many states, it is not possible to identify issue-specific bills that are introduced and/or passed during a specific period of time, let alone the sponsors or committees, without reading the full text of each bill. For researchers who are interested in certain time periods, policy areas, committees, or sponsors, the inability to set filters or immediately see relevant results limits their ability to efficiently collect data.

Frustrated by the obstacles we faced in undertaking a study of state-level immigration legislation before and after September 11, 2001, we decided instead to evaluate each state legislative website (a “state of the states” analysis) to help scholars who need to understand the limitations of the online legislative resources they may want to use. We evaluated three main dimensions on an eleven-point scale: (1) the number of searchable years; (2) the keyword search filters; and (3) the information available on the immediate results pages. The number of searchable sessions is crucial for researchers interested in longitudinal studies, before/after comparisons, other time-related analyses, and the activity of specific legislators across multiple years. The “search interface” helps researchers to define, filter, and narrow the scope of the bills, a particularly important feature when keywords can generate hundreds of possibilities. The “results interface” allows researchers to determine if a given bill is relevant to a research project.
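For readers who want to apply a similar index to their own data, the structure of such a score can be sketched in a few lines of Python. This is only an illustration: the post gives the three dimensions and the 11-point maximum, but not how the points are divided among the dimensions, so the caps below (3, 4, and 4) are assumptions, as are the names SiteScore and mean_total and the placeholder sites used in the example.

```python
from dataclasses import dataclass

# ASSUMPTION: the source states only that the three dimensions sum to a
# maximum of 11 points. The split below (3 + 4 + 4) is hypothetical.
MAX_POINTS = {
    "searchable_years": 3,  # dimension 1: number of searchable years
    "search_filters": 4,    # dimension 2: keyword search filters
    "results_info": 4,      # dimension 3: information on the results page
}


@dataclass(frozen=True)
class SiteScore:
    """Points awarded to one legislative website on each dimension."""
    site: str
    searchable_years: int
    search_filters: int
    results_info: int

    def __post_init__(self):
        # Reject scores outside the (hypothetical) cap for each dimension.
        for name, cap in MAX_POINTS.items():
            value = getattr(self, name)
            if not 0 <= value <= cap:
                raise ValueError(f"{name} must be between 0 and {cap}, got {value}")

    @property
    def total(self) -> int:
        """Composite score across the three dimensions (0 to 11)."""
        return self.searchable_years + self.search_filters + self.results_info


def mean_total(scores):
    """Average composite score across a collection of sites."""
    if not scores:
        raise ValueError("no scores supplied")
    return sum(s.total for s in scores) / len(scores)


# Placeholder sites with invented points, not data from the study.
example = [
    SiteScore("Example A", searchable_years=1, search_filters=2, results_info=1),
    SiteScore("Example B", searchable_years=3, search_filters=4, results_info=3),
]
print(mean_total(example))
```

Keeping the three dimensions as separate fields, rather than storing only the total, makes it possible to see which dimension drives a change in a site's score from one year to the next.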

Our paper builds on the work of other scholars and organizations interested in state policy. To help begin a centralized space for data collection, Kevin Smith and Scott Granberg-Rademacker publicly invited “researchers to submit descriptions of data sources that were likely to be of interest to state politics and policy scholars,” calling for “centralized, comprehensive, and reliable datasets” that are easy to download and manipulate. In this spirit, Jason Sorens, Fait Muedini, and William Ruger introduced a free database that offered a comprehensive set of variables involving over 170 public policies at the state and local levels in order to “reduce reduplication of scholarly effort.” The National Conference of State Legislatures (NCSL) provides links to state legislatures, bill lists, constitutions, reports, and statutes for all fifty states. The State Legislative History Research Guides compiled by the Indiana University School of Law also include links to legislative and historical resources for the states, such as the Legislative Reference Library of Texas. However, to our knowledge, no existing resource assesses usability across all state websites.

So, what did we find during our assessment of the state websites? In general, we observed that the archival records as well as the search and results functions leave considerable room for improvement. The maximum possible score was 11 in each year, and the average was 3.87 in 2008 and 4.25 in 2010. For researchers interested in certain time periods, policy areas, committees, or sponsors, the inability to set filters, immediately see relevant results, and access past legislative sessions limits their ability to complete projects in a timely manner (or at all). We also found a great deal of variation in site features, content, and navigation. Greater standardization would improve access to information about state policymaking by researchers and the general public—although some legislators may well see benefits to opacity.

While we noted some progress over the study period, not all change was positive. By 2010, two states had scored 10 points (no state scored the full 11), fewer states had very low scores, and the average score rose slightly from 3.87 to 4.25 (out of 11). This suggests slow but steady improvement, and the provision of a baseline of support for researchers. However, a quarter of the states showed score drops over the study period, for the most part reflecting the adoption of “Powered by Google” search tools that used only keywords, and some in a very limited manner. If the latter becomes a trend, we could see websites becoming less, not more, user friendly in the future.

In addition, our index may serve as a proxy variable for state government transparency. While the website scores were not statistically associated with Robert Erikson, Gerald Wright, and John McIver’s measure of state ideology, there may nevertheless be promise for future research along these lines; additional transparency determinants worth testing include legislative professionalism and social capital. Moving forward, the states might consider creating a working group to share ideas and best practices, perhaps through an organization like the National Conference of State Legislatures, rather than the national government, as some states might resist leadership from D.C. on federalist grounds.

Helen Margetts (2009) has noted that “The Internet has the capacity to provide both too much (which poses challenges to analysis) and too little data (which requires innovation to fill the gaps).” It is notable, and sometimes frustrating, that state legislative websites illustrate both dynamics. As datasets come online at an increasing rate, it is also easy to forget that websites can vary in terms of user friendliness, hierarchical structure, search terms and functions, terminology, and navigability, causing unanticipated methodological and data capture problems (in other words, headaches) for scholars working in this area.


Read the full paper: Taofang Huang, David Leal, B.J. Lee, and Jill Strube (2012) Assessing the Online Legislative Resources of the American States. Policy and Internet 4 (3-4).



Note: This article gives the views of the authors, and not the position of the Policy and Internet Blog, nor of the Oxford Internet Institute.



One Response to How accessible are online legislative data archives to political scientists?

  1. Very interesting exercise. There’s a small network of people who work hard to scrape legislative data (here’s my own attempt at scraping the French lower chamber, written for this conference). There should be a way for us all to meet up and share what we have done and plan to do on that front.