The Internet, Policy & Politics Conferences

Oxford Internet Institute, University of Oxford

Hugo van Haastert: Big data ethics

Hugo van Haastert

The Netherlands Vehicle Authority (RDW) is exploring the use of big data methods to help carry out its tasks as a government agency responsible for vehicle safety. This thesis studies that use from a legal perspective: what is the relationship between big data methods, specifically machine learning, and the principles of proper government, which are part of Dutch administrative law?

Big data has two central components. The first concerns the nature of the data and the second concerns the analysis of that data. This thesis focuses on the latter: machine learning as a big data method for analysing data. Machine learning is a form of artificial intelligence: computer modeling of learning processes. The Centre for Internet and Human Rights (CIHR, 2015, p. 3) has raised concerns that complex algorithms like machine learning are ‘often practically inscrutable to outside observers.’ Burrell (2016) and Pasquale (2014) have also pointed to the opaque, black box nature of machine learning algorithms. But there are also potential advantages to big data. Big data can help develop self-driving cars, diagnose diseases, and predict and detect fraud while reducing costs.

The RDW is exploring the use of machine learning in its processes of Periodical Technical Inspection and Import. This thesis looks at these possible applications of machine learning from the perspective of the principles of proper government. Two of these principles are singled out: motivation and diligence. The government must motivate its decisions, that is, state the reasons for them, and it must make those decisions with diligence.

This thesis finds that the relationship between the big data applications being developed at the RDW and the principles of motivation and diligence is mostly negative and problematic. Closer examination of the two case studies reveals specific nuances. The two cases differ in the type of government decision, the extent to which decisions are made by humans and the extent to which the big data applications fall within the established definition of big data. But the lack of transparency in how machine learning analyses data and arrives at its calculations is difficult to reconcile with the principles of proper government.

Motivation in particular seems to become difficult: without knowing how an analysis method looks at data and the patterns within it, it is hard for a government to base decisions on the output of that method and still remain transparent. The business case for using big data is overwhelming. The promise of a data-driven society with greater efficiency and efficacy is most appealing in a time of constant pressure on government to cut spending while improving public service. At the same time, there is also pressure to conform to Dutch law.

The choice the RDW has to make is a difficult one. Does it choose to improve the efficiency of its processes and reduce costs? Or does it choose to place higher value on transparency, accountability and legal certainty? Fundamentally, in the exploration of IT innovation, there must be awareness that using IT and big data has ethical consequences.

References:

CIHR (2015) Ethics of Algorithms, https://www.gccs2015.com/sites/default/files/documents/Ethics_Algorithms-final%20doc.pdf

Burrell, J. (2016) How the Machine 'Thinks': Understanding Opacity in Machine Learning Algorithms, Big Data & Society

Pasquale, F. (2014) The Black Box Society

PS: This abstract refers to an MSc thesis on this subject.
