algorithms – The Policy and Internet Blog

Could Counterfactuals Explain Algorithmic Decisions Without Opening the Black Box?

David Sutcliffe — Mon, 15 Jan 2018 10:37:21 +0000

The EU General Data Protection Regulation (GDPR) has sparked much discussion about the “right to explanation” for the algorithm-supported decisions made about us in our everyday lives. While there’s an obvious need for transparency in the automated decisions that are increasingly being made in areas like policing, education, healthcare and recruitment, explaining how these complex algorithmic decision-making systems arrive at any particular decision is a technically challenging problem—to put it mildly.

In their article “Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR” which is forthcoming in the Harvard Journal of Law & Technology, Sandra Wachter, Brent Mittelstadt, and Chris Russell present the concept of “unconditional counterfactual explanations” as a novel type of explanation of automated decisions that could address many of these challenges. Counterfactual explanations describe the minimum conditions that would have led to an alternative decision (e.g. a bank loan being approved), without the need to describe the full logic of the algorithm.

Relying on counterfactual explanations as a means to help us act rather than merely to understand could help us gauge the scope and impact of automated decisions in our lives. They might also help bridge the gap between the interests of data subjects and data controllers, which might otherwise be a barrier to a legally binding right to explanation.

We caught up with the authors to explore the role of algorithms in our everyday lives, and how a “right to explanation” for decisions might be achievable in practice:

Ed: There’s a lot of discussion about algorithmic “black boxes” — where decisions are made about us, using data and algorithms about which we (and perhaps the operator) have no direct understanding. How prevalent are these systems?

Sandra: Basically, every decision that can be made by a human can now be made by an algorithm. Which can be a good thing. Algorithms (when we talk about artificial intelligence) are very good at spotting patterns and correlations that even experienced humans might miss, for example in predicting disease. They are also very cost efficient—they don’t get tired, and they don’t need holidays. This could help to cut costs, for example in healthcare.

Algorithms are also certainly more consistent than humans in making decisions. We have the famous example of judges varying the severity of their judgements depending on whether or not they’ve had lunch. That wouldn’t happen with an algorithm. That’s not to say algorithms are always going to make better decisions: but they do make more consistent ones. If the decision is bad, it’ll be distributed equally, but still be bad. Of course, in a certain way humans are also black boxes—we don’t understand what humans do either. But you can at least try to understand an algorithm: it can’t lie, for example.

Brent: In principle, any sector involving human decision-making could be prone to decision-making by algorithms. In practice, we already see algorithmic systems either making automated decisions or producing recommendations for human decision-makers in online search, advertising, shopping, medicine, criminal justice, etc. The information you consume online, the products you are recommended when shopping, the friends and contacts you are encouraged to engage with, even assessments of your likelihood to commit a crime in the immediate and long-term future—all of these tasks can currently be affected by algorithmic decision-making.

Ed: I can see that algorithmic decision-making could be faster and better than human decisions in many situations. Are there downsides?

Sandra: Simple algorithms that follow a basic decision tree (with parameters decided by people) can be easily understood. But we’re now also using much more complex systems like neural nets that act in a very unpredictable way, and that’s the problem. The system is also starting to become autonomous, rather than being under the full control of the operator. You will see the output, but not necessarily why it got there. This also happens with humans, of course: I could be told by a recruiter that my failure to land a job had nothing to do with my gender (even if it did); an algorithm, however, would not intentionally lie. But of course the algorithm might be biased against me if it’s trained on biased data—thereby reproducing the biases of our world.

We have seen that the COMPAS algorithm used by US judges to calculate the probability of re-offending when making sentencing and parole decisions is a major source of discrimination. Data provenance is massively important, and probably one of the reasons why we have biased decisions. We don’t necessarily know where the data comes from, and whether it’s accurate, complete, biased, etc. We need to have lots of standards in place to ensure that the data set is unbiased. Only then can the algorithm produce nondiscriminatory results.

A more fundamental problem with predictions is that you might never know what would have happened—as you’re just dealing with probabilities; with correlations in a population, rather than with causalities. Another problem is that algorithms might produce correct decisions, but not necessarily fair ones. We’ve been wrestling with the concept of fairness for centuries, without consensus. But lack of fairness is certainly something the system won’t correct itself—that’s something that society must correct.

Brent: The biases and inequalities that exist in the real world and in real people can easily be transferred to algorithmic systems. Humans training learning systems can inadvertently or purposefully embed biases into the model, for example through labelling content as ‘offensive’ or ‘inoffensive’ based on personal taste. Once learned, these biases can spread at scale, exacerbating existing inequalities. Eliminating these biases can be very difficult, hence we currently see much research done on the measurement of fairness or detection of discrimination in algorithmic systems.

These systems can also be very difficult—if not impossible—to understand, for experts as well as the general public. We might traditionally expect to be able to question the reasoning of a human decision-maker, even if imperfectly, but the rationale of many complex algorithmic systems can be highly inaccessible to people affected by their decisions. These potential risks aren’t necessarily reasons to forego algorithmic decision-making altogether; rather, they can be seen as potential effects to be mitigated through other means (e.g. a loan programme weighted towards historically disadvantaged communities), or at least to be weighed against the potential benefits when choosing whether or not to adopt a system.

Ed: So it sounds like many algorithmic decisions could be too complex to “explain” to someone, even if a right to explanation became law. But you propose “counterfactual explanations” as an alternative— i.e. explaining to the subject what would have to change (e.g. about a job application) for a different decision to be arrived at. How does this simplify things?

Brent: So rather than trying to explain the entire rationale of a highly complex decision-making process, counterfactuals allow us to provide simple statements about what would have needed to be different about an individual’s situation to get a different, preferred outcome. You basically work from the outcome: you say “I am here; what is the minimum I need to do to get there?” By providing simple statements that are generally meaningful, and that reveal a small bit of the rationale of a decision, the individual has grounds to change their situation or contest the decision, regardless of their technical expertise. Understanding even a bit of how a decision is made is better than being told “sorry, you wouldn’t understand”—at least in terms of fostering trust in the system.

Sandra: And the nice thing about counterfactuals is that they work with highly complex systems, like neural nets. They don’t explain why something happened, but they explain what happened. And three things people might want to know are:

(1) What happened: why did I not get the loan (or get refused parole, etc.)?

(2) Information so I can contest the decision if I think it’s inaccurate or unfair.

(3) Even if the decision was accurate and fair, tell me what I can do to improve my chances in the future.

Machine learning and neural nets make use of so much information that individuals have really no oversight of what they’re processing, so it’s much easier to give someone an explanation of the key variables that affected the decision. With the counterfactual idea of a “close possible world” you give an indication of the minimal changes required to get what you actually want.

Ed: So would a series of counterfactuals (e.g. “over 18” “no prior convictions” “no debt”) essentially define a space within which a certain decision is likely to be reached? This decision space could presumably be graphed quite easily, to help people understand what factors will likely be important in reaching a decision?

Brent: This would only work for highly simplistic, linear models, which are not normally the type that confound human capacities for understanding. The complex systems that we refer to as ‘black boxes’ are highly dimensional and involve a multitude of (probabilistic) dependencies between variables that can’t be graphed simply. It may be the case that if I were aged between 35-40 with an income of £30,000, I would not get a loan. But, I could be told that if I had an income of £35,000, I would have gotten the loan. I may then assume that an income over £35,000 guarantees me a loan in the future. But, it may turn out that I would be refused a loan with an income above £40,000 because of a change in tax bracket. Non-linear relationships of this type can make it misleading to graph decision spaces. For simple linear models, such a graph may be a very good idea, but not for black box systems; they could, in fact, be highly misleading.

Chris: As Brent says, we’re concerned with understanding complicated algorithms that don’t just use hard cut-offs based on binary features. To use your example, maybe a little bit of debt is acceptable, but it would increase your risk of default slightly, so the amount of money you need to earn would go up. Or maybe certain convictions committed in the past also only increase your risk of defaulting slightly, and can be compensated for with higher salary. It’s not at all obvious how you could graph these complicated interdependencies over many variables together. This is why we picked on counterfactuals as a way to give people a direct and easy to understand path to move from the decision they got now, to a more favourable one at a later date.

Ed: But could a counterfactual approach just end up kicking the can down the road, if we know “how” a particular decision was reached, but not “why” the algorithm was weighted in such a way to produce that decision?

Brent: It depends what we mean by “why”. If this is “why” in the sense of, why was the system designed this way, to consider this type of data for this task, then we should be asking these questions while these systems are designed and deployed. Counterfactuals address decisions that have already been made, but still can reveal uncomfortable knowledge about a system’s design and functionality. So it can certainly inform “why” questions.

Sandra: Just to echo Brent, we don’t want to imply that asking the “why” is unimportant—I think it’s very important, and interpretability as a field has to be pursued, particularly if we’re using algorithms in highly sensitive areas. Even if we have the “what”, the “why” question is still necessary to ensure the safety of those systems.

Chris: And anyone who’s talked to a three-year old knows there is an endless stream of “Why” questions that can be asked. But already, counterfactuals provide a major step forward in answering why, compared to previous approaches that were concerned with providing approximate descriptions of how algorithms make decisions—but not the “why” or the external facts leading to that decision. I think when judging the strength of an explanation, you also have to look at questions like “How easy is this to understand?” and “How does this help the person I’m explaining things to?” For me, counterfactuals are a more immediately useful explanation, than something which explains where the weights came from. Even if you did know, what could you do with that information?

Ed: I guess the question of algorithmic decision making in society involves a hugely complex intersection of industry, research, and policy making? Are we control of things?

Sandra: Artificial intelligence (and the technology supporting it) is an area where many sectors are now trying to work together, including in the crucial areas of fairness, transparency and accountability of algorithmic decision-making. I feel at the moment we see a very multi-stakeholder approach, and I hope that continues in the future. We can see for example that industry is very concerned with it—the Partnership in AI is addressing these topics and trying to come up with a set of industry guidelines, recognising the responsibilities inherent in producing these systems. There are also lots of data scientists (eg at the OII and Turing Institute) working on these questions. Policy-makers around the world (e.g. UK, EU, US, China) preparing their countries for the AI future, so it’s on everybody’s mind at the moment. It’s an extremely important topic.

Law and ethics obviously has an important role to play. The opacity, unpredictability of AI and its potentially discriminatory nature, requires that we think about the legal and ethical implications very early on. That starts with educating the coding community, and ensuring diversity. At the same time, it’s important to have an interdisciplinary approach. At the moment we’re focusing a bit too much on the STEM subjects; there’s a lot of funding going to those areas (which makes sense, obviously), but the social sciences are currently a bit neglected despite the major role they play in recognising things like discrimination and bias, which you might not recognise from just looking at code.

Brent: Yes—and we’ll need much greater interaction and collaboration between these sectors to stay ‘in control’ of things, so to speak. Policy always has a tendency to lag behind technological developments; the challenge here is to stay close enough to the curve to prevent major issues from arising. The potential for algorithms to transform society is massive, so ensuring a quicker and more reflexive relationship between these sectors than normal is absolutely critical.

Read the full article: Sandra Wachter, Brent Mittelstadt, Chris Russell (2018) Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR. Harvard Journal of Law & Technology (Forthcoming).

This work was supported by The Alan Turing Institute under the EPSRC grant EP/N510129/1.

Sandra Wachter, Brent Mittelstadt and Chris Russell were talking to blog editor David Sutcliffe.

Of course social media is transforming politics. But it’s not to blame for Brexit and Trump

Helen Margetts — Mon, 09 Jan 2017 10:24:58 +0000

After Brexit and the election of Donald Trump, 2016 will be remembered as the year of cataclysmic democratic events on both sides of the Atlantic. Social media has been implicated in the wave of populism that led to both these developments.

Attention has focused on echo chambers, with many arguing that social media users exist in ideological filter bubbles, narrowly focused on their own preferences, prey to fake news and political bots, reinforcing polarization and leading voters to turn away from the mainstream. Mark Zuckerberg has responded with the strange claim that his company (built on $5 billion of advertising revenue) does not influence people’s decisions.

So what role did social media play in the political events of 2016?

Political turbulence and the new populism

There is no doubt that social media has brought change to politics. From the waves of protest and unrest in response to the 2008 financial crisis, to the Arab spring of 2011, there has been a generalized feeling that political mobilization is on the rise, and that social media had something to do with it.

Our book investigating the relationship between social media and collective action, Political Turbulence, focuses on how social media allows new, “tiny acts” of political participation (liking, tweeting, viewing, following, signing petitions and so on), which turn social movement theory around. Rather than identifying with issues, forming collective identity and then acting to support the interests of that identity – or voting for a political party that supports it – in a social media world, people act first, and think about it, or identify with others later, if at all.

These tiny acts of participation can scale up to large-scale mobilizations, such as demonstrations, protests or campaigns for policy change. But they almost always don’t. The overwhelming majority (99.99%) of petitions to the UK or US governments fail to get the 100,000 signatures required for a parliamentary debate (UK) or an official response (US).

The very few that succeed do so very quickly on a massive scale (petitions challenging the Brexit and Trump votes immediately shot above 4 million signatures, to become the largest petitions in history), but without the normal organizational or institutional trappings of a social or political movement, such as leaders or political parties – the reason why so many of the Arab Spring revolutions proved disappointing.

This explosive rise, non-normal distribution and lack of organization that characterizes contemporary politics can explain why many political developments of our time seem to come from nowhere. It can help to understand the shock waves of support that brought us the Italian Five Star Movement, Podemos in Spain, Jeremy Corbyn, Bernie Sanders, and most recently Brexit and Trump – all of which have campaigned against the “establishment” and challenged traditional political institutions to breaking point.

Each successive mobilization has made people believe that challengers from outside the mainstream are viable – and that is in part what has brought us unlikely results on both sides of the Atlantic. But it doesn’t explain everything.

We’ve had waves of populism before – long before social media (indeed many have made parallels between the politics of 2016 and that of the 1930s). While claims that social media feeds are the biggest threat to democracy, leading to the “disintegration of the general will” and “polarization that drives populism” abound, hard evidence is more difficult to find.

The myth of the echo chamber

The mechanism that is most often offered for this state of events is the existence of echo chambers or filter bubbles. The argument goes that first social media platforms feed people the news that is closest to their own ideological standpoint (estimated from their previous patterns of consumption) and second, that people create their own personalized information environments through their online behaviour, selecting friends and news sources that back up their world view.

Once in these ideological bubbles, people are prey to fake news and political bots that further reinforce their views. So, some argue, social media reinforces people’s current views and acts as a polarizing force on politics, meaning that “random exposure to content is gone from our diets of news and information”.

Really? Is exposure less random than before? Surely the most perfect echo chamber would be the one occupied by someone who only read the Daily Mail in the 1930s – with little possibility of other news – or someone who just watches Fox News? Can our new habitat on social media really be as closed off as these environments, when our digital networks are so very much larger and more heterogeneous than anything we’ve had before?

Research suggests not. A recent large-scale survey (of 50,000 news consumers in 26 countries) shows how those who do not use social media on average come across news from significantly fewer different online sources than those who do. Social media users, it found, receive an additional “boost” in the number of news sources they use each week, even if they are not actually trying to consume more news. These findings are reinforced by an analysis of Facebook data, where 8.8 billion posts, likes and comments were posted through the US election.

Recent research published in Science shows that algorithms play less of a role in exposure to attitude-challenging content than individuals’ own choices and that “on average more than 20% of an individual’s Facebook friends who report an ideological affiliation are from the opposing party”, meaning that social media exposes individuals to at least some ideologically cross-cutting viewpoints: “24% of the hard content shared by liberals’ friends is cross-cutting, compared to 35% for conservatives” (the equivalent figures would be 40% and 45% if random).

In fact, companies have no incentive to create hermetically sealed (as I have heard one commentator claim) echo chambers. Most of social media content is not about politics (sorry guys) – most of that £5 billion advertising revenue does not come from political organizations. So any incentives that companies have to create echo chambers – for the purposes of targeted advertising, for example – are most likely to relate to lifestyle choices or entertainment preferences, rather than political attitudes.

And where filter bubbles do exist they are constantly shifting and sliding – easily punctured by a trending cross-issue item (anybody looking at #Election2016 shortly before polling day would have seen a rich mix of views, while having little doubt about Trump’s impending victory).

And of course, even if political echo chambers were as efficient as some seem to think, there is little evidence that this is what actually shapes election results. After all, by definition echo chambers preach to the converted. It is the undecided people who (for example) the Leave and Trump campaigns needed to reach.

And from the research, it looks like they managed to do just that. A barrage of evidence suggests that such advertising was effective in the 2015 UK general election (where the Conservatives spent 10 times as much as Labour on Facebook advertising), in the EU referendum (where the Leave campaign also focused on paid Facebook ads) and in the presidential election, where Facebook advertising has been credited for Trump’s victory, while the Clinton campaign focused on TV ads. And of course, advanced advertising techniques might actually focus on those undecided voters from their conversations. This is not the bottom-up political mobilization that fired off support for Podemos or Bernie Sanders. It is massive top-down advertising dollars.

Ironically however, these huge top-down political advertising campaigns have some of the same characteristics as the bottom-up movements discussed above, particularly sustainability. Former New York Governor Mario Cuomo’s dictum that candidates “campaign in poetry and govern in prose” may need an update. Barack Obama’s innovative campaigns of online social networks, micro-donations and matching support were miraculous, but the extent to which he developed digital government or data-driven policy-making in office was disappointing. Campaign digitally, govern in analogue might be the new mantra.

Chaotic pluralism

Politics is a lot messier in the social media era than it used to be – whether something takes off and succeeds in gaining critical mass is far more random than it appears to be from a casual glance, where we see only those that succeed.

In Political Turbulence, we wanted to identify the model of democracy that best encapsulates politics intertwined with social media. The dynamics we observed seem to be leading us to a model of “chaotic pluralism”, characterized by diversity and heterogeneity – similar to early pluralist models – but also by non-linearity and high interconnectivity, making liberal democracies far more disorganized, unstable and unpredictable than the architects of pluralist political thought ever envisaged.

Perhaps rather than blaming social media for undermining democracy, we should be thinking about how we can improve the (inevitably major) part that it plays.

Within chaotic pluralism, there is an urgent need for redesigning democratic institutions that can accommodate new forms of political engagement, and respond to the discontent, inequalities and feelings of exclusion – even anger and alienation – that are at the root of the new populism. We should be using social media to listen to (rather than merely talk at) the expression of these public sentiments, and not just at election time.

Many political institutions – for example, the British Labour Party, the US Republican Party, and the first-past-the-post electoral system shared by both countries – are in crisis, precisely because they have become so far removed from the concerns and needs of citizens. Redesign will need to include social media platforms themselves, which have rapidly become established as institutions of democracy and will be at the heart of any democratic revival.

As these platforms finally start to admit to being media companies (rather than tech companies), we will need to demand human intervention and transparency over algorithms that determine trending news; factchecking (where Google took the lead); algorithms that detect fake news; and possibly even “public interest” bots to counteract the rise of computational propaganda.

Meanwhile, the only thing we can really predict with certainty is that unpredictable things will happen and that social media will be part of our political future.

Discussing the echoes of the 1930s in today’s politics, the Wall Street Journal points out how Roosevelt managed to steer between the extremes of left and right because he knew that “public sentiments of anger and alienation aren’t to be belittled or dismissed, for their causes can be legitimate and their consequences powerful”. The path through populism and polarization may involve using the opportunity that social media presents to listen, understand and respond to these sentiments.

This piece draws on research from Political Turbulence: How Social Media Shape Collective Action (Princeton University Press, 2016), by Helen Margetts, Peter John, Scott Hale and Taha Yasseri.

It is cross-posted from the World Economic Forum, where it was first published on 22 December 2016.

Should there be a better accounting of the algorithms that choose our news for us?

David Sutcliffe — Wed, 07 Dec 2016 14:44:31 +0000

A central ideal of democracy is that political discourse should allow a fair and critical exchange of ideas and values. But political discourse is unavoidably mediated by the mechanisms and technologies we use to communicate and receive information — and content personalization systems (think search engines, social media feeds and targeted advertising), and the algorithms they rely upon, create a new type of curated media that can undermine the fairness and quality of political discourse.

A new article by Brent Mittlestadt explores the challenges of enforcing a political right to transparency in content personalization systems. Firstly, he explains the value of transparency to political discourse and suggests how content personalization systems undermine open exchange of ideas and evidence among participants: at a minimum, personalization systems can undermine political discourse by curbing the diversity of ideas that participants encounter. Second, he explores work on the detection of discrimination in algorithmic decision making, including techniques of algorithmic auditing that service providers can employ to detect political bias. Third, he identifies several factors that inhibit auditing and thus indicate reasonable limitations on the ethical duties incurred by service providers — content personalization systems can function opaquely and be resistant to auditing because of poor accessibility and interpretability of decision-making frameworks. Finally, Brent concludes with reflections on the need for regulation of content personalization systems.

He notes that no matter how auditing is pursued, standards to detect evidence of political bias in personalized content are urgently required. Methods are needed to routinely and consistently assign political value labels to content delivered by personalization systems. This is perhaps the most pressing area for future work—to develop practical methods for algorithmic auditing.

The right to transparency in political discourse may seem unusual and farfetched. However, standards already set by the U.S. Federal Communication Commission’s fairness doctrine — no longer in force — and the British Broadcasting Corporation’s fairness principle both demonstrate the importance of the idealized version of political discourse described here. Both precedents promote balance in public political discourse by setting standards for delivery of politically relevant content. Whether it is appropriate to hold service providers that use content personalization systems to a similar standard remains a crucial question.

Read the full article: Mittelstadt, B. (2016) Auditing for Transparency in Content Personalization Systems. International Journal of Communication 10(2016), 4991–5002.

We caught up with Brent to explore the broader implications of the study:

Ed: We basically accept that the tabloids will be filled with gross bias, populism and lies (in order to sell copy) — and editorial decisions are not generally transparent to us. In terms of their impact on the democratic process, what is the difference between the editorial boardroom and a personalising social media algorithm?

Brent: There are a number of differences. First, although not necessarily transparent to the public, one hopes that editorial boardrooms are at least transparent to those within the news organisations. Editors can discuss and debate the tone and factual accuracy of their stories, explain their reasoning to one another, reflect upon the impact of their decisions on their readers, and generally have a fair debate about the merits and weaknesses of particular content.

This is not the case for a personalising social media algorithm; those working with the algorithm inside a social media company are often unable to explain why the algorithm is functioning in a particular way, or determined a particular story or topic to be ‘trending’ or displayed to particular users, while others are not. It is also far more difficult to ‘fact check’ algorithmically curated news; a news item can be widely disseminated merely by many users posting or interacting with it, without any purposeful dissemination or fact checking by the platform provider.

Another big difference is the degree to which users can be aware of the bias of the stories they are reading. Whereas a reader of The Daily Mail or The Guardian will have some idea of the values of the paper, the same cannot be said of platforms offering algorithmically curated news and information. The platform can be neutral insofar as it disseminates news items and information reflecting a range of values and political viewpoints. A user will encounter items reflecting her particular values (or, more accurately, her history of interactions with the platform and the values inferred from them), but these values, and their impact on her exposure to alternative viewpoints, may not be apparent to the user.

Ed: And how is content “personalisation” different to content filtering (e.g. as we see with the Great Firewall of China) that people get very worked up about? Should we be more worried about personalisation?

Brent: Personalisation and filtering are essentially the same mechanism; information is tailored to a user or users according to some prevailing criteria. One difference is whether content is merely infeasible to access, or technically inaccessible. Content of all types will typically still be accessible in principle when personalisation is used, but the user will have to make an effort to access content that is not recommended or otherwise given special attention. Filtering systems, in contrast, will impose technical measures to make particular content inaccessible from a particular device or geographical area.

Another difference is the source of the criteria used to set the visibility of different types of content. In the case of personalisation, these criteria are typically based on the users (inferred) interests, values, past behaviours and explicit requests. Critically, these values are not necessarily apparent to the user. For filtering, criteria are typically externally determined by a third party, often a government. Some types of information are set off limits, according to the prevailing values of the third party. It is the imposition of external values, which limit the capacity of users to access content of their choosing, which often causes an outcry against filtering and censorship.

Importantly, the two mechanisms do not necessarily differ in terms of the transparency of the limiting factors or rules to users. In some cases, such as the recently proposed ban in the UK of adult websites that do not provide meaningful age verification mechanisms, the criteria that determine whether sites are off limits will be publicly known at a general level. In other cases, and especially with personalisation, the user inside the ‘filter bubble’ will be unaware of the rules that determine whether content is (in)accessible. And it is not always the case that the platform provider intentionally keeps these rules secret. Rather, the personalisation algorithms and background analytics that determine the rules can be too complex, inaccessible or poorly understood even by the provider to give the user any meaningful insight.

Ed: Where are these algorithms developed: are they basically all proprietary? i.e. how would you gain oversight of massively valuable and commercially sensitive intellectual property?

Brent: Personalisation algorithms tend to be proprietary, and thus are not normally open to public scrutiny in any meaningful sense. In one sense this is understandable; personalisation algorithms are valuable intellectual property. At the same time the lack of transparency is a problem, as personalisation fundamentally affects how users encounter and digest information on any number of topics. As recently argued, it may be the case that personalisation of news impacts on political and democratic processes. Existing regulatory mechanisms have not been successful in opening up the ‘black box’ so to speak.

It can be argued, however, that legal requirements should be adopted to require these algorithms to be open to public scrutiny due to the fundamental way they shape our consumption of news and information. Oversight can take a number of forms. As I argue in the article, algorithmic auditing is one promising route, performed both internally by the companies themselves, and externally by a government agency or researchers. A good starting point would be for the companies developing and deploying these algorithms to extend their cooperation with researchers, thereby allowing a third party to examine the effects these systems are having on political discourse, and society more broadly.

Ed: By “algorithm audit” — do you mean examining the code and inferring what the outcome might be in terms of bias, or checking the outcome (presumably statistically) and inferring that the algorithm must be introducing bias somewhere? And is it even possible to meaningfully audit personalisation algorithms, when they might rely on vast amounts of unpredictable user feedback to train the system?

Brent: Algorithm auditing can mean both of these things, and more. Audit studies are a tool already in use, whereby human participants introduce different inputs into a system, and examine the effect on the system’s outputs. Similar methods have long been used to detect discriminatory hiring practices, for instance. Code audits are another possibility, but are generally prohibitive due to problems of access and complexity. Also, even if you can access and understand the code of an algorithm, that tells you little about how the algorithm performs in practice when given certain input data. Both the algorithm and input data would need to be audited.

Alternatively, auditing can assess just the outputs of the algorithm; recent work to design mechanisms to detect disparate impact and discrimination, particularly in the Fairness, Accountability and Transparency in Machine Learning (FAT-ML) community, is a great example of this type of auditing. Algorithms can also be designed to attempt to prevent or detect discrimination and other harms as they occur. These methods are as much about the operation of the algorithm, as they are about the nature of the training and input data, which may itself be biased. In short, auditing is very difficult, but there are promising avenues of research and development. Once we have reliable auditing methods, the next major challenge will be to tailor them to specific sectors; a one-size-meets-all approach to auditing is not on the cards.

Ed: Do you think this is a real problem for our democracy? And what is the solution if so?

Brent: It’s difficult to say, in part because access and data to study the effects of personalisation systems are hard to come by. It is one thing to prove that personalisation is occurring on a particular platform, or to show that users are systematically displayed content reflecting a narrow range of values or interests. It is quite another to prove that these effects are having an overall harmful effect on democracy. Digesting information is one of the most basic elements of social and political life, so any mechanism that fundamentally changes how information is encountered should be subject to serious and sustained scrutiny.

Assuming personalisation actually harms democracy or political discourse, mitigating its effects is quite a different issue. Transparency is often treated as the solution, but merely opening up algorithms to public and individual scrutiny will not in itself solve the problem. Information about the functionality and effects of personalisation must be meaningful to users if anything is going to be accomplished.

At a minimum, users of personalisation systems should be given more information about their blind spots, about the types of information they are not seeing, or where they lie on the map of values or criteria used by the system to tailor content to users. A promising step would be proactively giving the user some idea of what the system thinks it knows about them, or how they are being classified or profiled, without the user first needing to ask.

Brent Mittelstadt was talking to blog editor David Sutcliffe.

Is Social Media Killing Democracy?

Phil Howard — Tue, 15 Nov 2016 08:46:10 +0000

Donald Trump in Reno, Nevada, by Darron Birgenheier (Flickr).

This is the big year for computational propaganda — using immense data sets to manipulate public opinion over social media. Both the Brexit referendum and US election have revealed the limits of modern democracy, and social media platforms are currently setting those limits.

Platforms like Twitter and Facebook now provide a structure for our political lives. We’ve always relied on many kinds of sources for our political news and information. Family, friends, news organizations, charismatic politicians certainly predate the internet. But whereas those are sources of information, social media now provides the structure for political conversation. And the problem is that these technologies permit too much fake news, encourage our herding instincts, and aren’t expected to provide public goods.

First, social algorithms allow fake news stories from untrustworthy sources to spread like wildfire over networks of family and friends. Many of us just assume that there is a modicum of truth-in-advertising. We expect this from advertisements for commercial goods and services, but not from politicians and political parties. Occasionally a political actor gets punished for betraying the public trust through their misinformation campaigns. But in the United States “political speech” is completely free from reasonable public oversight, and in most other countries the media organizations and public offices for watching politicians are legally constrained, poorly financed, or themselves untrustworthy. Research demonstrates that during the campaigns for Brexit and the U.S. presidency, large volumes of fake news stories, false factoids, and absurd claims were passed over social media networks, often by Twitter’s highly automated accounts and Facebook’s algorithms.

Second, social media algorithms provide very real structure to what political scientists often call “elective affinity” or “selective exposure”. When offered the choice of who to spend time with or which organizations to trust, we prefer to strengthen our ties to the people and organizations we already know and like. When offered a choice of news stories, we prefer to read about the issues we already care about, from pundits and news outlets we’ve enjoyed in the past. Random exposure to content is gone from our diets of news and information. The problem is not that we have constructed our own community silos — humans will always do that. The problem is that social media networks take away the random exposure to new, high-quality information.

This is not a technological problem. We are social beings and so we will naturally look for ways to socialize, and we will use technology to socialize each other. But technology could be part of the solution. A not-so-radical redesign might occasionally expose us to new sources of information, or warn us when our own social networks are getting too bounded.

The third problem is that technology companies, including Facebook and Twitter, have been given a “moral pass” on the obligations we hold journalists and civil society groups to.

In most democracies, the public policy and exit polling systems have been broken for a decade. Many social scientists now find that big data, especially network data, does a better job of revealing public preferences than traditional random digit dial systems. So Facebook actually got a moral pass twice this year. Their data on public opinion would have certainly informed the Brexit debate, and their data on voter preferences would certainly have informed public conversation during the US election.

Facebook has run several experiments now, published in scholarly journals, demonstrating that they have the ability to accurately anticipate and measure social trends. Whereas journalists and social scientists feel an obligation to openly analyze and discuss public preferences, we do not expect this of Facebook. The network effects that clearly were unmeasured by pollsters were almost certainly observable to Facebook. When it comes to news and information about politics, or public preferences on important social questions, Facebook has a moral obligation to share data and prevent computational propaganda. The Brexit referendum and US election have taught us that Twitter and Facebook are now media companies. Their engineering decisions are effectively editorial decisions, and we need to expect more openness about how their algorithms work. And we should expect them to deliberate about their editorial decisions.

There are some ways to fix these problems. Opaque software algorithms shape what people find in their news feeds. We’ve all noticed fake news stories (often called clickbait), and while these can be an entertaining part of using the internet, it is bad when they are used to manipulate public opinion. These algorithms work as “bots” on social media platforms like Twitter, where they were used in both the Brexit and US presidential campaign to aggressively advance the case for leaving Europe and the case for electing Trump. Similar algorithms work behind the scenes on Facebook, where they govern what content from your social networks actually gets your attention.

So the first way to strengthen democratic practices is for academics, journalists, policy makers and the interested public to audit social media algorithms. Was Hillary Clinton really replaced by an alien in the final weeks of the 2016 campaign? We all need to be able to see who wrote this story, whether or not it is true, and how it was spread. Most important, Facebook should not allow such stories to be presented as news, much less spread. If they take ad revenue for promoting political misinformation, they should face the same regulatory punishments that a broadcaster would face for doing such a public disservice.

The second problem is a social one that can be exacerbated by information technologies. This means it can also be mitigated by technologies. Introducing random news stories and ensuring exposure to high quality information would be a simple — and healthy — algorithmic adjustment to social media platforms. The third problem could be resolved with moral leadership from within social media firms, but a little public policy oversight from elections officials and media watchdogs would help. Did Facebook see that journalists and pollsters were wrong about public preferences? Facebook should have told us if so, and shared that data.

Social media platforms have provided a structure for spreading around fake news, we users tend to trust our friends and family, and we don’t hold media technology firms accountable for degrading our public conversations. The next big thing for technology evolution is the Internet of Things, which will generate massive amounts of data that will further harden these structures. Is social media damaging democracy? Yes, but we can also use social media to save democracy.

Alan Turing Institute and OII: Summit on Data Science for Government and Policy Making

Helen Margetts — Tue, 31 May 2016 06:45:39 +0000

The benefits of big data and data science for the private sector are well recognised. So far, considerably less attention has been paid to the power and potential of the growing field of data science for policy-making and public services. On Monday 14th March 2016 the Oxford Internet Institute (OII) and the Alan Turing Institute (ATI) hosted a Summit on Data Science for Government and Policy Making, funded by the EPSRC. Leading policy makers, data scientists and academics came together to discuss how the ATI and government could work together to develop data science for the public good. The convenors of the Summit, Professors Helen Margetts (OII) and Tom Melham (Computer Science), report on the day’s proceedings.

The Alan Turing Institute will build on the UK’s existing academic strengths in the analysis and application of big data and algorithm research to place the UK at the forefront of world-wide research in data science. The University of Oxford is one of five university partners, and the OII is the only partnering department in the social sciences. The aim of the summit on Data Science for Government and Policy-Making was to understand how government can make better use of big data and the ATI – with the academic partners in listening mode.

We hoped that the participants would bring forward their own stories, hopes and fears regarding data science for the public good. Crucially, we wanted to work out a roadmap for how different stakeholders can work together on the distinct challenges facing government, as opposed to commercial organisations. At the same time, data science research and development has much to gain from the policy-making community. Some of the things that government does – collect tax from the whole population, or give money away at scale, or possess the legitimate use of force – it does by virtue of being government. So the sources of data and some of the data science challenges that public agencies face are unique and tackling them could put government working with researchers at the forefront of data science innovation.

During the Summit a range of stakeholders provided insight from their distinctive perspectives; the Government Chief Scientific Advisor, Sir Mark Walport; Deputy Director of the ATI, Patrick Wolfe; the National Statistician and Director of ONS, John Pullinger; Director of Data at the Government Digital Service, Paul Maltby. Representatives of frontline departments recounted how algorithmic decision-making is already bringing predictive capacity into operational business, improving efficiency and effectiveness.

Discussion revolved around the challenges of how to build core capability in data science across government, rather than outsourcing it (as happened in an earlier era with information technology) or confining it to a data science profession. Some delegates talked of being in the ‘foothills’ of data science. The scale, heterogeneity and complexity of some government departments currently works against data science innovation, particularly when larger departments can operate thousands of databases, creating legacy barriers to interoperability. Out-dated policies can work against data science methodologies. Attendees repeatedly voiced concerns about sharing data across government departments, in some case because of limitations of legal protections; in others because people were unsure what they can and cannot do.

The potential power of data science creates an urgent need for discussion of ethics. Delegates and speakers repeatedly affirmed the importance of an ethical framework and for thought leadership in this area, so that ethics is ‘part of the science’. The clear emergent option was a national Council for Data Ethics (along the lines of the Nuffield Council for Bioethics) convened by the ATI, as recommended in the recent Science and Technology parliamentary committee report The big data dilemma and the government response. Luciano Floridi (OII’s professor of the philosophy and ethics of information) warned that we cannot reduce ethics to mere compliance. Ethical problems do not normally have a single straightforward ‘right’ answer, but require dialogue and thought and extend far beyond individual privacy. There was consensus that the UK has the potential to provide global thought leadership and to set the standard for the rest of Europe. It was announced during the Summit that an ATI Working Group on the Ethics of Data Science has been confirmed, to take these issues forward.

So what happens now?

Throughout the Summit there were calls from policy makers for more data science leadership. We hope that the ATI will be instrumental in providing this, and an interface both between government, business and academia, and between separate Government departments. This Summit showed just how much real demand – and enthusiasm – there is from policy makers to develop data science methods and harness the power of big data. No-one wants to repeat with data science the history of government information technology – where in the 1950s and 60s, government led the way as an innovator, but has struggled to maintain this position ever since. We hope that the ATI can act to prevent the same fate for data science and provide both thought leadership and the ‘time and space’ (as one delegate put it) for policy-makers to work with the Institute to develop data science for the public good.

So since the Summit, in response to the clear need that emerged from the discussion and other conversations with stakeholders, the ATI has been designing a Policy Innovation Unit, with the aim of working with government departments on ‘data science for public good’ issues. Activities could include:

Secondments at the ATI for data scientists from government
Short term projects in government departments for ATI doctoral students and postdoctoral researchers
Developing ATI as an accredited data facility for public data, as suggested in the current Cabinet Office consultation on better use of data in government
ATI pilot policy projects, using government data
Policy symposia focused on specific issues and challenges
ATI representation in regular meetings at the senior level (for example, between Chief Scientific Advisors, the Cabinet Office, the Office for National Statistics, GO-Science).
ATI acting as an interface between public and private sectors, for example through knowledge exchange and the exploitation of non-government sources as well as government data
ATI offering a trusted space, time and a forum for formulating questions and developing solutions that tackle public policy problems and push forward the frontiers of data science
ATI as a source of cross-fertilization of expertise between departments
Reviewing the data science landscape in a department or agency, identifying feedback loops – or lack thereof – between policy-makers, analysts, front-line staff and identifying possibilities for an ‘intelligent centre’ model through strategic development of expertise.

The Summit, and a series of Whitehall Roundtables convened by GO-Science which led up to it, have initiated a nascent network of stakeholders across government, which we aim to build on and develop over the coming months. If you are interested in being part of this, please do be in touch with us

Helen Margetts, Oxford Internet Institute, University of Oxford (director@oii.ox.ac.uk)

Tom Melham, Department of Computer Science, University of Oxford

P-values are widely used in the social sciences, but often misunderstood: and that’s a problem.

taha yasseri — Mon, 07 Mar 2016 18:53:29 +0000

P-values are widely used in the social sciences, especially ‘big data’ studies, to calculate statistical significance. Yet they are widely criticized for being easily hacked, and for not telling us what we want to know. Many have argued that, as a result, research is wrong far more often than we realize. In their recent article P-values: Misunderstood and Misused OII Research Fellow Taha Yasseri and doctoral student Bertie Vidgen argue that we need to make standards for interpreting p-values more stringent, and also improve transparency in the academic reporting process, if we are to maximise the value of statistical analysis.

“Significant”: an illustration of selective reporting and
statistical significance from XKCD. Available online at
http://xkcd.com/882/

In an unprecedented move, the American Statistical Association recently released a statement (March 7 2016) warning against how p-values are currently used. This reflects a growing concern in academic circles that whilst a lot of attention is paid to the huge impact of big data and algorithmic decision-making, there is considerably less focus on the crucial role played by statistics in enabling effective analysis of big data sets, and making sense of the complex relationships contained within them. Because much as datafication has created huge social opportunities, it has also brought to the fore many problems and limitations with current statistical practices. In particular, the deluge of data has made it crucial that we can work out whether studies are ‘significant’. In our paper, published three days before the ASA’s statement, we argued that the most commonly used tool in the social sciences for calculating significance – the p-value – is misused, misunderstood and, most importantly, doesn’t tell us what we want to know.

The basic problem of ‘significance’ is simple: it is simply unpractical to repeat an experiment an infinite number of times to make sure that what we observe is “universal”. The same applies to our sample size: we are often unable to analyse a “whole population” sample and so have to generalize from our observations on a limited size sample to the whole population. The obvious problem here is that what we observe is based on a limited number of experiments (sometimes only one experiment) and from a limited size sample, and as such could have been generated by chance rather than by an underlying universal mechanism! We might find it impossible to make the same observation if we were to replicate the same experiment multiple times or analyse a larger sample. If this is the case then we will mischaracterise what is happening – which is a really big problem given the growing importance of ‘evidence-based’ public policy. If our evidence is faulty or unreliable then we will create policies, or intervene in social settings, in an equally faulty way.

The way that social scientists have got round this problem (that samples might not be representative of the population) is through the ‘p-value’. The p-value tells you the probability of making a similar observation in a sample with the same size and in the same number of experiments, by pure chance In other words, it is actually telling you is how likely it is that you would see the same relationship between X and Y even if no relationship exists between them. On the face of it this is pretty useful, and in the social sciences we normally say that a p-value of 1 in 20 means the results are significant. Yet as the American Statistical Association has just noted, even though they are incredibly widespread many researchers mis-interpret what p-values really mean.

In our paper we argued that p-values are misunderstood and misused because people think the p-value tells you much more than it really does. In particular, people think the p-value tells you (i) how likely it is that a relationship between X and Y really exists and (ii) the percentage of all findings that are false (which is actually something different called the False Discovery Rate). As a result, we are far too confident that academic studies are correct. Some commentators have argued that at least 30% of studies are wrong because of problems related to p-values: a huge figure. One of the main problems is that p-values can be ‘hacked’ and as such easily manipulated to show significance when none exists.

If we are going to base public policy (and as such public funding) on ‘evidence’ then we need to make sure that the evidence used is reliable. P-values need to be used far more rigorously, with significance levels of 0.01 or 0.001 seen as standard. We also need to start being more open and transparent about how results are recorded. It is a fine line between data exploration (a legitimate academic exercise) and ‘data dredging’ (where results are manipulated in order to find something noteworthy). Only if researchers are honest about what they are doing will we be able to maximise the potential benefits offered by Big Data. Luckily there are some great initiatives – like the Open Science Framework – which improve transparency around the research process, and we fully endorse researchers making use of these platforms.

Scientific knowledge advances through corroboration and incremental progress, and it is crucial that we use and interpret statistics appropriately to ensure this progress continues. As our knowledge and use of big data methods increase, we need to ensure that our statistical tools keep pace.

Read the full paper: Vidgen, B. and Yasseri, T., (2016) P-values: Misunderstood and Misused, Frontiers in Physics, 4:6. http://dx.doi.org/10.3389/fphy.2016.00006

Bertie Vidgen is a doctoral student at the Oxford Internet Institute researching far-right extremism in online contexts. He is supervised by Dr Taha Yasseri, a research fellow at the Oxford Internet Institute interested in how Big Data can be used to understand human dynamics, government-society interactions, mass collaboration, and opinion dynamics.