Does Twitter now set the news agenda? https://ensr.oii.ox.ac.uk/does-twitter-now-set-the-news-agenda/ Mon, 10 Jul 2017 08:30:28 +0000 The information provided in the traditional media is of fundamental importance for the policy-making process, signalling which issues are gaining traction, which are falling out of favour, and introducing entirely new problems for the public to digest. But the traditional media's monopoly as a vehicle for disseminating information about the policy agenda is being eroded by social media, with Twitter in particular used by politicians to influence traditional news content.

In their Policy & Internet article, "Politicians and the Policy Agenda: Does Use of Twitter by the U.S. Congress Direct New York Times Content?" Matthew A. Shapiro and Libby Hemphill examine the extent to which the traditional media is influenced by politicians' Twitter posts. They draw on indexing theory, which states that media coverage and framing of key policy issues will tend to track elite debate. To understand why the newspaper covers an issue, they model each day's New York Times content as a function of the previous day's coverage of every policy issue area, together with the previous day's Twitter posts by Democrats and Republicans across those same issue areas.
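
To make the indexing setup concrete, here is a minimal sketch of a lagged model of this kind for a single policy issue area. The data are simulated and all variable names are hypothetical; the authors' actual specification covers every issue area jointly.

```python
# Minimal sketch of a lagged "indexing" model for one issue area:
# today's NYT coverage as a function of yesterday's coverage and
# yesterday's congressional tweet counts by party. Simulated data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
days = 365
dem_tweets = rng.poisson(20, days)  # Democrats' daily posts on the issue
rep_tweets = rng.poisson(20, days)  # Republicans' daily posts on the issue

# Simulate coverage with day-to-day persistence plus a small tweet effect
coverage = np.zeros(days)
for t in range(1, days):
    coverage[t] = (0.5 * coverage[t - 1]
                   + 0.05 * dem_tweets[t - 1]
                   + 0.03 * rep_tweets[t - 1]
                   + rng.normal(0, 1))

df = pd.DataFrame({"coverage": coverage,
                   "dem_tweets": dem_tweets,
                   "rep_tweets": rep_tweets})
# One-day lags, mirroring the "previous day" structure described above
df["coverage_lag"] = df["coverage"].shift(1)
df["dem_lag"] = df["dem_tweets"].shift(1)
df["rep_lag"] = df["rep_tweets"].shift(1)

model = smf.ols("coverage ~ coverage_lag + dem_lag + rep_lag",
                data=df.dropna()).fit()
print(model.params)  # positive tweet coefficients would suggest "direction"
```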

They ask to what extent the agenda-setting efforts of members of Congress are acknowledged by the traditional media; whether either party gains an advantage over the other, as measured by increased attention from the traditional media; and whether there is any variance across different policy issue areas. They find that Twitter is a legitimate political communication vehicle for US officials, that journalists consider Twitter when crafting their coverage, and that Twitter-based announcements by members of Congress are a valid substitute for the traditional communiqué in journalism, particularly for issues related to immigration and marginalized groups, and for issues related to the economy and health care.

We caught up with the authors to discuss their findings:

Ed.: Can you give a quick outline of media indexing theory? Does it basically say that the press reports whatever the elite are talking about? (i.e. that press coverage can be thought of as a simple index, which tracks the many conversations that make up elite debate).

Matthew: Indexing theory, in brief, states that the content of media reports reflects the degree to which elites – politicians and leaders in government in particular – are in agreement or disagreement. The greater the level of agreement or consensus among elites, the less news there is to report in terms of elite conflict. This is not to say that a consensus among elites is not newsworthy; indexing theory conveys how media reporting is a function of the multiple voices that exist when there is elite debate.

Ed.: You say Twitter seemed a valid measure of news indexing (i.e. coverage) for at least some topics. Could it be that the NYT isn’t following Twitter so much as Twitter (and the NYT) are both following something else, i.e. floor debates, releases, etc.?

Matthew: We can't test whether the NYT is following Twitter rather than floor debates or press releases without collecting data for the latter. Perhaps if the House and Senate Press Galleries are indexing the news based on House and Senate debates, and if Twitter posts by members of Congress reflect those debates, we could still argue that Twitter remains significant: there are no limits on the amount of discussion – the boundaries of the House and Senate floors no longer exist – and the media are increasingly reliant on politicians' use of Twitter to communicate with the press. In any case, the existing research shows that journalists are increasingly relying on Twitter posts for updates from elites.

Ed.: I’m guessing that indexing theory only really works for non-partisan media that follow elite debates, like the NYT? Or does it also work for tabloids? And what about things like Breitbart (and its ilk) .. which I’m guessing appeals explicitly to a populist audience, rather than particularly caring what the elite are talking about?

Matthew: If a study similar to ours were done to examine the indexing tendencies of tabloids, Breitbart, or a similar type of media source, the first step would be to determine what is being discussed regularly in these outlets. Assuming, for example, that there isn't much discussion of marginalized groups in Breitbart, in the context of indexing theory it would not be relevant to examine the pool of congressional Twitter posts mentioning marginalized groups; those posts are effectively off Breitbart's radar. But, generally, indexing theory breaks down if partisanship and bias drive the reporting.

Ed.: Is there any sense in which Trump’s “Twitter diplomacy” has overturned or rendered moot the recent literature on political uses of Twitter? We now have a case where a single (personal) Twitter account can upset the stock market — how does one theorise that?

Matthew: In terms of indexing theory, we could argue that Trump's Twitter posts themselves generate a response from Democrats and Republicans in Congress and thus muddy the waters by conflating policy issues with other issues like his personality, ties to Russia, his fact-checking problems, etc. This is well beyond our focus in the article, but we speculate that Trump's early-dawn use of Twitter is primarily for marketing, damage control, and deflection. There are many different ways to study this phenomenon. One could, for example, examine the function of unfiltered news from politicians to the public and compare it with the news that is simultaneously reported in the media. We would also be interested in understanding why Trump and politicians like him frame their Twitter posts the way they do, what effect these posts have on their devoted followers as well as their fence-sitting followers, and how this mobilizes Congress both online (i.e. on Twitter) and when discussing and voting on policy options on the Senate and House floors. These areas of research would all build upon, rather than render moot, the extant literature on the political uses of Twitter.

Ed.: Following on: how does Indexing theory deal with Trump’s populism (i.e. avowedly anti-Washington position), hatred and contempt of the media, and apparent aim of bypassing the mainstream press wherever possible: even ditching the press pool and favouring populist outlets over the NYT in press gaggles. Or is the media bigger than the President .. will indexing theory survive Trump?

Matthew: Indexing theory will of course survive Trump. What we are witnessing in the media, however, is an inability to resist gapers' block, in the sense that the media focus on the more inflammatory and controversial aspects of Trump's Twitter posts – unfortunately on a daily basis – rather than reporting the policy implications. The media have to report what is news, and presidential Twitter posts are now newsworthy, but we would argue that we are reaching a point where everything but the meat of the policy implications must be effectively filtered out. Until we reach a point where the NYT ignores the inflammatory nature of Trump's Twitter posts, it will be challenging to test indexing theory in the context of the policy agenda-setting process.

Ed.: There are recent examples (Brexit, Trump) of the media apparently getting things wrong because they were following the elites and not “the forgotten” (or deplorable) .. who then voted in droves. Is there any sense in the media industry that it needs to rethink things a bit — i.e. that maybe the elite is not always going to be in control of events, or even be an accurate bellwether?

Matthew: This question highlights an omission from our article, namely that indexing theory marginalizes the role of non-elite voices. We agree that the media could do a better job reporting on certain things; for instance, relying less extensively on weather vanes of public opinion that do not account for inaccurate self-reporting (i.e. people not accurately representing themselves when being polled about their support for Trump, Brexit, etc.), or better understanding why disenfranchised voters might opt to stay home on Election Day. When it comes to setting the policy agenda, which is the focus of our article, we stand by indexing theory, given our assumption that the policy process itself is typically directed by those holding power. On that point, and regardless of whether it is normatively appropriate, elites are accurate bellwethers of the policy agenda.

Read the full article: Shapiro, M.A. and Hemphill, L. (2017) Politicians and the Policy Agenda: Does Use of Twitter by the U.S. Congress Direct New York Times Content? Policy & Internet 9 (1) doi:10.1002/poi3.120.


Matthew A. Shapiro and Libby Hemphill were talking to blog editor David Sutcliffe.

Social media and the battle for perceptions of the U.S.–Mexico border https://ensr.oii.ox.ac.uk/social-media-and-the-battle-for-perceptions-of-the-u-s-mexico-border/ Wed, 07 Jun 2017 07:33:34 +0000 The U.S.–Mexico border region is home to approximately 12 million people, and is the most-crossed international border in the world. Unlike the physical border itself, the image people hold of "the border" is not firmly established, and can be modified. One way is via narratives (or stories), which are a powerful tool for gaining support for public policies. Politicians' narratives about the border have historically been perpetuated by the traditional media, particularly when this allows them to publish sensational and attention-grabbing news stories.

However, new social media, including YouTube, provide opportunities for less-mainstream narratives of cooperation. In their Policy & Internet article “Do New Media Support New Policy Narratives? The Social Construction of the U.S.–Mexico Border on YouTube”, Donna L. Lybecker, Mark K. McBeth, Maria A. Husmann, and Nicholas Pelikan find that YouTube videos about the U.S.–Mexico border focus (perhaps unsurprisingly) on mainstream, divisive issues such as security and violence, immigration, and drugs. However, the videos appear to construct more favourable perspectives of the border region than traditional media, with around half constructing a sympathetic view of the border, and the people associated with it.

The common perceptions of the border generally take two distinct forms. One holds the U.S.–Mexico border to be the location of $300 billion in legal economic trade each year, a line which millions of people legally cross, the frontier of 100 years of peaceful coexistence between two countries, and the point of integration for the U.S.–Mexico relationship. An alternative perspective (particularly common since 9/11) focuses less on economic trade and legal crossing and more on undocumented immigration, violence and drug wars, and a U.S.-centric view of "us versus them".

In order to garner public support for their “solutions” to these issues, politicians often define the border using one of these perspectives. Acceptance of the first view might well allow policymakers to find cooperative solutions to joint problems. Acceptance of the second creates a policy problem that is more value-laden than empirically based and that creates distrust and polarization among stakeholders and between the countries. The U.S.–Mexico border is clearly a complex region encompassing both positives and negatives — but understanding these narratives could have a real-world impact on policy along the border; possibly creating the greater cooperation we need to solve many of the urgent problems faced by border communities.

We caught up with the authors to discuss their findings:

Ed.: Who created the videos you studied: were they created by the public, or were they also produced by perhaps more progressive media outlets? i.e. were you able to disentangle the effect of the media in terms of these narratives?

Mark / Donna: For this study we sampled YouTube videos using the "relevance" filter, so the videos were ordered by how closely they related to our topic and by how frequently they were viewed. With this selection method we captured videos produced by a variety of sources: some contained embedded material from the mainstream media, others were created by non-profit groups and public television, and still others were produced by interested citizens or private groups. The non-profit and media groups more often discussed the beneficial elements of the border (trade, shared environmental protection, etc.), while individual citizens or private groups tended to post more emotional, narrative-driven videos that were more likely to construct border residents as non-deserving.

Ed.: How influential do you think these videos are? In a world of extreme media concentration (where even the US President seems to get his news from Fox headlines and the 42 people he follows on Twitter) .. how significant is “home grown” content; which after all may have better, or at least more locally-representative, information than certain parts of the national media?

Mark / Donna: Today’s extreme media world supplies us with constant and fast-moving news. YouTube is part of the media mix, frequently mentioned as the second largest search engine on the web, and as such is influential. Media sources report that a large number of diverse people use YouTube, thus the videos encompass a broad swath of international, domestic and local issues. That said, as with most news sources today, some individuals gravitate to the stories that represent their point of view, and YouTube makes it possible for individuals to do just this. In other words, if a person perceives the US-Mexico border as a horrible place, they can use key words to search YouTube videos that represent that point of view.

However, we believe YouTube to be more influential than some other sources precisely because it encompasses diversity: even when searching using specific terms, there will likely be a few videos included in the results that provide a different point of view. Furthermore, we did find some local, "home grown" content included in search results, again adding to the diversity presented to the individual watching YouTube, although we found less of it than initially expected. Overall, there is selectivity bias with YouTube, as with any type of media, but YouTube's greater diversity of postings and viewers, and its broad distribution, may increase both exposure and influence.

Ed.: Your article was published pre-Trump. How do you think things might have changed post-election, particularly given the uncertainty over "the wall" and NAFTA — and Trump's rather strident narratives about each? Is it still a case of "negative traditional media; equivocal social media"?

Mark / Donna: Our guess is that anti-border forces are more prominent on YouTube since Trump’s election and inauguration. Unless there is an organized effort to counter discussion of “the wall” and produce positive constructions of the border, we expect that YouTube videos posted over the past few months lean more toward non-deserving constructions.

Ed.: How significant do you think social media is for news and politics generally, i.e. its influence in this information environment — compared with (say) the mainstream press and party-machines? I guess Trump’s disintermediated tweeting might have turned a few assumptions on their heads, in terms of the relation between news, social media and politics? Or is the media always going to be bigger than Trump / the President?

Mark / Donna: Social media, including YouTube and Twitter, is interactive and thus allows anyone to bypass traditional institutions. President Trump can bypass institutions of government, media institutions, even his own political party and staff and communicate directly with people via Twitter. Of course, there are advantages to that, including hearing views that differ from the “official lines,” but there are also pitfalls, such as minimized editing of comments.

We believe people see both the strengths and the weaknesses of social media, and thus often read news from both traditional media sources and social media. Traditional media is still powerful and connected to traditional institutions, and thus remains a substantial source of information for many people — although social media numbers are climbing, particularly with the President's use of Twitter. Overall, both types of media influence politics, although we do not expect future presidents will necessarily emulate President Trump's use of social media.

Ed.: Another thing we hear a lot about now is “filter bubbles” (and whether or not they’re a thing). YouTube filters viewing suggestions according to what you watch, but still presents a vast range of both good and mad content: how significant do you think YouTube (and the explosion of smartphone video) content is in today’s information / media environment? (And are filter bubbles really a thing..?)

Mark / Donna: Yeah, we think that filter bubbles are real. Again, we think that social media has a lot of potential to provide new information to people (and still does), although currently social media is falling into the same selectivity bias that characterizes the traditional media. We encourage our students to use online technology to seek out diverse sources: sources that mirror their opinions and sources that oppose them. People in the US can access diverse sources on a daily basis, but they have to be willing to seek out perspectives that differ from their own, from outlets other than their favoured news source.

The key is getting individuals to want to challenge themselves and to be open to cognitive dissonance as they read or watch material that differs from their belief systems. Technology is advanced, but humans still suffer the cognitive limitations from which they have always suffered, and the political system in the US, and likely in other places, encourages this selectivity. The key is for individuals to be willing to listen to views unlike their own.

Read the full article: Lybecker, D.L., McBeth, M.K., Husmann, M.A., and Pelikan, N. (2015) Do New Media Support New Policy Narratives? The Social Construction of the U.S.–Mexico Border on YouTube. Policy & Internet 7 (4). DOI: 10.1002/poi3.94.


Mark McBeth and Donna Lybecker were talking to blog editor David Sutcliffe.

Five Pieces You Should Probably Read On: Fake News and Filter Bubbles https://ensr.oii.ox.ac.uk/five-pieces-you-should-probably-read-on-fake-news-and-filter-bubbles/ Fri, 27 Jan 2017 10:08:39 +0000 This is the second post in a series that will uncover great writing by faculty and students at the Oxford Internet Institute, things you should probably know, and things that deserve to be brought out for another viewing. This week: Fake News and Filter Bubbles!

Fake news, post-truth, “alternative facts”, filter bubbles — this is the news and media environment we apparently now inhabit, and that has formed the fabric and backdrop of Brexit (“£350 million a week”) and Trump (“This was the largest audience to ever witness an inauguration — period”). Do social media divide us, hide us from each other? Are you particularly aware of what content is personalised for you, what it is you’re not seeing? How much can we do with machine-automated or crowd-sourced verification of facts? And are things really any worse now than when Bacon complained in 1620 about the false notions that “are now in possession of the human understanding, and have taken deep root therein”?

 

1. Bernie Hogan: How Facebook divides us [Times Literary Supplement]

27 October 2016 / 1000 words / 5 minutes

“Filter bubbles can create an increasingly fractured population, such as the one developing in America. For the many people shocked by the result of the British EU referendum, we can also partially blame filter bubbles: Facebook literally filters our friends’ views that are least palatable to us, yielding a doctored account of their personalities.”

Bernie Hogan says it’s time Facebook considered ways to use the information it has about us to bring us together across political, ideological and cultural lines, rather than hide us from each other or push us into polarized and hostile camps. He says it’s not only possible for Facebook to help mitigate the issues of filter bubbles and context collapse; it’s imperative, and it’s surprisingly simple.

 

2. Luciano Floridi: Fake news and a 400-year-old problem: we need to resolve the ‘post-truth’ crisis [the Guardian]

29 November 2016 / 1000 words / 5 minutes

“The internet age made big promises to us: a new period of hope and opportunity, connection and empathy, expression and democracy. Yet the digital medium has aged badly because we allowed it to grow chaotically and carelessly, lowering our guard against the deterioration and pollution of our infosphere. […] some of the costs of misinformation may be hard to reverse, especially when confidence and trust are undermined. The tech industry can and must do better to ensure the internet meets its potential to support individuals’ wellbeing and social good.”

The Internet echo chamber satiates our appetite for pleasant lies and reassuring falsehoods, and has become the defining challenge of the 21st century, says Luciano Floridi. So far, the strategy for technology companies has been to deal with the ethical impact of their products retrospectively, but this is not good enough, he says. We need to shape and guide the future of the digital, and stop making it up as we go along. It is time to work on an innovative blueprint for a better kind of infosphere.

 

3. Philip Howard: Facebook and Twitter’s real sin goes beyond spreading fake news

3 January 2017 / 1000 words / 5 minutes

“With the data at their disposal and the platforms they maintain, social media companies could raise standards for civility by refusing to accept ad revenue for placing fake news. They could let others audit and understand the algorithms that determine who sees what on a platform. Just as important, they could be the platforms for doing better opinion, exit and deliberative polling.”

Only Facebook and Twitter know how pervasive fabricated news stories and misinformation campaigns have become during referendums and elections, says Philip Howard — and allowing fake news and computational propaganda to target specific voters is an act against democratic values. But in a time of weakening polling systems, withholding data about public opinion is actually their major crime against democracy, he says.

 

4. Brent Mittelstadt: Should there be a better accounting of the algorithms that choose our news for us?

7 December 2016 / 1800 words / 8 minutes

“Transparency is often treated as the solution, but merely opening up algorithms to public and individual scrutiny will not in itself solve the problem. Information about the functionality and effects of personalisation must be meaningful to users if anything is going to be accomplished. At a minimum, users of personalisation systems should be given more information about their blind spots, about the types of information they are not seeing, or where they lie on the map of values or criteria used by the system to tailor content to users.”

A central ideal of democracy is that political discourse should allow a fair and critical exchange of ideas and values. But political discourse is unavoidably mediated by the mechanisms and technologies we use to communicate and receive information, says Brent Mittelstadt. And content personalization systems and the algorithms they rely upon create a new type of curated media that can undermine the fairness and quality of political discourse.

 

5. Heather Ford: Verification of crowd-sourced information: is this ‘crowd wisdom’ or machine wisdom?

19 November 2013 / 1400 words / 6 minutes

“A key question being asked in the design of future verification mechanisms is the extent to which verification work should be done by humans or non-humans (machines). Here, verification is not a binary categorisation, but rather there is a spectrum between human and non-human verification work, and indeed, projects like Ushahidi, Wikipedia and Galaxy Zoo have all developed different verification mechanisms.”

‘Human’ verification, a process of checking whether a particular report meets a group’s truth standards, is an acutely social process, says Heather Ford. If code is law and if other aspects in addition to code determine how we can act in the world, it is important that we understand the context in which code is deployed. Verification is a practice that determines how we can trust information coming from a variety of sources — only by illuminating such practices and the variety of impacts that code can have in different environments can we begin to understand how code regulates our actions in crowdsourcing environments.

 

.. and just to prove we’re capable of understanding and acknowledging and assimilating multiple viewpoints on complex things, here’s Helen Margetts, with a different slant on filter bubbles: “Even if political echo chambers were as efficient as some seem to think, there is little evidence that this is what actually shapes election results. After all, by definition echo chambers preach to the converted. It is the undecided people who (for example) the Leave and Trump campaigns needed to reach. And from the research, it looks like they managed to do just that.”

 

The Authors

Bernie Hogan is a Research Fellow at the OII; his research interests lie at the intersection of social networks and media convergence.

Luciano Floridi is the OII's Professor of Philosophy and Ethics of Information. His research areas are the philosophy of information, information and computer ethics, and the philosophy of technology.

Philip Howard is the OII’s Professor of Internet Studies. He investigates the impact of digital media on political life around the world.

Brent Mittelstadt is an OII Postdoctoral Researcher. His research interests include the ethics of information handled by medical ICT, theoretical developments in discourse and virtue ethics, and the epistemology of information.

Heather Ford completed her doctorate at the OII, where she studied how Wikipedia editors write history as it happens. She is now a University Academic Fellow in Digital Methods at the University of Leeds. Her forthcoming book “Fact Factories: Wikipedia’s Quest for the Sum of All Human Knowledge” will be published by MIT Press.

Helen Margetts is the OII’s Director, and Professor of Society and the Internet. She specialises in digital era government, politics and public policy, and data science and experimental methods. Her most recent book is Political Turbulence (Princeton).

 

Coming up! .. It’s the economy, stupid / Augmented reality and ambient fun / The platform economy / Power and development / Internet past and future / Government / Labour rights / The disconnected / Ethics / Staying critical

Should there be a better accounting of the algorithms that choose our news for us? https://ensr.oii.ox.ac.uk/should-there-be-a-better-accounting-of-the-algorithms-that-choose-our-news-for-us/ Wed, 07 Dec 2016 14:44:31 +0000 A central ideal of democracy is that political discourse should allow a fair and critical exchange of ideas and values. But political discourse is unavoidably mediated by the mechanisms and technologies we use to communicate and receive information — and content personalization systems (think search engines, social media feeds and targeted advertising), and the algorithms they rely upon, create a new type of curated media that can undermine the fairness and quality of political discourse.

A new article by Brent Mittelstadt explores the challenges of enforcing a political right to transparency in content personalization systems. First, he explains the value of transparency to political discourse and suggests how content personalization systems undermine the open exchange of ideas and evidence among participants: at a minimum, personalization systems can undermine political discourse by curbing the diversity of ideas that participants encounter. Second, he explores work on the detection of discrimination in algorithmic decision making, including techniques of algorithmic auditing that service providers can employ to detect political bias. Third, he identifies several factors that inhibit auditing and thus indicate reasonable limitations on the ethical duties incurred by service providers — content personalization systems can function opaquely and be resistant to auditing because of poor accessibility and interpretability of decision-making frameworks. Finally, Brent concludes with reflections on the need for regulation of content personalization systems.

He notes that no matter how auditing is pursued, standards to detect evidence of political bias in personalized content are urgently required. Methods are needed to routinely and consistently assign political value labels to content delivered by personalization systems. This is perhaps the most pressing area for future work—to develop practical methods for algorithmic auditing.

The right to transparency in political discourse may seem unusual and far-fetched. However, standards already set by the U.S. Federal Communications Commission's fairness doctrine — no longer in force — and the British Broadcasting Corporation's fairness principle both demonstrate the importance of the idealized version of political discourse described here. Both precedents promote balance in public political discourse by setting standards for the delivery of politically relevant content. Whether it is appropriate to hold service providers that use content personalization systems to a similar standard remains a crucial question.

Read the full article: Mittelstadt, B. (2016) Auditing for Transparency in Content Personalization Systems. International Journal of Communication 10(2016), 4991–5002.

We caught up with Brent to explore the broader implications of the study:

Ed: We basically accept that the tabloids will be filled with gross bias, populism and lies (in order to sell copy) — and editorial decisions are not generally transparent to us. In terms of their impact on the democratic process, what is the difference between the editorial boardroom and a personalising social media algorithm?

Brent: There are a number of differences. First, although not necessarily transparent to the public, one hopes that editorial boardrooms are at least transparent to those within the news organisations. Editors can discuss and debate the tone and factual accuracy of their stories, explain their reasoning to one another, reflect upon the impact of their decisions on their readers, and generally have a fair debate about the merits and weaknesses of particular content.

This is not the case for a personalising social media algorithm; those working with the algorithm inside a social media company are often unable to explain why the algorithm is functioning in a particular way, or determined a particular story or topic to be ‘trending’ or displayed to particular users, while others are not. It is also far more difficult to ‘fact check’ algorithmically curated news; a news item can be widely disseminated merely by many users posting or interacting with it, without any purposeful dissemination or fact checking by the platform provider.

Another big difference is the degree to which users can be aware of the bias of the stories they are reading. Whereas a reader of The Daily Mail or The Guardian will have some idea of the values of the paper, the same cannot be said of platforms offering algorithmically curated news and information. The platform can be neutral insofar as it disseminates news items and information reflecting a range of values and political viewpoints. A user will encounter items reflecting her particular values (or, more accurately, her history of interactions with the platform and the values inferred from them), but these values, and their impact on her exposure to alternative viewpoints, may not be apparent to the user.

Ed: And how is content “personalisation” different to content filtering (e.g. as we see with the Great Firewall of China) that people get very worked up about? Should we be more worried about personalisation?

Brent: Personalisation and filtering are essentially the same mechanism; information is tailored to a user or users according to some prevailing criteria. One difference is whether content is merely infeasible to access, or technically inaccessible. Content of all types will typically still be accessible in principle when personalisation is used, but the user will have to make an effort to access content that is not recommended or otherwise given special attention. Filtering systems, in contrast, will impose technical measures to make particular content inaccessible from a particular device or geographical area.

Another difference is the source of the criteria used to set the visibility of different types of content. In the case of personalisation, these criteria are typically based on the user's (inferred) interests, values, past behaviours and explicit requests. Critically, these values are not necessarily apparent to the user. For filtering, criteria are typically externally determined by a third party, often a government. Some types of information are set off limits, according to the prevailing values of the third party. It is the imposition of external values, which limit the capacity of users to access content of their choosing, that often causes an outcry against filtering and censorship.

Importantly, the two mechanisms do not necessarily differ in terms of the transparency of the limiting factors or rules to users. In some cases, such as the recently proposed ban in the UK of adult websites that do not provide meaningful age verification mechanisms, the criteria that determine whether sites are off limits will be publicly known at a general level. In other cases, and especially with personalisation, the user inside the ‘filter bubble’ will be unaware of the rules that determine whether content is (in)accessible. And it is not always the case that the platform provider intentionally keeps these rules secret. Rather, the personalisation algorithms and background analytics that determine the rules can be too complex, inaccessible or poorly understood even by the provider to give the user any meaningful insight.
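
The distinction can be made concrete with a toy sketch (hypothetical items, topics and scores, not any platform's actual mechanism): personalisation re-orders what is visible, while filtering makes some content inaccessible outright.

```python
# Toy contrast between personalisation (re-ranking by inferred interest;
# everything remains reachable) and filtering (content made technically
# inaccessible). All items and scores are made up.
from dataclasses import dataclass

@dataclass
class Item:
    title: str
    topic: str

CATALOGUE = [
    Item("Parliament debates new bill", "politics"),
    Item("Border trade hits new high", "economy"),
    Item("Celebrity wedding photos", "entertainment"),
]

def personalise(items, interest_profile):
    # Rank by the user's (inferred) interests; low-interest items sink
    # down the feed but are never removed.
    return sorted(items,
                  key=lambda i: interest_profile.get(i.topic, 0.0),
                  reverse=True)

def filter_items(items, blocked_topics):
    # Externally imposed criteria make some content inaccessible.
    return [i for i in items if i.topic not in blocked_topics]

inferred = {"entertainment": 0.9, "economy": 0.4}  # opaque to the user
print([i.title for i in personalise(CATALOGUE, inferred)])
print([i.title for i in filter_items(CATALOGUE, {"politics"})])
```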

Ed: Where are these algorithms developed: are they basically all proprietary? i.e. how would you gain oversight of massively valuable and commercially sensitive intellectual property?

Brent: Personalisation algorithms tend to be proprietary, and thus are not normally open to public scrutiny in any meaningful sense. In one sense this is understandable; personalisation algorithms are valuable intellectual property. At the same time the lack of transparency is a problem, as personalisation fundamentally affects how users encounter and digest information on any number of topics. As recently argued, it may be the case that personalisation of news impacts on political and democratic processes. Existing regulatory mechanisms have not been successful in opening up the ‘black box’ so to speak.

It can be argued, however, that legal requirements should be adopted to require these algorithms to be open to public scrutiny due to the fundamental way they shape our consumption of news and information. Oversight can take a number of forms. As I argue in the article, algorithmic auditing is one promising route, performed both internally by the companies themselves, and externally by a government agency or researchers. A good starting point would be for the companies developing and deploying these algorithms to extend their cooperation with researchers, thereby allowing a third party to examine the effects these systems are having on political discourse, and society more broadly.

Ed: By “algorithm audit” — do you mean examining the code and inferring what the outcome might be in terms of bias, or checking the outcome (presumably statistically) and inferring that the algorithm must be introducing bias somewhere? And is it even possible to meaningfully audit personalisation algorithms, when they might rely on vast amounts of unpredictable user feedback to train the system?

Brent: Algorithm auditing can mean both of these things, and more. Audit studies are a tool already in use, whereby human participants introduce different inputs into a system, and examine the effect on the system’s outputs. Similar methods have long been used to detect discriminatory hiring practices, for instance. Code audits are another possibility, but are generally prohibitive due to problems of access and complexity. Also, even if you can access and understand the code of an algorithm, that tells you little about how the algorithm performs in practice when given certain input data. Both the algorithm and input data would need to be audited.

Alternatively, auditing can assess just the outputs of the algorithm; recent work to design mechanisms to detect disparate impact and discrimination, particularly in the Fairness, Accountability and Transparency in Machine Learning (FAT-ML) community, is a great example of this type of auditing. Algorithms can also be designed to attempt to prevent or detect discrimination and other harms as they occur. These methods are as much about the operation of the algorithm as they are about the nature of the training and input data, which may itself be biased. In short, auditing is very difficult, but there are promising avenues of research and development. Once we have reliable auditing methods, the next major challenge will be to tailor them to specific sectors; a one-size-fits-all approach to auditing is not on the cards.
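
As a rough sketch of what an output-only audit might look like, the following probes a black-box recommender with test profiles from two groups and compares exposure rates to political content, borrowing the "four-fifths" threshold from the disparate-impact literature. The recommender, groups and threshold here are all hypothetical stand-ins.

```python
# Output-only audit sketch: query a (simulated) black-box recommender
# with profiles from two groups and compare how often each group is
# shown political content.
import random

random.seed(0)

def recommend(profile):
    # Stand-in for the opaque system under audit; a real audit would
    # query the live platform instead.
    p = 0.6 if profile["group"] == "A" else 0.4
    return "political" if random.random() < p else "other"

def exposure_rate(group, trials=10_000):
    hits = sum(recommend({"group": group}) == "political"
               for _ in range(trials))
    return hits / trials

rate_a, rate_b = exposure_rate("A"), exposure_rate("B")
ratio = min(rate_a, rate_b) / max(rate_a, rate_b)
print(f"exposure A={rate_a:.3f}, B={rate_b:.3f}, ratio={ratio:.2f}")
if ratio < 0.8:  # four-fifths rule, borrowed from employment law
    print("flag: groups see political content at markedly different rates")
```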

Ed: Do you think this is a real problem for our democracy? And what is the solution if so?

Brent: It’s difficult to say, in part because access and data to study the effects of personalisation systems are hard to come by. It is one thing to prove that personalisation is occurring on a particular platform, or to show that users are systematically displayed content reflecting a narrow range of values or interests. It is quite another to prove that these effects are having an overall harmful effect on democracy. Digesting information is one of the most basic elements of social and political life, so any mechanism that fundamentally changes how information is encountered should be subject to serious and sustained scrutiny.

Assuming personalisation actually harms democracy or political discourse, mitigating its effects is quite a different issue. Transparency is often treated as the solution, but merely opening up algorithms to public and individual scrutiny will not in itself solve the problem. Information about the functionality and effects of personalisation must be meaningful to users if anything is going to be accomplished.

At a minimum, users of personalisation systems should be given more information about their blind spots, about the types of information they are not seeing, or where they lie on the map of values or criteria used by the system to tailor content to users. A promising step would be proactively giving the user some idea of what the system thinks it knows about them, or how they are being classified or profiled, without the user first needing to ask.


Brent Mittelstadt was talking to blog editor David Sutcliffe.

Is Social Media Killing Democracy? https://ensr.oii.ox.ac.uk/is-social-media-killing-democracy/ Tue, 15 Nov 2016 08:46:10 +0000
Donald Trump in Reno, Nevada, by Darron Birgenheier (Flickr).

This is the big year for computational propaganda — using immense data sets to manipulate public opinion over social media. Both the Brexit referendum and US election have revealed the limits of modern democracy, and social media platforms are currently setting those limits.

Platforms like Twitter and Facebook now provide a structure for our political lives. We’ve always relied on many kinds of sources for our political news and information. Family, friends, news organizations, charismatic politicians certainly predate the internet. But whereas those are sources of information, social media now provides the structure for political conversation. And the problem is that these technologies permit too much fake news, encourage our herding instincts, and aren’t expected to provide public goods.

First, social algorithms allow fake news stories from untrustworthy sources to spread like wildfire over networks of family and friends. Many of us just assume that there is a modicum of truth-in-advertising. We expect this from advertisements for commercial goods and services, but not from politicians and political parties. Occasionally a political actor gets punished for betraying the public trust through their misinformation campaigns. But in the United States “political speech” is completely free from reasonable public oversight, and in most other countries the media organizations and public offices for watching politicians are legally constrained, poorly financed, or themselves untrustworthy. Research demonstrates that during the campaigns for Brexit and the U.S. presidency, large volumes of fake news stories, false factoids, and absurd claims were passed over social media networks, often by Twitter’s highly automated accounts and Facebook’s algorithms.

Second, social media algorithms provide very real structure to what political scientists often call “elective affinity” or “selective exposure”. When offered the choice of who to spend time with or which organizations to trust, we prefer to strengthen our ties to the people and organizations we already know and like. When offered a choice of news stories, we prefer to read about the issues we already care about, from pundits and news outlets we’ve enjoyed in the past. Random exposure to content is gone from our diets of news and information. The problem is not that we have constructed our own community silos — humans will always do that. The problem is that social media networks take away the random exposure to new, high-quality information.

This is not a technological problem. We are social beings and so we will naturally look for ways to socialize, and we will use technology to socialize each other. But technology could be part of the solution. A not-so-radical redesign might occasionally expose us to new sources of information, or warn us when our own social networks are getting too bounded.

The third problem is that technology companies, including Facebook and Twitter, have been given a "moral pass" on the obligations to which we hold journalists and civil society groups.

In most democracies, the public polling and exit polling systems have been broken for a decade. Many social scientists now find that big data, especially network data, does a better job of revealing public preferences than traditional random digit dial systems. So Facebook actually got a moral pass twice this year. Their data on public opinion would have certainly informed the Brexit debate, and their data on voter preferences would certainly have informed public conversation during the US election.

Facebook has run several experiments now, published in scholarly journals, demonstrating that they have the ability to accurately anticipate and measure social trends. Whereas journalists and social scientists feel an obligation to openly analyze and discuss public preferences, we do not expect this of Facebook. The network effects that clearly were unmeasured by pollsters were almost certainly observable to Facebook. When it comes to news and information about politics, or public preferences on important social questions, Facebook has a moral obligation to share data and prevent computational propaganda. The Brexit referendum and US election have taught us that Twitter and Facebook are now media companies. Their engineering decisions are effectively editorial decisions, and we need to expect more openness about how their algorithms work. And we should expect them to deliberate about their editorial decisions.

There are some ways to fix these problems. Opaque software algorithms shape what people find in their news feeds. We’ve all noticed fake news stories (often called clickbait), and while these can be an entertaining part of using the internet, it is bad when they are used to manipulate public opinion. These algorithms work as “bots” on social media platforms like Twitter, where they were used in both the Brexit and US presidential campaign to aggressively advance the case for leaving Europe and the case for electing Trump. Similar algorithms work behind the scenes on Facebook, where they govern what content from your social networks actually gets your attention.

So the first way to strengthen democratic practices is for academics, journalists, policy makers and the interested public to audit social media algorithms. Was Hillary Clinton really replaced by an alien in the final weeks of the 2016 campaign? We all need to be able to see who wrote this story, whether or not it is true, and how it was spread. Most important, Facebook should not allow such stories to be presented as news, much less spread. If they take ad revenue for promoting political misinformation, they should face the same regulatory punishments that a broadcaster would face for doing such a public disservice.

The second problem is a social one that can be exacerbated by information technologies. This means it can also be mitigated by technologies. Introducing random news stories and ensuring exposure to high quality information would be a simple — and healthy — algorithmic adjustment to social media platforms. The third problem could be resolved with moral leadership from within social media firms, but a little public policy oversight from elections officials and media watchdogs would help. Did Facebook see that journalists and pollsters were wrong about public preferences? Facebook should have told us if so, and shared that data.
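
A minimal sketch of what the "random exposure" adjustment mentioned above could look like (a hypothetical feed builder, not any platform's actual ranking code): most slots are filled by inferred interest, but a fraction is reserved for randomly drawn stories from outside the user's usual diet.

```python
# Hypothetical feed builder: rank by inferred interest, but reserve an
# epsilon fraction of slots for randomly chosen stories to reintroduce
# the serendipitous exposure that pure personalisation removes.
import random

def build_feed(candidates, interest_score, slots=10, epsilon=0.2):
    ranked = sorted(candidates, key=interest_score, reverse=True)
    n_random = int(slots * epsilon)
    top = ranked[:slots - n_random]
    remainder = ranked[slots - n_random:]
    serendipity = random.sample(remainder, min(n_random, len(remainder)))
    return top + serendipity

stories = [f"story_{i}" for i in range(50)]
# interest_score stands in for whatever engagement model a platform uses
feed = build_feed(stories, interest_score=lambda s: hash(s) % 100)
print(feed)
```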

Social media platforms have provided a structure for spreading around fake news, we users tend to trust our friends and family, and we don’t hold media technology firms accountable for degrading our public conversations. The next big thing for technology evolution is the Internet of Things, which will generate massive amounts of data that will further harden these structures. Is social media damaging democracy? Yes, but we can also use social media to save democracy.

The life and death of political news: using online data to measure the impact of the audience agenda https://ensr.oii.ox.ac.uk/the-life-and-death-of-political-news-using-online-data-to-measure-the-impact-of-the-audience-agenda/ Tue, 09 Sep 2014 07:04:47 +0000
Image of the Telegraph's state of the art "hub and spoke" newsroom layout by David Sim.
The political agenda has always been shaped by what the news media decide to publish — through their ability to broadcast to large, loyal audiences in a sustained manner, news editors have the ability to shape 'political reality' by deciding what is important to report. Traditionally, journalists pass stories to their editors from a pool of potential items; editors then choose which stories to publish. However, with the increasing importance of online news, editors must now decide not only what to publish and where, but how long it should remain prominent and visible to the audience on the front page of the news website.

The question of how much influence the audience has in these decisions has always been ambiguous. While in theory we might expect journalists to be attentive to readers, journalism has also been characterized as a profession with a “deliberate…ignorance of audience wants” (Anderson, 2011b). This ‘anti-populism’ is still often portrayed as an important journalistic virtue, in the context of telling people what they need to hear, rather than what they want to hear. Recently, however, attention has been turning to the potential impact that online audience metrics are having on journalism’s “deliberate ignorance”. Online publishing provides a huge amount of information to editors about visitor numbers, visit frequency, and what visitors choose to read and how long they spend reading it. Online editors now have detailed information about what articles are popular almost as soon as they are published, with these statistics frequently displayed prominently in the newsroom.

The rise of audience metrics has created concern both within the journalistic profession and academia, as part of a broader set of concerns about the way journalism is changing online. Many have expressed concern about a ‘culture of click’, whereby important but unexciting stories make way for more attention grabbing pieces, and editorial judgments are overridden by traffic statistics. At a time when media business models are under great strain, the incentives to follow the audience are obvious, particularly when business models increasingly rely on revenue from online traffic and advertising. The consequences for the broader agenda-setting function of the news media could be significant: more prolific or earlier readers might play a disproportionate role in helping to select content; particular social classes or groupings that read news online less frequently might find their issues being subtly shifted down the agenda.

The extent to which such a populist influence exists has attracted little empirical research. Many ethnographic studies have shown that audience metrics are being captured in online newsrooms, with anecdotal evidence for the importance of traffic statistics to an article's lifetime (Anderson, 2011b; MacGregor, 2007). However, many editors have emphasised that popularity is not a major determining factor (MacGregor, 2007), and that news values remain significant in terms of the placement of news articles.

In order to assess the possible influence of audience metrics on decisions made by political news editors, we undertook a systematic, large-scale study of the relationship between readership statistics and article lifetime. We examined the news cycles of five major UK news outlets (the BBC, the Daily Telegraph, the Guardian, the Daily Mail and the Mirror) over a period of six weeks, capturing their front pages every 15 minutes, resulting in over 20,000 front-page captures and more than 40,000 individual articles. We measured article readership by capturing information from each outlet's "most read" list of news articles (twelve percent of the articles were featured at some point on the 'most read' list, with a median time to achieving this status of two hours, and an average article life of 15 hours on the front page). Using the Cox Proportional Hazards model (which allows us to quantify the impact of an article's appearance on the 'most read' list on its chance of survival), we asked whether an article's being listed in a 'most read' column affected the length of time it remained on the front page.
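
As a rough illustration of this kind of survival analysis (simulated data, not the study's, with the lifelines library as one assumed choice of tooling):

```python
# Toy Cox proportional hazards model of front-page article lifetime,
# with "most read" status as a covariate. Data are simulated.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(7)
n = 1000
most_read = rng.integers(0, 2, n)  # 1 = appeared on the "most read" list
political = rng.integers(0, 2, n)  # 1 = political news, 0 = soft news
# Simulate longer front-page lifetimes for most-read articles
hours = rng.exponential(scale=15 * np.where(most_read == 1, 1.35, 1.0))

articles = pd.DataFrame({
    "duration": hours,   # hours on the front page
    "removed": 1,        # all removals observed here (no censoring)
    "most_read": most_read,
    "political": political,
})

cph = CoxPHFitter()
cph.fit(articles, duration_col="duration", event_col="removed")
cph.print_summary()
# An exp(coef) below 1 for `most_read` means most-read articles face a
# lower "hazard" of removal, i.e. they survive longer on the front page;
# the study reports roughly a 26% lower chance of removal.
```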

We found that ‘most read’ articles had, on average, a 26% lower chance of being removed from the front page than equivalent articles which were not on the most read list, providing support for the idea that online editors are influenced by readership statistics. In addition to assessing the general impact of readership statistics, we also wanted to see whether this effect differs between ‘political’ and ‘entertainment’ news. Research on participatory journalism has suggested that online editors might be more willing to allow audience participation in areas of soft news such as entertainment, arts, sports, etc. We find a small amount of evidence for this claim, though the difference between the two categories was very slight.

Finally, we wanted to assess whether there is a 'quality' / 'tabloid' split. Part of the definition of tabloid-style journalism lies precisely in its willingness to follow the demands of its audience. However, we found the audience 'effect' (surprisingly) to be most obvious in the quality papers. For tabloids, 'most read' status actually had a slightly negative effect on article lifetime. We wouldn't argue that tabloid editors actively reject the wishes of their audience; however, we can say that these editors are no more likely to follow their audience than the typical 'quality' editor, and in fact may be less so. We do not have a clear explanation for this difference, though we could speculate that, as tabloid publications are already more tuned in to the wishes of their audience, the appearance of readership statistics makes less practical difference to the overall product. However, it may also simply be the case that the online environment is slowly producing new journalistic practices for which the tabloid / quality distinction will be less useful.

So on the basis of our study, we can say that high-traffic articles do in fact spend longer in the spotlight than ones that attract less readership: audience readership does have a measurable impact on the lifespan of political news. The audience is no longer the unknown quantity it was in offline journalism: it appears to have a clear impact on journalistic practice. The question that remains, however, is whether this constitutes evidence of a new ‘populism’ in journalism; or whether it represents (as editors themselves have argued) the simple striking of a balance between audience demands and news values.

Read the full article: Bright, J., and Nicholls, T. (2014) The Life and Death of Political News: Measuring the Impact of the Audience Agenda Using Online Data. Social Science Computer Review 32 (2) 170-181.

References

Anderson, C. W. (2011b) Between creative and quantified audiences: Web metrics and changing patterns of newswork in local US newsrooms. Journalism 12 (5) 550-566.

MacGregor, P. (2007) Tracking the Online Audience. Journalism Studies 8 (2) 280-298.


OII Research Fellow Jonathan Bright is a political scientist specialising in computational and 'big data' approaches to the social sciences. His major interest concerns studying how people get information about the political process, and how this is changing in the internet era.

Tom Nicholls is a doctoral student at the Oxford Internet Institute. His research interests include the impact of technology on citizen/government relationships, the Internet’s implications for public management and models of electronic public service delivery.

Mapping collective public opinion in the Russian blogosphere https://ensr.oii.ox.ac.uk/mapping-collective-public-opinion-in-the-russian-blogosphere/ Mon, 10 Feb 2014 11:30:05 +0000
Widely reported as fraudulent, the 2011 Russian Parliamentary elections provoked mass street protest action by tens of thousands of people in Moscow and cities and towns across Russia. Image by Nikolai Vassiliev.

Blogs are becoming increasingly important for agenda setting and the formation of collective public opinion on a wide range of issues. In countries like Russia, where the Internet is not technically filtered but the traditional media is tightly controlled by the state, they may be particularly important. The Russian-language blogosphere comprises about 85 million blogs – far beyond the capacity of any government to control – and the Russian search engine Yandex, with its blog rating service, serves as an important reference point for Russia's educated public in its search for authoritative and independent sources of information. The blogosphere is thereby able to function as a mass medium of "public opinion" and also to exercise influence.

One topic that was particularly salient over the period we studied concerned the Russian Parliamentary elections of December 2011. Widely reported as fraudulent, they provoked immediate and mass street protest action by tens of thousands of people in Moscow and cities and towns across Russia, as well as corresponding activity in the blogosphere. Protesters made effective use of the Internet to organize a movement that demanded cancellation of the parliamentary election results, and the holding of new and fair elections. These protests continued until the following summer, gaining widespread national and international attention.

Most of the political and social discussion blogged in Russia is hosted on the blog platform LiveJournal. Some of these bloggers can claim a certain amount of influence; the top thirty bloggers have over 20,000 “friends” each, a readership that would be a respectable circulation for the average Russian newspaper. Part of the blogosphere may thereby resemble the traditional media; the deeper into the long tail of average bloggers, however, the more it functions as pure public opinion. This “top list” effect may be particularly important in societies (like Russia’s) where popularity lists exert a visible influence on bloggers’ competitive behavior and on public perceptions of their significance. Given the influence of these top bloggers, it may be claimed that, like the traditional media, they act as filters of issues to be thought about, and as definers of their relative importance and salience.

Gauging public opinion is of obvious interest to governments and politicians, and opinion polls are widely used to do this, but they have been consistently criticized for imposing the pollsters’ agendas on respondents, thereby producing artefacts. Indeed, the public opinion literature has tended to regard opinion as something to be “extracted” by pollsters, which inevitably pre-structures the output. This literature doesn’t consider that public opinion might also exist in the form of natural language texts, such as blog posts, that have not been pre-structured by external observers.

There are two basic ways to detect topics in natural language texts: the first is manual coding of texts (ie traditional content analysis); the second involves rapidly developing techniques of automatic topic modeling or text clustering. The media studies literature has relied heavily on traditional content analysis; however, these studies are inevitably limited by the volume of data a person can physically process, given there may be hundreds of issues and opinions to track — LiveJournal’s 2.8 million blog accounts, for example, generate 90,000 posts daily.

For large text collections, therefore, only the second approach is feasible. In our article we explored how methods for topic modeling developed in computer science may be applied to social science questions – such as how to efficiently track public opinion on particular (and evolving) issues across entire populations. Specifically, we demonstrate how automated topic modeling can identify public agendas, their composition, structure, the relative salience of different topics, and their evolution over time without prior knowledge of the issues being discussed and written about. This automated “discovery” of issues in texts involves division of texts into topically — or more precisely, lexically — similar groups that can later be interpreted and labeled by researchers. Although this approach has limitations in tackling subtle meanings and links, experiments where automated results have been checked against human coding show over 90 percent accuracy.
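To make the approach concrete, here is a minimal sketch of this kind of pipeline in Python, using scikit-learn’s LDA implementation; the toy corpus, vocabulary settings, and number of topics are illustrative assumptions, not the model or parameters used in the study:

```python
# Minimal topic-modeling sketch: group texts into lexically similar
# clusters with Latent Dirichlet Allocation, then print the top words
# of each topic so a researcher can interpret and label it.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

posts = [
    "protesters gather in moscow after the parliamentary elections",
    "bloggers accuse officials of election fraud and demand a rerun",
    "new film reviews and theatre premieres this weekend in the city",
    "the theatre festival draws large crowds despite the cold weather",
]

# Bag-of-words representation; a real corpus would also need stop-word
# removal and (for Russian) lemmatisation.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(posts)

# Fit a two-topic model; the number of topics is a tuning choice.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X)

# Show the five highest-weighted words per topic for human labelling.
terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[::-1][:5]]
    print(f"topic {k}: {', '.join(top)}")
```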

The computer science literature is flooded with methodological papers on automatic analysis of big textual data. While these methods can’t entirely replace manual work with texts, they can help reduce it to the most meaningful and representative areas of the textual space they help to map, and they are the only means to monitor agendas and attitudes across multiple sources, over long periods and at scale. They can also help solve problems of insufficient and biased sampling, when entire populations become available for analysis. Because these approaches are recent, and mathematically and computationally complex, they are rarely applied by social scientists; to our knowledge, topic modeling has not previously been applied to the extraction of agendas from blogs in any social science research.

The natural extension of automated topic or issue extraction involves sentiment mining and analysis; as González-Bailón, Kaltenbrunner, and Banchs (2012) have pointed out, public opinion doesn’t just involve specific issues, but also encompasses the state of public emotion about these issues, including attitudes and preferences. This involves extracting opinions on the issues/agendas that are thought to be present in the texts, usually by classifying sentences as positive or negative. These techniques are based on human-coded dictionaries of emotive words, on algorithmic construction of sentiment dictionaries, or on machine learning techniques.
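As a toy illustration of the dictionary-based variant, the sketch below scores sentences against hand-built word lists; the lexicons are invented for the example, and production systems use much larger curated or machine-learned dictionaries:

```python
# Toy dictionary-based sentiment scoring: count hits against
# hand-coded positive and negative lexicons and compare the totals.
POSITIVE = {"fair", "hope", "support", "honest", "win"}
NEGATIVE = {"fraud", "protest", "corrupt", "angry", "fail"}

def sentence_polarity(sentence: str) -> str:
    """Classify a sentence as positive, negative, or neutral."""
    tokens = sentence.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentence_polarity("observers call the vote fraud and protest grows"))
# -> negative
```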

Both topic modeling and sentiment analysis techniques are required to effectively monitor self-generated public opinion. When methods for tracking attitudes complement methods to build topic structures, a rich and powerful map of self-generated public opinion can be drawn. Of course this mapping can’t completely replace opinion polls; rather, it’s a new way of learning what people are thinking and talking about; a method that makes the vast amounts of user-generated content about society – such as the 65 million blogs that make up the Russian blogosphere — available for social and policy analysis.

Naturally, this approach to public opinion and attitudes is not free of limitations. First, the dataset is only representative of the self-selected population of those who have authored the texts, not of the whole population. Second, like regular polled public opinion, online public opinion only covers those attitudes that bloggers are willing to share in public. Furthermore, there is still a long way to go before the relevant instruments become mature, and this will demand the efforts of the whole research community: computer scientists and social scientists alike.

Read the full paper: Olessia Koltsova and Sergei Koltcov (2013) Mapping the public agenda with topic modeling: The case of the Russian LiveJournal. Policy and Internet 5 (2) 207–227.

Also read on this blog: Can text mining help handle the data deluge in public policy analysis? by Aude Bicquelet.

References

González-Bailón, S., Kaltenbrunner, A., and Banchs, R.E. (2012) Emotions, Public Opinion and U.S. Presidential Approval Rates: A 5 Year Analysis of Online Political Discussions. Human Communication Research 38 (2) 121–43.

Is China shaping the Internet in Africa? https://ensr.oii.ox.ac.uk/is-china-shaping-the-internet-in-africa/ Thu, 15 Aug 2013 14:02:29 +0000 http://blogs.oii.ox.ac.uk/policy/?p=1984
The telecommunication sector in Africa is increasingly crowded. Image of the Panel on the Future of China-Africa Relations, World Economic Forum on Africa 2011 (Cape Town) by World Economic Forum.

Ed: Concerns have been expressed (eg by Hillary Clinton and David Cameron) about the detrimental role China may play in African media sectors, by increasing authoritarianism and undermining Western efforts to promote openness and freedom of expression. Are these concerns fair?

Iginio: China’s initiatives in the communication sector abroad are burdened by the negative record of its domestic media. For the Chinese authorities this is a challenge that does not have an easy solution as they can’t really use their international broadcasters to tell a different story about Chinese media and Chinese engagement with foreign media, because they won’t be trusted. As the linguist George Lakoff has explained, if someone is told “Don’t think of an elephant!” he will likely start “summoning the bulkiness, the grayness, the trunkiness of an elephant”. That is to say, “when we negate a frame, we evoke a frame”. Saying that “Chinese interventions are not increasing authoritarianism” won’t help much. The only path China can undertake is to develop projects and use its media in ways that fall outside the realm of what is expected, creating new associations between China and the media, rather than trying to redress existing ones. In part this is already happening. For example, CCTV Africa, the new initiative of state-owned China’s Central Television (CCTV) and China’s flagship effort to win African hearts and minds, has developed a strategy aimed not at directly offering an alternative image of China, but at advancing new ways of looking at Africa, offering unprecedented resources to African journalists to report from the continent and tapping into the narrative of a “rising Africa”, as a continent of opportunities rather than of hunger, wars and underdevelopment.

Ed: Ideology has disappeared from the language of China-Africa cooperation, largely replaced by admissions of China’s interest in Africa’s resources and untapped potential. Does politics (eg China wanting to increase its international support and influence) nevertheless still inform the relationship?

Iginio: China’s efforts in Africa during decolonisation were closely linked to its efforts to export and strengthen the socialist revolution on the continent. Today the language of ideology has largely disappeared from public statements, leaving less charged references to the promotion of “mutual benefit” and “sovereignty and independence” as guides of the new engagement. At the same time, this does not mean that the Chinese government has lost interest in engaging at the political/ideological level when conditions allow. Shared political views are no longer a precondition for engagement, but neither are they an aspiration, as China is not necessarily trying to influence local politics in ways that could promote socialism. But when there is already a resonance with the ideas embraced by its partners, the Chinese authorities have not shied away from taking the engagement to a political/ideological level. This is demonstrated, for example, by party-to-party ties between the Communist Party of China (CPC) and socialist parties in Africa, including the Ethiopian People’s Revolutionary Democratic Front (EPRDF); representatives of the CPC have been invited to attend the EPRDF’s party conferences.

Ed: How much influence does China have on the domestic media / IT policies of the nations it invests in? Is it pushing the diffusion of its own strategies of media development and media control abroad? (And what are these strategies if so?)

Iginio: The Chinese government has signalled its lack of interest in exporting its own development model, and its intention to simply respond to the demands of its African partners. Ongoing research has largely confirmed that this ‘no strings attached’ approach is consistent, but this does not mean that China’s presence on the continent is neutral or has no impact on development policies and practices. China is indirectly influencing media/IT policies and practices in at least three ways.

First, while Western donors have tended to favour media projects benefiting the private sector and civil society, often seeking to create incentives for the state to open a dialogue with other forces in society, China has exhibited a tendency to privilege government actors, thus increasing governments’ capacity vis-à-vis other critical players in the development of media and telecommunication systems.

Second, with the launch of media projects such as CCTV Africa, China has dramatically boosted its potential to shape narratives, exert soft power, and allow different voices to shape the political and development agenda. While international broadcasters such as the BBC World Service and Aljazeera have often tended to rely on civil society organisations as gatekeepers of information, CCTV has so far shown less interest in these actors, privileging the formal over the informal, partly as part of its effort to provide more positive news from the continent.

Third, China’s domestic example of balancing investment in media and telecommunications against efforts to contain the risks of political instability that new technologies may bring has the potential to act as a legitimising force for other states that share the concern of balancing development and security, and that are actively seeking justifications for limiting voices and uses of technology they consider potentially destabilising.

Ed: Is China developing tailored media models for abroad, or even using Africa as a “development lab”? How does China’s interest in Africa’s mediascape compare with its interest in other regions worldwide?

Iginio: There are concerns that, just as Western countries have tried to promote their models in Africa, China will try to export its own. As mentioned earlier, no studies to date have proved this to be the case. Rather, Africa indeed seems to be emerging as a “development lab”, a terrain in which to experiment and progressively find new strategies for engagement. Despite Africa’s growing importance for China as a trading and geostrategic partner, the continent is still perceived as a space where it is possible to make mistakes. In the case of the media, this is resulting in greater opportunities for journalists to experiment with new styles and enjoy freedoms that would be more difficult to obtain back in China, or even in the US, where CCTV has launched another regional initiative, CCTV America, which is more burdened, however, by the ideological confrontation between the two countries.

As part of Oxford’s Programme in Comparative Media Law and Policy‘s (PCMLP’s) ongoing research on China’s role in the media and communication sector in Africa, we have proposed a framework that can encourage understanding of Chinese engagement in the African mediasphere in terms of its original contributions, and not simply as a negative of the impression left by the West. This framework breaks down China’s actions on the continent according to China’s ability to act as a partner, a prototype, and a persuader, questioning, for example, whether or not media projects sponsored by the Chinese government are facilitating the diffusion of some aspects that characterise the Chinese domestic media system, rather than assuming this will be the case.

China’s role as a partner is evident in the significant resources it provides to African countries to implement social and economic development projects, including laying down infrastructure to increase Internet and mobile access. China’s perception as a prototype is linked to its government’s demonstrated ability to balance investment in media and ICTs against containment of the risks of political instability new technologies may bring. Finally, China’s presence in Africa can be assessed according to its modality and ability to act as a persuader, as it seeks to shape national and international narratives.

So far we have employed this framework only to look at Chinese engagement in Africa, focusing in particular on Ghana, Ethiopia and Kenya, but we believe it can be applied also in other areas where China has stepped up its involvement in the ICT sector.

Ed: Has there been any explicit conflict yet between Chinese and non-Chinese news corporations vying for influence in this space? And how crowded is that space?

Iginio: The telecommunication sector in Africa is increasingly crowded, as numerous international corporations from Europe (e.g. Vodafone), India (e.g. Airtel) and indeed China (e.g. Huawei and ZTE) compete for shares of a profitable and growing market. Until recently Chinese companies have avoided competing with one another, but things are slowly changing. In Ethiopia, for example, an initial project funded by the Chinese government to upgrade the telecommunication infrastructure was entirely commissioned to the Chinese telecom giant ZTE, which is partially state-owned; ZTE has since entered into competition with its Chinese (and privately owned) rival Huawei over an extension of the earlier project. In Kenya, Huawei even decided to take ZTE to court over a project its rival won to supply the Kenyan police with a communication and surveillance system. Chinese investments in the telecommunication sectors in Africa have been part of the government’s strategy of engagement in the continent, but profit seems to have become an increasingly important factor, even where it interferes with that strategy.

Ed: How do the recipient nations regard China’s investment and influence? For example, is there any evidence that authoritarian governments are seeking to adopt aspects of China’s own system?

Iginio: China is perceived as an example mostly by those countries that are seeking to balance between investment in ICTs and containment of the risks of political instability new technologies may bring. In a Wikileaks cable reporting a meeting between Sebhat Nega, one of the Ethiopian government’s ideologues, and the then US ambassador Donald Yamamoto, for example, Sebhat was reported to have openly declared his admiration for China and stressed that Ethiopia “needs the China model to inform the Ethiopian people”.


Iginio Gagliardone is a British Academy Post-Doctoral Research Fellow at the Centre for Socio-Legal Studies, University of Oxford. His research focuses on the role of the media in political change, especially in Sub-Saharan Africa, and the adaptation of international norms of freedom of expression in authoritarian regimes. Currently, he is exploring the role of emerging powers such as China in promoting alternative conceptions of the Internet in Africa. In particular he is analysing whether and how the ideas of state stability, development and community that characterize the Chinese model are influencing and legitimizing the development of a different conception of the information society.

Iginio Gagliardone was talking to blog editor David Sutcliffe.

Uncovering the patterns and practice of censorship in Chinese news sites https://ensr.oii.ox.ac.uk/uncovering-the-patterns-and-practice-of-censorship-in-chinese-news-sites/ Thu, 08 Aug 2013 08:17:55 +0000 http://blogs.oii.ox.ac.uk/policy/?p=1992 Ed: How much work has been done on censorship of online news in China? What are the methodological challenges and important questions associated with this line of enquiry?

Sonya: Recent research is paying much attention to social media, aiming to quantify their censorial practices and to discern common patterns in them. Among these empirical studies, Bamman et al.’s (2012) work claimed to be “the first large-scale analysis of political content censorship”, investigating messages deleted from Sina Weibo, a Chinese equivalent to Twitter. On an even larger scale, King et al. (2013) collected data from nearly 1,400 Chinese social media platforms and analyzed the deleted messages. Most studies on news censorship, however, are devoted to narratives of special cases, such as the closure of Freezing Point, an outspoken news and opinion journal, and the blocking of the New York Times after it disclosed the wealth possessed by the family of former Chinese premier Wen Jiabao.

The shortage of news censorship research could be attributed to several methodological challenges. First, it is tricky to detect censorship to begin with, given the word ‘censorship’ is one of the first to be censored. Also, news websites will not simply let their readers hit a glaring “404 page not found”. Instead, they will use a “soft 404”, which returns a “success” code for a request of a deleted web page and takes readers to a (different) existing web page. While humans may be able to detect these soft 404s, it will be harder for computer programs (eg run by researchers) to do so. Moreover, because different websites employ varying soft 404 techniques, much labor is required to survey them and to incorporate the acquired knowledge into a generic monitoring tool.
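To give a sense of how such a tool might work, here is a minimal sketch of one common soft-404 heuristic in Python: fetch a URL that certainly does not exist on the site, fingerprint the page that comes back, and flag article URLs whose “successful” responses look the same. This is a generic illustration, not the monitoring tool used in the study, and the URLs are hypothetical:

```python
# Heuristic soft-404 detection: request a page that cannot exist,
# fingerprint the site's "page gone" response, and compare article
# responses against that fingerprint.
import hashlib
import uuid
import requests

def fingerprint(html: str) -> str:
    # Hash the normalised page body as a cheap content signature.
    return hashlib.sha1(html.strip().encode("utf-8")).hexdigest()

def soft_404_fingerprint(base_url: str) -> str:
    # A random path should not exist, so its response characterises
    # the site's soft-404 page even when the status code is 200.
    bogus = base_url.rstrip("/") + "/" + uuid.uuid4().hex
    return fingerprint(requests.get(bogus, timeout=10).text)

def looks_deleted(article_url: str, soft404_sig: str) -> bool:
    resp = requests.get(article_url, timeout=10)
    if resp.status_code == 404:
        return True  # an honest 404
    return fingerprint(resp.text) == soft404_sig  # a soft 404

# Hypothetical usage:
# sig = soft_404_fingerprint("http://news.example.cn")
# print(looks_deleted("http://news.example.cn/2013/article.html", sig))
```

An exact hash is brittle on pages with dynamic content, which is one reason each site’s soft-404 behaviour has to be surveyed individually, as noted above.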

Second, high computing power and bandwidth are required to handle the large volume of news publications and the slow network access to Chinese websites. For instance, NetEase alone publishes 8,000 – 10,000 news articles every day. Meanwhile, the Internet connection between the Chinese cyberspace and the outside world is fairly slow, and it takes more than a second to check one link because the Great Firewall inspects both incoming and outgoing traffic. These two factors translate to 2-3 hours for a single program to check one day’s news publications on NetEase alone. If we fire up too many programs to accelerate progress, the database system and/or the network connection may be overwhelmed. In my case, even though I am using high performance computers at Michigan State University to conduct this research, they are strained every now and then.
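One way to speed this up without flooding the target site or the database is bounded concurrency; the sketch below (a generic illustration with placeholder URLs, not the study’s actual code) caps the number of simultaneous requests with a thread pool:

```python
# Bounded-concurrency link checking: a small worker pool raises
# throughput while capping the load placed on the site being checked.
from concurrent.futures import ThreadPoolExecutor
import requests

def check(url: str) -> tuple[str, int]:
    try:
        return url, requests.get(url, timeout=10).status_code
    except requests.RequestException:
        return url, -1  # network error or blocked connection

# Placeholder URL list standing in for one day's article links.
urls = [f"http://news.example.cn/2013/{i}.html" for i in range(100)]

# At ~1 s per request, 10,000 links take ~2.8 hours serially;
# eight workers cut that to roughly 20 minutes.
with ThreadPoolExecutor(max_workers=8) as pool:
    for url, status in pool.map(check, urls):
        print(status, url)
```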

Despite all the difficulties, I believe it is of great importance to reveal censored news stories to the public, especially to the audience inside China who do not enjoy a free flow of information. Censored news is a special type of information: it is too inconvenient, in the authorities’ eyes, to be allowed to exist, and at the same time it is important to citizens’ everyday lives. For example, the outbreak of SARS was censored from Chinese media, presumably to avoid spoiling the harmonious atmosphere created for the 16th National Congress of the Communist Party; this allowed the virus to develop into a worldwide epidemic. Like SARS, a variety of censored issues are not only inconvenient but also crucial, because the authorities would not otherwise allocate substantial resources to monitoring or eliminating them if they were merely trivial. Therefore, after censored news is detected, it is vital to seek effective and efficient channels to disclose it to the public so as to counterbalance the potential damage that censorship may entail.

Ed: You found that party organs, ie news organizations tightly affiliated with the Chinese Communist Party, published a considerable amount of deleted news. Was this surprising?

Sonya: Yes, I was surprised when looking at the results the first time. To be exact, our finding is that commercial media experience a higher deletion rate, but party organs contribute the most deleted news by sheer volume, reflecting the fact that party organs possess more resources allocated by the central and local governments and therefore have the capacity to produce more news. Consequently, party organs have a higher chance of publishing controversial information that may be deleted in the future, especially when a news story becomes sensitive for some reason that is hard to foresee. For example, investigations of some government officials started when netizens recognized them in the news with different luxury watches and other expensive accessories. As such, even though party organs are obliged to write odes to the party, they may eventually backfire on the cadres if the beautiful words are discovered to be too far from reality.

Ed: How sensitive are citizens to the fact that some topics are actively avoided in the news media? And how easy is it for people to keep abreast of these topics (eg the “three Ts” of Tibet, Taiwan, and Tiananmen) from other information sources?

Sonya: This question highlights the distinction between pre-censorship and post-censorship. Our study looked at post-censorship, ie information that is published but subsequently deleted. By contrast, the topics that are “actively avoided” fall under the category of pre-censorship. I am fairly convinced that the current pre- and post-censorship practice is effective in terms of keeping the public from learning inconvenient facts and from mobilizing for collective action. If certain topics are consistently wiped from the mass media, how will citizens ever get to know about them?

The Tiananmen Square protest, for instance, has never been covered by Chinese mass media, leaving an entire generation growing up since 1989 that is ignorant of this historical event. As such, if younger Chinese citizens have never heard of the Tiananmen Square protest, how could they possibly start an inquiry into this incident? Or, if they have heard of it and attempt to learn about it from the Internet, what they will soon realize is that domestic search engines, social media, and news media all fail their requests and foreign ones are blocked. Certainly, they could use circumvention tools to bypass the Great Firewall, but the sad truth is that probably under 1% of them have ever made such an effort, according to the Harvard Berkman Center’s report in 2011.

Ed: Is censorship of domestic news (such as food scares) more geared towards “avoiding panics and maintaining social order”, or just avoiding political embarrassment? For example, do you see censorship of environmental issues and (avoidable) disasters?

Sonya: The government certainly tries to avoid political embarrassment in the case of food scares by manipulating news coverage, but it is also their priority to maintain social order or so-called “social harmony”. Exactly for this reason, Zhao Lianhai, the most outspoken parent of a toxic milk powder victim was charged with “inciting social disorder” and sentenced to two and a half years in prison. Frustrated by Chinese milk powder, Chinese tourists are aggressively stocking up on milk powder from elsewhere, such as in Hong Kong and New Zealand, causing panics over milk powder shortages in those places.

After the earthquake in Sichuan, another group of grieving parents were arrested on similar charges when they questioned why their children were buried under crumbled schools whereas older buildings remained standing. The high death toll of this earthquake was among the avoidable disasters that the government attempts to mask and force the public to forget. Environmental issues, along with land acquisition, social unrest, and labor exploitation, are other frequently censored topics in the name of “stability maintenance”.

Ed: You plotted a map to show the geographic distribution of news deletion: what does the pattern show?

Sonya: We see an apparent geographic pattern in news deletion: news about neighboring countries is more likely to be deleted than news about distant ones. Border disputes between China and its neighbors may be one cause, for example with Japan over the Diaoyu-Senkaku Islands, with the Philippines over the Huangyan Island-Scarborough Shoal, and with India over South Tibet. Another reason may be a concern over maintaining allies. Burma had the highest deletion rate among all the countries, with the deleted news mostly covering its curbs on censorship. Watching this shift, China might worry that media reform in Burma could inspire copycat attempts inside China.

On the other hand, China has given Burma diplomatic cover, considering it as a “second coast” to the Indian Ocean and importing its natural resources (Howe & Knight, 2012). For these reasons, China may be compelled to censor news about Burma more heavily than news about other countries. Nonetheless, although oceans apart, the US topped the list by sheer number of news deletions, reflecting the bittersweet relationship between the two nations.

Ed: What do you think explains the much higher levels of censorship reported by others for social media than for news media? How does geographic distribution of deletion differ between the two?

Sonya: The deletion rates of online news are apparently far lower than those of Sina Weibo posts. The overall deletion rates on NetEase and Sina Beijing were 0.05% and 0.17%, compared to 16.25% on the social media platform (Bamman et al., 2012). Several reasons may help explain this gap. First, social media confronts enduring spam that has to be cleaned up constantly, whereas it is not a problem at all for professional news aggregators. Second, self-censorship practiced by news media plays an important role, because Chinese journalists are more obliged and prepared to self-censor sensitive information, compared to ordinary Chinese citizens. Subsequently, news media rarely mention “crown prince party” or “democracy movement”, which were among the most frequently deleted terms on Sina Weibo.

Geographically, the deletion rates across China have distinct patterns on news media and social media. Regarding Sina Weibo, deletion rates increase when the messages are published near the fringe or in the west where the economy is less developed. Regarding news websites, the deletion rates rise as they approach the center and east, where the economy is better developed. In addition, the provinces surrounding Beijing also have more news deleted, meaning that political concerns are a driving force behind content control.

Ed: Can you tell if the censorship process mostly relies on searching for sensitive keywords, or on more semantic analysis of the actual content? ie can you (or the censors..) distinguish sensitive “opinions” as well as sensitive topics?

Sonya: First, highly sensitive topics will never survive pre-censorship or be published on news websites, such as the Tiananmen Square protest, although they may sneak onto social media with deliberate typos or other circumvention techniques. However, it is clear that censors use keywords to locate articles on sensitive topics. For instance, after the Fukushima earthquake in 2011, rumors spread in the Chinese cyberspace that radiation was rising from the Japanese nuclear plant and that iodine would help protect against its harmful effects; this was followed by panic-buying of iodized salt. During this period, “nuclear defense”, “iodized salt” and “radioactive iodine” – among other generally neutral terms – became politically charged overnight, and were censored in the Chinese web sphere. The taboo list of post-censorship keywords evolves continuously to handle breaking news. Beyond keywords, party organs and other online media are trying to automate sentiment analysis and discern more subtle context. People’s Daily, for instance, has been working with elite Chinese universities in this field and has already developed a generic product for other institutes to monitor “public sentiment”.
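In its simplest form, such keyword-based post-censorship amounts to matching articles against an evolving taboo list; the sketch below illustrates the idea (the terms and the matching rule are illustrative, not a reconstruction of any real censor’s system):

```python
# Toy keyword filter: flag articles mentioning any term on an
# evolving taboo list. Real systems would add word segmentation for
# Chinese text, variant spellings, and context rules.
taboo = {"nuclear defense", "iodized salt", "radioactive iodine"}

def flag(article_text: str, taboo_terms: set[str]) -> list[str]:
    """Return the taboo terms found in an article, if any."""
    text = article_text.lower()
    return sorted(t for t in taboo_terms if t in text)

# The list evolves with breaking news: neutral terms become charged.
taboo.add("panic buying")

print(flag("Shoppers report panic buying of iodized salt", taboo))
# -> ['iodized salt', 'panic buying']
```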

Another way to sort out sensitive information is to keep an eye on the most popular stories, because a popular story represents a greater “threat” to the existing political and social order. In our study, about 47% of the deleted stories were listed among the top 100 most read/discussed at some point. This indicates that the more readership a story gains, the more attention it draws from censors.

Although news websites self-censor (and therefore experience under 1% post-censorship), they are also required to monitor and “clean” comments following each news article. According to my very conservative estimate – if a censor processes 100 comments per minute and works eight hours per day – reviewing the comments on Sina Beijing from 11-16 September 2012 would have required 336 censors working full time. In fact, Charles Cao, CEO of Sina, mentioned to Forbes that at least 100 censors were “devoted to monitoring content 24 hours a day”. As new sensitive issues emerge and new circumvention techniques are developed continuously, it is an ongoing battle between the collective intelligence of Chinese netizens and the mechanical work conducted (and artificial intelligence implemented) by a small group of censors.
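The arithmetic behind that estimate is easy to make explicit; the snippet below works backwards from the stated figures (and assumes, as one plausible reading, that all 336 censors would be needed on each day of the 11-16 September window):

```python
# Back-of-the-envelope check of the censor-workload estimate above.
comments_per_minute = 100
hours_per_day = 8
per_censor_per_day = comments_per_minute * 60 * hours_per_day  # 48,000 comments

censors_required = 336
implied_daily_volume = censors_required * per_censor_per_day   # ~16.1 million

print(f"{per_censor_per_day:,} comments per censor per day")
print(f"{implied_daily_volume:,} comments per day implied")
```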

Ed: It must be a cause of considerable anxiety for journalists and editors to have their material removed. Does censorship lead to sanctions? Or is the censorship more of an annoyance that must be negotiated?

Sonya: Censorship does indeed lead to sanctions. However, I don’t think “anxiety” would be the right word to describe their feelings, because if they are really anxious they could always choose self-censorship and avoid embarrassing the authorities. Considering it is fairly easy to predict whether a news report will please or irritate officials, I believe what fulfills the whistleblowers when they disclose inconvenient facts is a strong sense of justice and tremendous audacity. Moreover, I could barely discern any “negotiation” in the process of censorship. Negotiation is at least a two-way communication, whereas censorship follows continual orders sent from the authorities to the mass media, and similarly propaganda is a one-way communication from the authorities to the masses via the media. As such, it is common to see disobedient journalists threatened or punished for “defying” censorial orders.

Southern Metropolis Daily is one of China’s most aggressive and most punished newspapers. In 2003, the newspaper broke the story of the SARS epidemic that local officials had wished to hide from the public. Soon after this report, it covered the case of a university graduate beaten to death in police custody because he carried no proper residency papers. Both cases received enormous attention from the Chinese authorities and the international community, seriously embarrassing local officials. It is alleged and widely believed that some local officials demanded harsh penalties for the Daily; the director and the deputy editor were sentenced to 11 and 12 years in jail for “taking bribes” and “misappropriating state-owned assets”, and the chief editor was dismissed.

Not only professional journalists but also (broadly defined) citizen journalists can face similar penalties. For instance, Xu Zhiyong, a lawyer who defended journalists on trial, and Ai Weiwei, an artist who tried to investigate collapsed schools after the Sichuan earthquake, have experienced similar punishments: fines for tax evasion, physical attacks, house arrest, and secret detention; exactly the same censorship tactics that states carried out before the advent of the Internet, as described in Ilan Peleg’s (1993) book Patterns of Censorship Around the World.

Ed: What do you think explains the lack of censorship in the overseas portal? (Could there be a certain value for the government in having some news items accessible to an external audience, but unavailable to the internal one?)

Sonya: It is more costly to control content by searching for and deleting individual news stories than by simply blocking a whole website. For this reason, when a website outside the Great Firewall carries content embarrassing to the Chinese government, Chinese censors will simply block the whole website rather than request deletions. Overseas branches of Chinese media may comply with such a deletion request, but foreign media may simply ignore it.

Given online users’ behavior, it is effective and efficient to strictly control domestic content. In general, there are two types of Chinese online users: those who only visit Chinese websites operating inside China, and those who also consume content from outside the country. Regarding the second type, it is really hard to prescribe what they do and don’t read, because they may be well equipped with circumvention tools and often obtain access to Chinese media published in Hong Kong and Taiwan but blocked in China. In addition, some Western media, such as the BBC, the New York Times, and Deutsche Welle, make media consumption easy for Chinese readers by publishing in Chinese. Of course, this type of Chinese user may also be well educated and able to read English and other foreign languages directly. Facing these people, the Chinese authorities would see their efforts come to nothing if they tried to censor overseas branches of Chinese media, because, outside the Great Firewall, there are too many sources of information beyond the reach of Chinese censors.

Chinese authorities are in fact strategically wise in putting their efforts into controlling domestic online media, because this first type of Chinese user accounts for 99.9% of the whole online population, according to Google’s 2010 estimate. In his 2013 book Rewire, Ethan Zuckerman summarizes this phenomenon: “none of the top ten nations [in terms of online population] looks at more than 7 percent international content in its fifty most popular news sites” (p. 56). Since the majority of the Chinese populace perceives the domestic Internet as “the entire cyberspace”, manipulating the content published inside the Great Firewall means that (according to Chinese censors) many of the time bombs will have been defused.


Read the full paper: Sonya Yan Song, Fei Shen, Mike Z. Yao, Steven S. Wildman (2013) Unmasking News in Cyberspace: Examining Censorship Patterns of News Portal Sites in China. Presented at “China and the New Internet World”, International Communication Association (ICA) Preconference, Oxford Internet Institute, University of Oxford, June 2013.

Sonya Y. Song led this study as a Google Policy Fellow in 2012. Currently, she is a Knight-Mozilla OpenNews Fellow and a Ph.D. candidate in media and information studies at Michigan State University. Sonya holds bachelor’s and master’s degrees in computer science from Tsinghua University in Beijing and a master of philosophy in journalism from the University of Hong Kong. She is also an avid photographer, a devotee of literature, and a film buff.

Sonya Yan Song was talking to blog editor David Sutcliffe.
