Can “We the People” really help draft a national constitution? (sort of..) https://ensr.oii.ox.ac.uk/can-we-the-people-really-help-draft-a-national-constitution-sort-of/ Thu, 16 Aug 2018

As innovations like social media and open government initiatives have become an integral part of politics in the twenty-first century, there is increasing interest in the possibility of citizens directly participating in the drafting of legislation. Indeed, there is a clear trend towards greater public participation in the process of constitution making, and with the growth of e-democracy tools this trend is likely to continue. However, this view is certainly not universally held, and a number of recent studies have been much more skeptical about the value of public participation, questioning whether it has any real impact on the text of a constitution.

Following the banking crisis, and a groundswell of popular opposition to the existing political system in 2009, the people of Iceland embarked on a unique process of constitutional reform. Having opened the entire drafting process to public input and scrutiny, these efforts culminated in Iceland’s 2011 draft crowdsourced constitution: reputedly the world’s first. In his Policy & Internet article “When Does Public Participation Make a Difference? Evidence From Iceland’s Crowdsourced Constitution”, Alexander Hudson examines the impact that the Icelandic public had on the development of the draft constitution. He finds that almost 10 percent of the written proposals submitted generated a change in the draft text, particularly in the area of rights.

This remarkably high number is likely explained by the isolation of the drafters from both political parties and special interests, making them more reliant on and open to input from the public. However, although this would appear to be an example of successful public crowdsourcing, the new constitution was ultimately rejected by parliament. Iceland’s experiment with participatory drafting therefore demonstrates the possibility of successful online public engagement — but also the need to connect the masses with the political elites. It was the disconnect between these groups that triggered the initial protests and constitutional reform, but also that led to its ultimate failure.

We caught up with Alexander to discuss his findings.

Ed: We know from Wikipedia (and other studies) that group decisions are better, and crowds can be trusted. However, I guess (re US, UK) I also feel increasingly nervous about the idea of “the public” having a say over anything important and binding. How do we distribute power and consultation, while avoiding populist chaos?  

Alexander: That’s a large and important question, which I can probably answer only in part. One thing we need to be careful of is what kind of public we are talking about. In many cases, we view self-selection as a bad thing — it can’t be representative. However, in cases like Wikipedia, we see self-selected individuals with specialized knowledge and an uncommon level of interest collaborating. I would suggest that there is an important difference between the kind of decisions that are made by careful and informed participants in citizens’ juries, deliberative polls, or Wikipedia editing, and the oversimplified binary choices that we make in elections or referendums.

So, while there is research to suggest that large numbers of ordinary people can make better decisions, there are some conditions in terms of prior knowledge and careful consideration attached to that. I have high hopes for these more deliberative forms of public participation, but we are right to be cautious about referendums. The Icelandic constitutional reform process actually involved several forms of public participation, including two randomly selected deliberative fora, self-selected online participation, and a popular referendum with several questions.

Ed: A constitution is a very technical piece of text: how much could non-experts realistically contribute to its development — or was there also contribution from specialised interest groups? Presumably there was a team of lawyers and drafters managing the process? 

Alexander: All of these things were going on in Iceland’s drafting process. In my research here and on a few other constitution-making processes in other countries, I’ve been impressed by the ability of citizens to engage at a high level with fundamental questions about the nature of the state, constitutional rights, and legal theory. Assuming a reasonable level of literacy, people are fully capable of reading some literature on constitutional law and political philosophy, and writing very well-informed submissions that express what they would like to see in the constitutional text. A small, self-selected set of the public in many countries seeks to engage in spirited and for the most part respectful debate on these issues. In the Icelandic case, these debates have continued from 2009 to the present.

I would also add that public interest is not distributed uniformly across all the topics that constitutions cover. Members of the public show much more interest in discussing issues of human rights, and have more success in seeing proposals on that theme included in the draft constitution. Some NGOs were involved in submitting proposals to the Icelandic Constitutional Council, but interest groups do not appear to have been a major factor in the process. Unlike some constitution-making processes, the Icelandic Constitutional Council had a limited staff, and the drafters themselves were very engaged with the public on social media.

Ed: I guess Iceland is fairly small, but also unusually homogeneous. That helps, presumably, in creating a general consensus across a society? Or will party / political leaning always tend to trump any sense of common purpose and destiny, when defining the form and identity of the nation?

Alexander: You are certainly right that Iceland is unusual in these respects, and this raises important questions of what this is a case of, and how the findings here can inform us about what might happen in other contexts. I would not say that the Icelandic people reached any sort of broad, national-level consensus about how the constitution should change. During the early part of the drafting process, it seems that those who had strong disagreements with what was taking place absented themselves from the proceedings. They did turn up later to some extent (especially after the 2012 referendum), and sought to prevent this draft from becoming law.

Where the small size and homogeneous population really came into play in Iceland is through the level of knowledge that those who participated had of one another before entering into the constitution-making process. While this has been overemphasized in some discussions of Iceland, there are communities of shared interests where people all seem to know each other, or at least know of each other. This makes forming new societies, NGOs, or interest groups easier, and probably helped to launch the constitution-making project in the first place.

Ed: How many people were involved in the process — and how were bad suggestions rejected, discussed, or improved? I imagine there must have been divisive issues, that someone would have had to arbitrate? 

Alexander: The number of people who interacted with the process in some way, either by attending one of the public forums that took place early in the process, voting in the election for the Constitutional Council, or engaging with the process on social media, is certainly in the tens of thousands. In fact, one of the striking things about this case is that 522 people stood for election to the 25-member Constitutional Council which drafted the new constitution. So there was certainly a high level of interest in participating in this process.

My research here focused on the written proposals that were posted to the Constitutional Council’s website. 204 individuals participated in that more intensive way. As the members of the Constitutional Council tell it, they would read some of the comments on social media, and the formal submissions on their website during their committee meetings, and discuss amongst themselves which ideas should be carried forward into the draft. The vast majority of the submissions were well-informed, on topic, and conveyed a collegial tone. In this case at least, there was very little of the kind of abusive participation that we observe in some online networks. 

Ed: You say that despite the success in creating a crowd-sourced constitution (that passed a public referendum), it was never ratified by parliament — why is that? And what lessons can we learn from this?

Alexander: Yes, this is one of the most interesting aspects of the whole thing for scholars, and certainly a source of some outrage for those Icelanders who are still active in trying to see this draft constitution become law. Some of this relates to the specifics of Iceland’s constitutional amendment process (which disincentivizes parliament from approving changes between elections), but I think that there are also a couple of broadly applicable things going on here. First, the constitution-making process arose as a response to the way that the Icelandic government was perceived to have failed in governing the financial system in the late 2000s. By the time a last-ditch attempt to bring the draft constitution up for a vote in parliament occurred right before the 2013 election, almost five years had passed since the crisis that began this whole saga, and the economic situation had begun to improve. So legislators were not feeling pressure to address those issues any more.

Second, since political parties were not active in the drafting process, too few members of parliament had a stake in the issue. If one of the larger parties had taken ownership of this draft constitution, we might have seen a different outcome. I think this is one of the most important lessons from this case: if the success of the project depends on action by elite political actors, they should be involved in the earlier stages of the process. For various reasons, the Icelanders chose to exclude professional politicians from the process, but that meant that the Constitutional Council had too few friends in parliament to ratify the draft.

Read the full article: Hudson, A. (2018) When Does Public Participation Make a Difference? Evidence From Iceland’s Crowdsourced Constitution. Policy & Internet 10 (2) 185-217. DOI: https://doi.org/10.1002/poi3.167

Alexander Hudson was talking to blog editor David Sutcliffe.

Do Finland’s digitally crowdsourced laws show a way to resolve democracy’s “legitimacy crisis”? https://ensr.oii.ox.ac.uk/do-finlands-digitally-crowdsourced-laws-show-a-way-to-resolve-democracys-legitimacy-crisis/ Mon, 16 Nov 2015

There is much discussion about a perceived “legitimacy crisis” in democracy. In his article “The Rise of the Mediating Citizen: Time, Space, and Citizenship in the Crowdsourcing of Finnish Legislation”, Taneli Heikka (University of Jyväskylä) discusses the digitally crowdsourced law for same-sex marriage that was passed in Finland in 2014, analysing how the campaign used new digital tools and created practices that affect democratic citizenship and power-making.

Ed: There is much discussion about a perceived “legitimacy crisis” in democracy. For example, less than half of the Finnish electorate under 40 choose to vote. In your article you argue that Finland’s 2012 Citizens’ Initiative Act aimed to address this problem by allowing for the crowdsourcing of ideas for new legislation. How common is this idea? (And indeed, how successful?)

Taneli: The idea that digital participation could counter the “legitimacy crisis” is a fairly common one. Digital utopians have nurtured that idea from the early years of the internet, and have often been disappointed. A couple of things stand out in the Finnish experiment that make it worth a closer look.

First, the digital crowdsourcing system with strong digital identification is a reliable and potentially viral campaigning tool. Most civic initiative systems I have encountered rely on manual or otherwise cumbersome, and less reliable, signature collection methods.

Second, in the Finnish model, initiatives that break the threshold of 50,000 names must be treated in the Parliament equally to an initiative from a group of MPs. This gives the initiative constitutional and political weight.

Ed: The Act led to the passage of Finland’s first equal marriage law in 2014. In this case, online platforms were created for collecting signatures as well as drafting legislation. An NGO created a well-used platform, but it subsequently had to shut it down because it couldn’t afford the electronic signature system. Crowds are great, but not a silver bullet if something as prosaic as authentication is impossible. Where should the balance lie between NGOs and centrally funded services, i.e. government?

Taneli: The crucial thing in the success of a civic initiative system is whether it gives the people real power. This question is decided by the legal framework and constitutional basis of the initiative system. So, governments have a very important role in this early stage – designing a law for truly effective citizen initiatives.

When a framework for power-making is in place, service providers will emerge. Should the providers be public, private or third sector entities? I think that is defined by local political culture and history.

In the United States, the civic technology field is heavily funded by philanthropic foundations. There is an urge to make these tools commercially viable, though no one seems to have figured out the business model. In Europe there’s less philanthropic money, and in my experience experiments are more often government funded.

Both models have their pros and cons, but I’d like to see the two continents learning more from each other. American digital civic activists tell me enviously that the radically empowering Finnish model with a government-run service for crowdsourcing for law would be impossible in the US. In Europe, civic technologists say they wish they had the big foundations that Americans have.

Ed: But realistically, how useful is the input of non-lawyers in (technical) legislation drafting? And is there a critical threshold of people necessary to draft legislation?

Taneli: I believe that input is valuable from anyone who cares to invest some time in learning an issue. That said, having lawyers in the campaign team really helps. Writing legislation is a special skill. It’s a pity that the co-creation features in Finland’s Open Ministry website were shut down due to a lack of funding. In that model, help from lawyers could have been made more accessible for all campaign teams.

In terms of numbers, I don’t think the size of the group is an issue either way. A small group of skilled and committed people can do a lot in the drafting phase.

Ed: But can the drafting process become rather burdensome for contributors, given professional legislators will likely heavily rework, or even scrap, the text?

Taneli: Professional legislators will most likely rework the draft, and that is exactly what they are supposed to do. Initiating an idea, working on a draft, and collecting support for it are just phases in a complex process that continues in the parliament after the threshold of 50,000 signatures is reached. A well-written draft will make the legislators’ job easier, but it won’t replace them.

Ed: Do you think there’s a danger that crowdsourcing legislation might just end up reflecting the societal concerns of the web-savvy – or of campaigning and lobbying groups?

Taneli: That’s certainly a risk, but so far there is little evidence of it happening. The only initiative passed so far in Finland – the Equal Marriage Act – was supported by the majority of Finns and by the majority of political parties, too. The initiative system was used to bypass a political gridlock. The handful of initiatives that have reached the 50,000 signatures threshold and entered parliamentary proceedings represent a healthy variety of issues in the fields of education, crime and punishment, and health care. Most initiatives seem to echo the viewpoint of the ‘ordinary people’ instead of lobbies or traditional political and business interest groups.

Ed: You state in your article that the real-time nature of digital crowdsourcing appeals to a generation that likes and dislikes quickly; a generation that inhabits “the space of flows”. Is this a potential source of instability or chaos? And how can this rapid turnover of attention be harnessed efficiently so as to usefully contribute to a stable and democratic society?

Taneli: The Citizens’ Initiative Act in Finland is one fairly successful model to look at in terms of balancing stability and disruptive change. It is a radical law in its potential to empower the individual and affect real power-making. But it is by no means a shortcut to ‘legislation by a digital mob’, or anything of that sort. While the digital campaigning phase can be an explosive expression of the power of the people in the ‘time and space of flows’, the elected representatives retain the final say. Passing a law is still a tedious process, and often for good reasons.

Ed: You also write about the emergence of the “mediating citizen” – what do you mean by this?

Taneli: The starting point for developing the idea of the mediating citizen is Lance Bennett’s AC/DC theory, i.e. the dichotomy of the actualising and the dutiful citizen. The dutiful citizen is the traditional form of democratic citizenship – it values voting, following the mass media, and political parties. The actualising citizen, on the other hand, finds voting and parties less appealing, and prefers more flexible and individualised forms of political action, such as ad hoc campaigns and the use of interactive technology.

I find these models accurate, but I was not able to place within this duality the emerging typologies of civic action I observed in the Finnish case. What we see is understanding of and respect for parliamentary institutions and their power, but also strong faith in one’s skills and capability to improve the system in creative, technologically savvy ways. I used the concept of the mediating citizen to describe an actor who is able to move between the previous typologies, mediating between them. In the Finnish example, creative tools were developed to feed initiatives into the traditional power-making system of the parliament.

Ed: Do you think Finland’s Citizens’ Initiative Act is a model for other governments to follow when addressing concerns about “democratic legitimacy”?

Taneli: It is an interesting model to look at. But unfortunately the ‘legitimacy crisis’ is probably too complex a problem to be solved by a single participation tool. What I’d really like to see is a wave of experimentation, both online and offline, as well as cross-border learning from each other. And is that not what happened when the representative model spread, too?

Read the full article: Heikka, T., (2015) The Rise of the Mediating Citizen: Time, Space, and Citizenship in the Crowdsourcing of Finnish Legislation. Policy and Internet 7 (3) 268–291.


Taneli Heikka is a journalist, author, entrepreneur, and PhD student based in Washington.

Taneli Heikka was talking to Blog Editor Pamina Smith.

Assessing crowdsourcing technologies to collect public opinion around an urban renovation project https://ensr.oii.ox.ac.uk/assessing-crowdsourcing-technologies-to-collect-public-opinion-around-an-urban-renovation-project/ Mon, 09 Nov 2015

Ed: Given the “crisis in democratic accountability”, methods to increase citizen participation are in demand. To this end, your team developed some interactive crowdsourcing technologies to collect public opinion around an urban renovation project in Oulu, Finland. What form did the consultation take, and how did you assess its impact?

Simo: Over the years we’ve deployed various types of interactive interfaces on a network of public displays. In this case it was basically a network of interactive screens deployed in downtown Oulu, next to where a renovation project was happening that we wanted to collect feedback about. We deployed an app on the screens that allowed people to type feedback directly on the screens (via an on-screen soft keyboard), and to submit feedback to city authorities via SMS, Twitter and email. We also had a smiley-based “rating” system there, which people could use to leave quick feedback about certain aspects of the renovation project.
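
To make the setup a little more concrete, here is a minimal sketch (in Python) of how feedback arriving over the different channels might be normalised into a single record before being passed on to the city. This is purely illustrative and not the team’s actual code: the Channel values, field names, and rating scale are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional


class Channel(Enum):
    TOUCHSCREEN = "touchscreen"  # typed on the display's on-screen soft keyboard
    SMS = "sms"
    TWITTER = "twitter"
    EMAIL = "email"


@dataclass
class FeedbackItem:
    """One piece of citizen feedback about the renovation project."""
    channel: Channel
    text: str
    smiley_rating: Optional[int] = None  # e.g. 1 (unhappy) .. 5 (happy); None if text-only
    submitted_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def digest_for_authorities(items: List[FeedbackItem]) -> List[dict]:
    """Normalise feedback from all channels into simple records, e.g. for a weekly digest."""
    return [
        {
            "channel": item.channel.value,
            "text": item.text.strip(),
            "rating": item.smiley_rating,
            "time": item.submitted_at.isoformat(),
        }
        for item in items
    ]
```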

We ourselves could not, and did not even want to, assess the impact — that’s why we did this in partnership with the city authorities. Then, together with the city folks we could better evaluate if what we were doing had any real-world value whatsoever. And, as we discuss, in the end it did!

Ed: How did you go about encouraging citizens to engage with touch screen technologies in a public space — particularly the non-digitally literate, or maybe people who are just a bit shy about participating?

Simo: Actually, the whole point was that we did not deliberately encourage them by advertising the deployment or by “forcing” anyone to use it. Quite to the contrary: we wanted to see if people would voluntarily use it and the technologies that are an integral part of the city itself. This is kind of the future vision of urban computing, anyway. The screens had been there for years already, and what we wanted to see is if people would find this type of service on their own when exploring the screens, and if they would take the opportunity to then give feedback using them. The screens hosted a variety of other applications as well: games, news, etc., so it was interesting to also gauge how appealing the idea of public civic feedback is in comparison to everything else that was being offered.

Ed: You mention that using SMS to provide citizen feedback was effective in filtering out noise, since it required a minimal payment from citizens — but it also created an initial barrier to participation. How do you increase the quality of feedback without placing citizens on an uneven playing field from the outset — particularly where technology is concerned?

Simo: Yes, SMS really worked well in lowering the amount of irrelevant commentary and complete nonsense. And it is true that SMS already introduces a cost, and even if the cost is minuscule, it’s still a cost to the citizen — and just voicing one’s opinions should of course be free. So there’s no correct answer here — if the channel is public and publicly accessible to anyone, there will be a lot of noisy input. In such cases moderation is a heavy task, and to this end we have been exploring crowdsourcing as well. We can make the community moderate itself. First, we need to identify the users who are genuinely concerned or interested in the issues being explored, and then funnel those users into moderating the discussion / output. It is a win-win situation — the people who want to get involved are empowered to moderate the commentary from others, for implicit rewards.
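
A rough sketch of the “funnel engaged users into moderation” idea Simo describes is given below. The activity thresholds, the round-robin assignment, and the field names are invented for illustration; they are not taken from the deployed system.

```python
from typing import Dict, List


def select_moderators(users: List[Dict], min_submissions: int = 5, min_active_days: int = 3) -> List[Dict]:
    """Treat sustained, repeated participation as a signal of genuine interest in the topic."""
    return [
        u for u in users
        if u["submissions"] >= min_submissions and u["active_days"] >= min_active_days
    ]


def route_for_review(comments: List[str], moderators: List[Dict], reviewers_per_comment: int = 3) -> List[Dict]:
    """Round-robin each incoming comment to a few engaged users, who vote on whether it is relevant."""
    if not moderators:
        return [{"comment": c, "reviewers": []} for c in comments]
    queue = []
    for i, comment in enumerate(comments):
        reviewers = [moderators[(i + k) % len(moderators)]["id"] for k in range(reviewers_per_comment)]
        queue.append({"comment": comment, "reviewers": reviewers})
    return queue
```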

Ed: For this experiment on citizen feedback in an urban space, your team assembled the world’s largest public display network, which was available for research purposes 24/7. In deploying this valuable research tool, how did you guarantee the privacy of the participants involved, given that some might not want to be seen submitting very negative comments? (e.g. might a form of social pressure be the cause of relatively low participation in the study?)

Simo: The display network was not built only for this experiment; we have run hundreds of experiments on it, and have written close to a hundred academic papers about them. So the overarching research focus, really, is on how we can benefit citizens using the network. Over the years we have been able to systematically study issues such as social pressure, group use, effects of the public space, or, one might say, the “stage”, etc. And yes, social pressure does have a big effect, and for this reason allowing people to participate via e.g. SMS or email helps a lot. That way the users won’t be seen sending the input directly.

Group use is another thing: in groups people don’t feel pressure from the “outside world” so much and are willing to interact with our applications (such as the one documented in this work), but, again, it affects the feedback quality. Groups don’t necessarily tell the truth as they aim for consensus, so the individual, and very important, opinions may not be heard. Ultimately, this is all just part of the game we must deal with, and the real question becomes how to minimize those negative effects that the public space introduces. The positives are clear: everyone can participate, easily, in the heart of the city, and whenever they want.

Ed: Despite the low participation, you still believe that the experimental results are valuable. What did you learn?

Simo: The question in a way already reveals the first important point: people are just not as interested in these “civic” things as they might claim in interviews and pre-studies. When we deploy a civic feedback prototype as the “only option” on a public gizmo (a display, some kind of new tech piece, etc.), people out of curiosity use it. Now, in our case, we just deploy it “as is”, as part of the city infrastructure for people to use if, and only if, they want to use it. So, the prototype competes for attention against smartphones, other applications on the displays, the cluttered city itself… everything!

When one reads many academic papers on interactive civic engagement prototypes, the claims in the discussion are often set very high: “we got this much participation in this short time”, etc., but that’s not the entire truth. Leave the thing there for months and see if it still interests people! We have done the same: deployed a prototype for three days, gotten tons of interaction, published it, and learned only afterwards that “oh, maybe we were a bit optimistic about the efficiency” when use suddenly dropped to a minimum. It’s just not that easy, and applications require frequent updates to keep users interested over the long term.

Also, the radical differences in the feedback channels were surprising, but we already talked about that a bit earlier.

Ed: Your team collaborated with local officials, which is obviously valuable (and laudable), but it can potentially impose an extra burden on academics. For example, you mention that instead of employing novel feedback formats (e.g. video, audio, images, interactive maps), your team used only text. But do you think working with public officials benefitted the project as a whole, and how?

Simo: The extra burden is a necessity if one wants to really claim authentic success in civic engagement. In our opinion, it only happens between citizens and the city, not between citizens and researchers. We do not wish to build these deployments for the sake of an academic article or two: the display infrastructure is there for citizens and the city, and if we don’t educate the authorities on how to use it then nobody will. Advertisers would be glad to take over the entire real estate there, so in a way this project is just a part of the bigger picture. Which is making the display infrastructure “useful” instead of just a gimmick to kill time with (games) or for advertising.

And yes, the burden is real, but also because of this we could document what we have learned about dealing with authorities: how it is first easy to sell these prototypes to them, but sometimes hard to get commitment, etc. And it is not just this prototype — we’ve done a number of other civic engagement projects where we have noticed the same issues mentioned in the paper as well.

Ed: You also mention that as academics and policymakers you had different notions of success: for example in terms of levels of engagement and feedback of citizens. What should academics aspiring to have a concrete impact on society keep in mind when working with policymakers?

Simo: It takes a lot of time to assess impact. Policymakers will not be able to say after only a few weeks (which is the typical length of studies in our field) if the prototype has actual value to it, or if it’s just a “nice experiment”. So, deploy your strategy / tech / anything you’re doing, write about it, and let it sit. Move on with life, and then revisit it after some months to see if anything has come out of it! Patience is key here.

Ed: Did the citizen feedback result in any changes to the urban renovation project they were being consulted on?

Simo: Not the project directly: the project naturally was planned years ahead and the blueprints were final at that point. The most remarkable finding for us (and the authorities) was that after moderating the noise out from the feedback, the remaining insight was pretty much the only feedback that they ever directly got from citizens. Finns tend to be a bit on the shy side, so people won’t just pick up the phone and call the local engineering department and speak out. Not sure if anyone does, really? So they complain and chat on forums and coffee tables. So it would require active work for the authorities to find and reach out to these people.

With the display infrastructure, which was already there, we were able to gauge the public opinion that did not affect the construction directly, but indirectly affected how the department could manage their press releases, which things to stress in public communications, what parts of PR to handle differently in the next stage of the renovation project etc.

Ed: Are you planning any more experiments?

Simo: We are constantly running quite a few experiments. On the civic engagement side, for example, we are investigating how to gamify environmental awareness (recycling, waste management, keeping the environment clean) for children, as well as running longer longitudinal studies to assess the engagement of specific groups of people (e.g., children and the elderly).

Read the full article: Hosio, S., Goncalves, J., Kostakos, V. and Riekki, J. (2015) Crowdsourcing Public Opinion Using Urban Pervasive Technologies: Lessons From Real-Life Experiments in Oulu. Policy and Internet 7 (2) 203–222.


Simo Hosio is a research scientist (Dr. Tech.) at the University of Oulu in Finland. Core topics of his research are smart city tech, crowdsourcing, wisdom of the crowd, civic engagement, and all types of “mobile stuff” in general.

Simo Hosio was talking to blog editor Pamina Smith.

Crowdsourcing ideas as an emerging form of multistakeholder participation in Internet governance https://ensr.oii.ox.ac.uk/crowdsourcing-ideas-as-an-emerging-form-of-multistakeholder-participation-in-internet-governance/ Wed, 21 Oct 2015

What are the linkages between multistakeholder governance and crowdsourcing? Both are new — trendy, if you will — approaches to governance premised on the potential of collective wisdom, bringing together diverse groups in policy-shaping processes. Their interlinkage has so far remained underexplored. Our article recently published in Policy and Internet sought to investigate this in the context of Internet governance, in order to assess the extent to which crowdsourcing represents an emerging opportunity for participation in global public policymaking.

We examined two recent Internet governance initiatives that incorporated crowdsourcing, with mixed results: the first, the ICANN Strategy Panel on Multistakeholder Innovation, received only limited support from the online community; the second, NETmundial, received a significant number of online inputs from global stakeholders, who had the opportunity to engage using a platform for political participation specifically set up for the drafting of the outcome document. The study builds on these two cases to evaluate how crowdsourcing was used as a form of public consultation aimed at bringing the online voice of the “undefined many” (as opposed to the “elected few”) into Internet governance processes.

From the two cases, it emerged that the design of the consultation processes conducted via crowdsourcing platforms is key to overcoming barriers to participation. For instance, in the NETmundial process, the ability to submit comments and participate remotely via www.netmundial.br attracted inputs from all over the world very early on, from the preparatory phase of the meeting onwards. In addition, substantial public engagement was obtained from the local community in the drafting of the outcome document, through a platform for political participation — www.participa.br — that gathered comments in Portuguese. In contrast, the outreach efforts of the ICANN Strategy Panel on Multistakeholder Innovation remained limited; the crowdsourcing platform it used gathered input (exclusively in English) from only a small group of people, too few to give online public input a significant role in the reform of ICANN’s multistakeholder processes.

Second, questions around how crowdsourcing should and could be used effectively to enhance the legitimacy of decision-making processes in Internet governance remain unanswered. A proper institutional setting that recognizes a role for online multistakeholder participation is yet to be defined; in its absence, the initiatives we examined present a set of procedural limitations. For instance, in the NETmundial case, the Executive Multistakeholder Committee, in charge of drafting an outcome document to be discussed during the meeting based on the analysis of online contributions, favoured more “mainstream” and “uncontroversial” contributions. Additionally, there were no online deliberation mechanisms in place for the different propositions put forward by a High-Level Multistakeholder Committee, which commented on the initial draft.

With regard to ICANN, online consultations have been used on a regular basis since its creation in 1998. Its target audience is the “ICANN community,” a group of stakeholders who volunteer their time and expertise to improve policy processes within the organization. Despite these efforts, initiatives such as the 2000 global election for the new At-Large Directors have revealed difficulties in reaching as broad an audience as intended. Our study discusses some of the obstacles to the implementation of this ambitious initiative, including limited information and awareness about the At-Large elections, and low Internet access and use in most developing countries, particularly in Africa and Latin America.

Third, there is a need for clear rules regarding the way in which contributions are evaluated in crowdsourcing efforts. When the deliberating body (or committee) is free to disregard inputs without providing any justification, it triggers concerns about the broader transnational governance framework in which we operate, as there is no election of those few who end up determining which parts of the contributions should be reflected in the outcome document. To avoid the agency problem arising from the lack of accountability over the incorporation of inputs, it is important that crowdsourcing attempts pay particular attention to designing a clear and comprehensive assessment process.

The “wisdom of the crowd” has traditionally been explored in developing the Internet, yet it remains a contested ground when it comes to its governance. In multistakeholder set-ups, the diversity of voices and the collection of ideas and input from as many actors as possible — via online means — represent a desideratum, rather than a reality. In our exploration of empowerment through online crowdsourcing for institutional reform, we identify three fundamental preconditions: first, the existence of sufficient community interest, able to leverage wide expertise beyond a purely technical discussion; second, the existence of procedures for the collection and screening of inputs, streamlining certain ideas considered for implementation; and third, commitment to institutionalizing the procedures, especially by clearly defining the rules according to which feedback is incorporated and circumvention is avoided.

Read the full paper: Radu, R., Zingales, N. and Calandro, E. (2015), Crowdsourcing Ideas as an Emerging Form of Multistakeholder Participation in Internet Governance. Policy & Internet, 7: 362–382. doi: 10.1002/poi3.99


Roxana Radu is a PhD candidate in International Relations at the Graduate Institute of International and Development Studies in Geneva and a fellow at the Center for Media, Data and Society, Central European University (Budapest). Her current research explores the negotiation of internet policy-making in global and regional frameworks.

Nicolo Zingales is an assistant professor at Tilburg Law School, a senior member of the Tilburg Law and Economics Center (TILEC), and a research associate of the Tilburg Institute for Law, Technology and Society (TILT). His research covers various aspects of Internet governance and regulation, including multistakeholder processes, data-driven innovation, and the role of online intermediaries.

Enrico Calandro (PhD) is a senior research fellow at Research ICT Africa, an ICT policy think-tank based in Cape Town. His academic research focuses on accessibility and affordability of ICT, broadband policy, and internet governance issues from an African perspective.

Uber and Airbnb make the rules now — but to whose benefit? https://ensr.oii.ox.ac.uk/uber-and-airbnb-make-the-rules-now-but-to-whose-benefit/ Mon, 27 Jul 2015
The “Airbnb Law” was signed by Mayor Ed Lee in October 2014 at San Francisco City Hall, legalizing short-term rentals in SF with many conditions. Image of protesters by Kevin Krejci (Flickr).

Ride-hailing app Uber is close to replacing government-licensed taxis in some cities, while Airbnb’s accommodation rental platform has become a serious competitor to government-regulated hotel markets. Many other apps and platforms are trying to do the same in other sectors of the economy. In my previous post, I argued that platforms can be viewed in social science terms as economic institutions that provide infrastructures necessary for markets to thrive. I explained how the natural selection theory of institutional change suggests that people are migrating from state institutions to these new code-based institutions because they provide a more efficient environment for doing business. In this article, I will discuss some of the problems with this theory, and outline a more nuanced theory of institutional change that suggests that platforms’ effects on society will be complex and influence different people in different ways.

Economic sociologists like Neil Fligstein have pointed out that not everyone is as free to choose the means through which they conduct their trade. For example, if buyers in a market switch to new institutions, sellers may have little choice but to follow, even if the new institutions leave them worse off than the old ones did. Even if taxi drivers don’t like Uber’s rules, they may find that there is little business to be had outside the platform, and switch anyway. In the end, the choice of institutions can boil down to power. Economists have shown that even a small group of participants with enough market power — like corporate buyers — may be able to force a whole market to tip in favour of particular institutions. Uber offers a special solution for corporate clients, though I don’t know if this has played any part in the platform’s success.

Even when everyone participates in an institutional arrangement willingly, we still can’t assume that it will contribute to the social good. Cambridge economic historian Sheilagh Ogilvie has pointed out that an institution that is efficient for everyone who participates in it can still be inefficient for society as a whole if it affects third parties. For example, when Airbnb is used to turn an ordinary flat into a hotel room, it can cause nuisance to neighbours in the form of noise, traffic, and guests unfamiliar with the local rules. The convenience and low cost of doing business through the platform is achieved in part at others’ expense. In the worst case, a platform can make society not more but less efficient — by creating a ‘free rider economy’.

In general, social scientists recognize that different people and groups in society often have conflicting interests in how economic institutions are shaped. These interests are reconciled — if they are reconciled — through political institutions. Many social scientists thus look not so much at efficiencies but at political institutions to understand why economic institutions are shaped the way they are. For example, a democratic local government in principle represents the interests of its citizens, through political institutions such as council elections and public consultations. Local governments consequently try to strike a balance between the conflicting interests of hoteliers and their neighbours, by limiting hotel business to certain zones. In contrast, Airbnb as a for-profit business must cater to the interests of its customers, the would-be hoteliers and their guests. It has no mechanism, and more importantly, no mandate, to address on an equal footing the interests of third parties like customers’ neighbours. Perhaps because of this, 74% of Airbnb’s properties are not in the main hotel districts, but in ordinary residential blocks.

That said, governments have their own challenges in producing fair and efficient economic institutions. Not least among these is the fact that government regulators are at risk of capture by incumbent market participants, or at the very least they face the innovator’s dilemma: it is easier to craft rules that benefit the incumbents than rules that provide great but uncertain benefits to future market participants. For example, cities around the world operate taxi licensing systems, where only strictly limited numbers of license owners are allowed to operate taxicabs. Whatever benefits this system offers to customers in terms of quality assurance, among its biggest beneficiaries are the license owners, and among its losers the would-be drivers who are excluded from the market. Institutional insiders and outsiders have conflicting interests, and government political institutions are often such that it is easier for governments to side with the insiders.

Against this background, platforms appear almost as radical reformers that provide market access to those whom the establishment has denied it. For example, Uber recently announced that it aims to create one million jobs for women by 2020, a bold pledge in the male-dominated transport industry, and one that would likely not be possible if it adhered to government licensing requirements, as most licenses are owned by men. Having said that, Uber’s definition of a ‘job’ is something much more precarious and entrepreneurial than the conventional definition. My point here is not to side with either Uber or the licensing system, but to show that their social implications are very different. Both possess at least some flaws as well as redeeming qualities, many of which can be traced back to their political institutions and whom they represent.

What kind of new economic institutions are platform developers creating? How efficient are they? What other consequences, including unintended ones, do they have and to whom? Whose interests are they geared to represent — capital vs. labour, consumer vs. producer, Silicon Valley vs. local business, incumbent vs. marginalized? These are the questions that policy makers, journalists, and social scientists ought to be asking at this moment of transformation in our economic institutions. Instead of being forced to choose one or the other between established institutions and platforms as they currently are, I hope that we will be able to discover ways to take what is good in both, and create infrastructure for an economy that is as fair and inclusive as it is efficient and innovative.


Vili Lehdonvirta is a Research Fellow and DPhil Programme Director at the Oxford Internet Institute, and an editor of the Policy & Internet journal. He is an economic sociologist who studies the social and economic dimensions of new information technologies around the world, with particular expertise in digital markets and crowdsourcing.

Why are citizens migrating to Uber and Airbnb, and what should governments do about it? https://ensr.oii.ox.ac.uk/why-are-citizens-migrating-to-uber-and-airbnb-and-what-should-governments-do-about-it/ Mon, 27 Jul 2015
Protest for fair taxi laws in Portland; organizers want city leaders to make ride-sharing companies play by the same rules as cabs and Town cars. Image: Aaron Parecki (Flickr).

Cars were smashed and tires burned in France last month in protests against the ride hailing app Uber. Less violent protests have also been staged against Airbnb, a platform for renting short-term accommodation. Despite the protests, neither platform shows any signs of faltering. Uber says it has a million users in France, and is available in 57 countries. Airbnb is available in over 190 countries, and boasts over a million rooms, more than hotel giants like Hilton and Marriott. Policy makers at the highest levels are starting to notice the rise of these and similar platforms. An EU Commission flagship strategy paper notes that “online platforms are playing an ever more central role in social and economic life,” while the Federal Trade Commission recently held a workshop on the topic in Washington.

Journalists and entrepreneurs have been quick to coin terms that try to capture the essence of the social and economic changes associated with online platforms: the sharing economy; the on-demand economy; the peer-to-peer economy; and so on. Each perhaps captures one aspect of the phenomenon, but doesn’t go very far in helping us make sense of all its potentials and contradictions, including why some people love it and some would like to smash it into pieces. Instead of starting from the assumption that everything we see today is new and unprecedented, what if we dug into existing social science theory to see what it has to say about economic transformation and the emergence of markets?

Economic sociologists are adamant that markets don’t just emerge by themselves: they are always based on some kind of an underlying infrastructure that allows people to find out what goods and services are on offer, agree on prices and terms, pay, and have a reasonable expectation that the other party will honour the agreement. The oldest market infrastructure is the personal social network: traders hear what’s on offer through word of mouth and trade only with those whom they personally know and trust. But personal networks alone couldn’t sustain the immense scale of trading in today’s society. Every day we do business with strangers and trust them to provide for our most basic needs. This is possible because modern society has developed institutions — things like private property, enforceable contracts, standardized weights and measures, consumer protection, and many other general and sector specific norms and facilities. By enabling and constraining everyone’s behaviours in predictable ways, institutions constitute a robust and more inclusive infrastructure for markets than personal social networks.

Modern institutions didn’t of course appear out of nowhere. Between prehistoric social networks and the contemporary institutions of the modern state, there is a long historical continuum of economic institutions, from ancient trade routes with their customs to medieval fairs with their codes of conduct to state-enforced trade laws of the early industrial era. Institutional economists led by Oliver Williamson and economic historians led by Douglass North theorized in the 1980s that economic institutions evolve towards more efficient forms through a process of natural selection. As new institutional forms become possible thanks to technological and organizational innovation, people switch to cheaper, easier, more secure, and overall more efficient institutions out of self-interest. Old and cumbersome institutions fall into disuse, and society becomes more efficient and economically prosperous as a result. Williamson and North both later received the Nobel Memorial Prize in Economic Sciences.

It is easy to frame platforms as the next step in such an evolutionary process. Even if platforms don’t replace state institutions, they can plug gaps that remain in the state-provided infrastructure. For example, enforcing a contract in court is often too expensive and unwieldy to be used to secure transactions between individual consumers. Platforms provide cheaper and easier alternatives to formal contract enforcement, in the form of reputation systems that allow participants to rate each other’s conduct and view past ratings. Thanks to this, small transactions like sharing a commute that previously only happened in personal networks can now potentially take place on a wider scale, resulting in greater resource efficiency and prosperity (the ‘sharing economy’). Platforms are not the first companies to plug holes in state-provided market infrastructure, though. Private arbitrators, recruitment agencies, and credit rating firms have been doing similar things for a long time.
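
As a toy illustration of the reputation mechanism described here (not any platform’s actual implementation), the following sketch records per-user ratings from completed transactions and exposes an aggregate score that a counterparty could inspect before agreeing to trade; the class and method names are invented for the example.

```python
from collections import defaultdict
from typing import Optional, Tuple


class ReputationLedger:
    """Toy reputation system: each completed transaction adds a 1-5 star rating,
    and participants can inspect a counterparty's average score and rating count."""

    def __init__(self) -> None:
        self._ratings = defaultdict(list)  # user_id -> list of star ratings

    def rate(self, user_id: str, stars: int) -> None:
        if not 1 <= stars <= 5:
            raise ValueError("stars must be between 1 and 5")
        self._ratings[user_id].append(stars)

    def score(self, user_id: str) -> Tuple[Optional[float], int]:
        ratings = self._ratings[user_id]
        if not ratings:
            return None, 0
        return sum(ratings) / len(ratings), len(ratings)


ledger = ReputationLedger()
ledger.rate("host_42", 5)
ledger.rate("host_42", 4)
print(ledger.score("host_42"))  # (4.5, 2)
```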

What’s arguably new about platforms, though, is that some of the most popular ones are not mere complements, but almost complete substitutes to state-provided market infrastructures. Uber provides a complete substitute to government-licensed taxi infrastructures, addressing everything from quality and discovery to trust and payment. Airbnb provides a similarly sweeping solution to short-term accommodation rental. Both platforms have been hugely successful; in San Francisco, Uber has far surpassed the city’s official taxi market in size. The sellers on these platforms are not just consumers wanting to make better use of their resources, but also firms and professionals switching over from the state infrastructure. It is as if people and companies were abandoning their national institutions and emigrating en masse to Platform Nation.

From the natural selection perspective, this move from state institutions to platforms seems easy to understand. State institutions are designed by committee and carry all kinds of historical baggage, while platforms are designed from the ground up to address their users’ needs. Government institutions are geographically fragmented, while platforms offer a seamless experience from one city, country, and language area to the other. Government offices have opening hours and queues, while platforms make use of latest technologies to provide services around the clock (the ‘on-demand economy’). Given the choice, people switch to the most efficient institutions, and society becomes more efficient as a result. The policy implications of the theory are that government shouldn’t try to stop people from using Uber and Airbnb, and that it shouldn’t try to impose its evidently less efficient norms on the platforms. Let competing platforms innovate new regulatory regimes, and let people vote with their feet; let there be a market for markets.

The natural selection theory of institutional change provides a compellingly simple way to explain the rise of platforms. However, it has difficulty in explaining some important facts, like why economic institutions have historically developed differently in different places around the world, and why some people now protest vehemently against supposedly better institutions. Indeed, over the years since the theory was first introduced, social scientists have discovered significant problems in it. Economic sociologists like Neil Fligstein have noted that not everyone is as free to choose the institutions that they use. Economic historian Sheilagh Ogilvie has pointed out that even institutions that are efficient for those who participate in them can still sometimes be inefficient for society as a whole. These points suggest a different theory of institutional change, which I will apply to online platforms in my next post.


Vili Lehdonvirta is a Research Fellow and DPhil Programme Director at the Oxford Internet Institute, and an editor of the Policy & Internet journal. He is an economic sociologist who studies the social and economic dimensions of new information technologies around the world, with particular expertise in digital markets and crowdsourcing.

How big data is breathing new life into the smart cities concept https://ensr.oii.ox.ac.uk/how-big-data-is-breathing-new-life-into-the-smart-cities-concept/ Thu, 23 Jul 2015

“Big data” is a growing area of interest for public policy makers: for example, it was highlighted in UK Chancellor George Osborne’s recent budget speech as a major means of improving efficiency in public service delivery. While big data can apply to government at every level, the majority of innovation is currently being driven by local government, especially cities, who perhaps have greater flexibility and room to experiment and who are constantly on a drive to improve service delivery without increasing budgets.

Work on big data for cities is increasingly incorporated under the rubric of “smart cities”. The smart city is an old(ish) idea: give urban policymakers real time information on a whole variety of indicators about their city (from traffic and pollution to park usage and waste bin collection) and they will be able to improve decision making and optimise service delivery. But the initial vision, which mostly centred around adding sensors and RFID tags to objects around the city so that they would be able to communicate, has thus far remained unrealised (big up front investment needs and the requirements of IPv6 are perhaps the most obvious reasons for this).

The rise of big data – large, heterogeneous datasets generated by the increasing digitisation of social life – has however breathed new life into the smart cities concept. If all the cars have GPS devices, all the people have mobile phones, and all opinions are expressed on social media, then do we really need the city to be smart at all? Instead, policymakers can simply extract what they need from a sea of data which is already around them. And indeed, data from mobile phone operators has already been used for traffic optimisation, Oyster card data has been used to plan London Underground service interruptions, sewage data has been used to estimate population levels … the examples go on.

However, at the moment these examples remain largely anecdotal, driven forward by a few cities rather than adopted worldwide. The big data driven smart city faces considerable challenges if it is to become a default means of policymaking rather than a conversation piece. Getting access to the right data; correcting for biases and inaccuracies (not everyone has a GPS, phone, or expresses themselves on social media); and communicating it all to executives remain key concerns. Furthermore, especially in a context of tight budgets, most local governments cannot afford to experiment with new techniques which may not pay off instantly.

This is the context of two current OII projects in the smart cities field: UrbanData2Decide (2014-2016) and NEXUS (2015-2017). UrbanData2Decide joins together a consortium of European universities, each working with a local city partner, to explore how local government problems can be resolved with urban generated data. In Oxford, we are looking at how open mapping data can be used to estimate alcohol availability; how website analytics can be used to estimate service disruption; and how internal administrative data and social media data can be used to estimate population levels. The best concepts will be built into an application that allows decision makers to access them in real time.
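
As a hedged sketch of what “using open mapping data to estimate alcohol availability” could look like in practice, the snippet below counts alcohol-related points of interest in OpenStreetMap via the public Overpass API. The tag selection, the “Oxford” boundary lookup, and indeed the whole approach are illustrative assumptions rather than the project’s actual method.

```python
import requests

OVERPASS_URL = "https://overpass-api.de/api/interpreter"

# Count OpenStreetMap nodes tagged as outlets that sell or serve alcohol inside
# the administrative boundary named "Oxford" (tag choices are illustrative only).
QUERY = """
[out:json][timeout:60];
area["name"="Oxford"]["boundary"="administrative"]->.searchArea;
(
  node["shop"="alcohol"](area.searchArea);
  node["amenity"~"^(bar|pub|nightclub)$"](area.searchArea);
);
out count;
"""

response = requests.post(OVERPASS_URL, data={"data": QUERY}, timeout=90)
response.raise_for_status()
count_element = response.json()["elements"][0]
print("Alcohol-related outlets found:", count_element["tags"]["total"])
```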

NEXUS builds on this work. A collaborative partnership with BT, it will look at how social media data and some internal BT data can be used to estimate people movement and traffic patterns around the city, joining these data into network visualisations that are then presented to policymakers through a dedicated visualisation application. Both projects fill an important gap by allowing city officials to experiment with data-driven solutions, providing proofs of concept and showing what works and what doesn’t. Increasing academic-government partnerships in this way has real potential to drive forward the field and turn the smart city vision into a reality.


OII Research Fellow Jonathan Bright is a political scientist specialising in computational and ‘big data’ approaches to the social sciences. His major interest concerns studying how people get information about the political process, and how this is changing in the internet era.

After dinner: the best time to create 1.5 million dollars of ground-breaking science https://ensr.oii.ox.ac.uk/after-dinner-the-best-time-to-create-1-5-million-dollars-of-ground-breaking-science/ Fri, 24 Apr 2015 11:34:28 +0000 http://blogs.oii.ox.ac.uk/policy/?p=3228
Count this! In celebration of the International Year of Astronomy 2009, NASA’s Great Observatories — the Hubble Space Telescope, the Spitzer Space Telescope, and the Chandra X-ray Observatory — collaborated to produce this image of the central region of our Milky Way galaxy. Image: NASA Marshall Space Flight Center
Since it first launched as a single project called Galaxy Zoo in 2007, the Zooniverse has grown into the world’s largest citizen science platform, with more than 25 science projects and over 1 million registered volunteer citizen scientists. While initially focused on astronomy projects, such as those exploring the surfaces of the moon and the planet Mars, the platform now offers volunteers the opportunity to read and transcribe old ship logs and war diaries, identify animals in camera-trap photos, track penguins, listen to whales communicating and map kelp from space.

These projects are examples of citizen science: collaborative research undertaken by professional scientists and members of the public. Through these projects, individuals who are not necessarily knowledgeable about or familiar with science can become active participants in knowledge creation (as in the examples listed in the Chicago Tribune article “Want to aid science? You can Zooniverse”).

Although science-public collaborative efforts have long existed, the Zooniverse is a prominent example of a citizen science platform that has enjoyed particularly widespread popularity and traction online. In addition to making science more open and accessible, online citizen science accelerates research by leveraging human and computing resources, tapping into rare and diverse pools of expertise, providing informal scientific education and training, motivating individuals to learn more about science, and making science fun and part of everyday life.

While online citizen science is a relatively recent phenomenon, it has attracted considerable academic attention. Various studies have examined user behaviour and motivation, and the benefits and implications of different projects for their participants. For instance, Sauermann and Franzoni’s analysis of seven Zooniverse projects (Solar Stormwatch, Galaxy Zoo Supernovae, Galaxy Zoo Hubble, Moon Zoo, Old Weather, The Milky Way Project, and Planet Hunters) found that 60 percent of volunteers never return to a project after their first contribution session. By comparing contributions to these projects with those of research assistants and Amazon Mechanical Turk workers, they also calculated that these voluntary efforts amounted to the equivalent of $1.5 million in human resource costs.

Our own project on the taxonomy and ecology of contributions to the Zooniverse examines the geographical, gendered and temporal patterns of contributions and contributors to 17 Zooniverse projects between 2009 and 2013. Our preliminary results show that:

  • The geographical distribution of volunteers and contributions is highly uneven, with the UK and US contributing the bulk of both. Quantitative analysis of 130 countries shows that of three factors – population, GDP per capita and number of Internet users – the number of Internet users is most strongly correlated with the number of volunteers and the number of contributions. However, when population is controlled for, GDP per capita shows a greater correlation with the numbers of users and volunteers. The correlations are positive, suggesting that wealthier (or more developed) countries are more likely to be involved in citizen science projects.
The Global distribution of contributions to the projects within our dataset of 35 million records. The number of contributions of each country is normalized to the population of the country.
  • Female volunteers are underrepresented in most countries. Very few countries have gender parity in participation, and in many, women make up less than one-third of the volunteers whose gender is known. The female participation rate in the UK and Australia, for instance, is 25 per cent, while the figures for the US, Canada and Germany are between 27 and 30 per cent. These figures are notable when compared with the percentage of academic jobs in the sciences held by women. In the UK, women make up only 30.3 per cent of full-time researchers in Science, Technology, Engineering and Mathematics (STEM) departments (UKRC report, 2010), and 24 per cent in the United States (US Department of Commerce report, 2011).
  • Our analysis of user preferences and activity shows that, in general, there is a strong subject preference among users, with two main clusters evident among those who participate in more than one project. One cluster revolves around astrophysics projects: volunteers in these projects are more likely to take part in other astrophysics projects, and when one project ends, they are more likely to start a new project within this cluster. Similarly, volunteers in the other cluster, which is concentrated around life and Earth science projects, have a higher likelihood of being involved in other life and Earth science projects than in astrophysics projects. There is less cross-project involvement between the two main clusters (a minimal sketch of this kind of contributor-overlap clustering appears after this list).
Dendrogram showing the overlap of contributors between projects. The scale indicates the similarity between the pools of contributors to pairs of projects. Astrophysics (blue) and Life-Earth Science (green and brown) projects create distinct clusters. Old Weather 1 and WhaleFM are exceptions to this pattern, and Old Weather 1 has the most distinct pool of contributors.
  • In addition to a tendency for cross-project activity to be contained within the same clusters, there is also a gendered pattern of engagement in various projects. Women make up more than half of the gender-identified volunteers in life science projects (Snapshot Serengeti, Notes from Nature and WhaleFM each have more than 50 per cent women contributors). In contrast, the proportion of women is lowest in astrophysics projects (Galaxy Zoo Supernovae and Planet Hunters have less than 20 per cent female contributors). These patterns suggest that science subjects in general are gendered, a finding consistent with figures from the US National Science Foundation (2014). According to the NSF report, relatively few women work in engineering (13 per cent) and in the computer and mathematical sciences (25 per cent), but women are well represented in the social sciences (58 per cent) and the biological and medical sciences (48 per cent).
  • For the 20 most active countries (led by the UK, US and Canada), the most productive hours in terms of user contributions are between 8pm and 10pm. This suggests that citizen science is an after-dinner activity (presumably, reflecting when most people have free time before bed). This general pattern corresponds with the idea that many types of online peer-production activities, such as citizen science, are driven by ‘cognitive surplus’, that is, the aggregation of free time spent on collective pursuits (Shirky, 2010).
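The contributor-overlap clustering behind the dendrogram above can be reproduced in outline with standard tools. The sketch below is illustrative rather than our actual pipeline: it assumes we already hold the set of contributor IDs for each project (the contributor sets shown are invented stand-ins, although the project names are real Zooniverse projects), computes pairwise Jaccard distances between contributor pools, and builds an average-linkage dendrogram.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform

# Invented stand-in data: sets of contributor IDs per project.
# In the real analysis these come from Zooniverse classification logs.
contributors = {
    "Galaxy Zoo Hubble":  {1, 2, 3, 4, 5, 6},
    "Planet Hunters":     {2, 3, 4, 5, 7},
    "Moon Zoo":           {1, 3, 5, 6, 8},
    "Snapshot Serengeti": {9, 10, 11, 12},
    "Old Weather":        {10, 11, 13, 14},
}
projects = list(contributors)

def jaccard_distance(a, b):
    """Dissimilarity of two contributor pools: 1 - |A intersect B| / |A union B|."""
    return 1.0 - len(a & b) / len(a | b)

# Pairwise distance matrix between projects, then hierarchical clustering.
n = len(projects)
dist = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        d = jaccard_distance(contributors[projects[i]], contributors[projects[j]])
        dist[i, j] = dist[j, i] = d

tree = linkage(squareform(dist), method="average")

# Inspect the dendrogram structure without plotting it.
dn = dendrogram(tree, labels=projects, no_plot=True)
print("Leaf order in the dendrogram:", dn["ivl"])
```

With the real contributor data, this is the kind of procedure that separates the astrophysics projects from the life and Earth science projects in the way the figure above shows.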

These are just some of the results of our study, which has found that despite being informal and relatively more open and accessible, online citizen science exhibits geographical and gendered patterns of knowledge production similar to those of professional, institutional science. In other ways, citizen science is different: unlike institutional science, the bulk of citizen science activity happens late in the day, after the workday has ended and people are winding down after dinner and before bed.

We will continue our investigations into the patterns of activity in citizen science and the behaviour of citizen scientists, in order to help make science more accessible in general and to tap into the resources of the public for scientific knowledge production. Upcoming projects on the Zooniverse are expected to be more diversified, including topics from the humanities and social sciences, and we aim to examine what this wider range of projects means for the user base (in terms of age, gender and geographical coverage) and for user behaviour.

References

Sauermann, H., & Franzoni, C. (2015). Crowd science user contribution patterns and their implications. Proceedings of the National Academy of Sciences, 112(3), 679-684.

Shirky, C. (2010). Cognitive surplus: Creativity and generosity in a connected age. Penguin: London.


Taha Yasseri is the Research Fellow in Computational Social Science at the OII. Prior to coming to the OII, he spent two years as a Postdoctoral Researcher at the Budapest University of Technology and Economics, working on the socio-physical aspects of the community of Wikipedia editors, focusing on conflict and editorial wars, along with Big Data analysis to understand human dynamics, language complexity, and popularity spread. He has interests in analysis of Big Data to understand human dynamics, government-society interactions, mass collaboration, and opinion dynamics.

Will digital innovation disintermediate banking — and can regulatory frameworks keep up? https://ensr.oii.ox.ac.uk/will-digital-innovation-disintermediate-banking-and-can-regulatory-frameworks-keep-up/ Thu, 19 Feb 2015 12:11:45 +0000 http://blogs.oii.ox.ac.uk/policy/?p=3114
Many of Europe’s economies are hampered by a waning number of innovations, partially attributable to the European financial system’s aversion to funding innovative enterprises and initiatives. Image by MPD01605.
Innovation doesn’t just fall from the sky. It’s not distributed proportionately or randomly around the world or within countries, or found disproportionately where there is the least regulation, or in exact linear correlation with the percentage of GDP spent on R&D. Innovation arises in cities and countries, and perhaps most importantly of all, in the greatest proportion in ecosystems or clusters. Many of Europe’s economies are hampered by a waning number of innovations, partially attributable to the European financial system’s aversion to funding innovative enterprises and initiatives. Specifically, Europe’s innovation finance ecosystem lacks the necessary scale, plurality, and appetite for risk to drive investments in long-term initiatives aiming to produce a disruptive new technology. Such long-term investments are taking place more in the rising economies of Asia than in Europe.

While these problems could be addressed by new approaches and technologies for financing dynamism in Europe’s economies, financing of (potentially risky) innovation could also be held back by financial regulation that focuses on stability, avoiding forum shopping (i.e., looking for the most permissive regulatory environment), and preventing fraud, to the exclusion of other interests, particularly innovation and renewal. But the role of finance in enabling the development and implementation of new ideas is vital — an economy’s dynamism depends on innovative competitors challenging, and if successful, replacing complacent players in the markets.

However, newcomers obviously need capital to grow. As a reaction to the markets having priced risk too low before the financial crisis, risk is now being priced too high in Europe, starving innovation efforts of private financing at a time when much public funding has suffered from austerity measures. Of course, complementary (non-bank) sources of finance can also help fund entrepreneurship, and without that financial fuel, the engine of the new technology economy will likely stall.

The Internet has made it possible to fund innovation in new ways like crowdfunding — an innovation in finance itself — and there is no reason to think that financial institutions should be immune to disruptive innovation produced by new entrants that offer completely novel ways of saving, insuring, loaning, transferring and investing money. New approaches such as crowdfunding and other financial technology (aka “FinTech”) initiatives could provide depth and a plurality of perspectives, in order to foster innovation in financial services and in the European economy as a whole.

The time has come to integrate these financial technologies into the overall financial frameworks in a manner that does not neuter their creativity, or lower their potential to revitalize the economy. There are potential synergies with macro-prudential policies focused on mitigating systemic risk and ensuring the stability of financial systems. These platforms have great potential for cross-border lending and investment and could help to remedy the retreat of bank capital behind national borders since the financial crisis. It is time for a new perspective grounded in an “innovation-friendly” philosophy and regulatory approach to emerge.

Crowdfunding is a newcomer to the financial industry, and as such, actions (such as complex and burdensome regulatory frameworks or high levels of guaranteed compensation for losses) that could close it down or raise high barriers of entry should be avoided. Competition in the interests of the consumer and of entrepreneurs looking for funding should be encouraged. Regulators should be ready to step in if abuses do, or threaten to, arise while leaving space for new ideas around crowdfunding to gain traction rapidly, without being overburdened by regulatory requirements at an early stage.

The interests of both “financing innovation” and “innovation in the financial sector” also coincide in the FinTech entrepreneurial community. Schumpeter wrote in 1942: “[the] process of Creative Destruction is the essential fact about capitalism. It is what capitalism consists in and what every capitalist concern has got to live in.” In keeping with this theme of Schumpeterian creative destruction, the financial sector is seen by banking sector analysts and commentators as particularly ripe for disruptive innovation, given its current profits and lax competition. Technology-driven disintermediation of many financial services is on the cards: for example, in financial advice, lending, investing, trading, virtual currencies and risk management.

The UK Financial Conduct Authority’s regulatory dialogues with FinTech developers to provide legal clarity on the status of their new initiatives are an example of good practice, as regulation in this highly monitored sector is potentially a serious barrier to entry and new innovation. The FCA also proactively addresses enabling innovation with Project Innovate, an initiative to assist both start-ups and established businesses in implementing innovative ideas in the financial services markets through an Incubator and Innovation Hub.

By its nature, FinTech is a sector that can both benefit from and contribute to the EU’s Digital Single Market, and it could make Europe a sectoral global leader in this field. In evaluating possible future FinTech regulation, we need to ensure an optimal regulatory framework and specific rules. The innovation principle I discuss in my article should be part of an approach ensuring not only that regulation is clear and proportional — so that innovators can easily comply — but also that we are ready, when justified, to adapt regulation to enable innovations. Furthermore, any regulatory approaches should be “future proofed” and should not lock in today’s existing technologies, business models or processes.

Read the full article: Zilgalvis, P. (2014) The Need for an Innovation Principle in Regulatory Impact Assessment: The Case of Finance and Innovation in Europe. Policy and Internet 6 (4) 377–392.


Pēteris Zilgalvis, J.D. is a Senior Member of St Antony’s College, University of Oxford, and an Associate of its Political Economy of Financial Markets Programme. In 2013-14 he was a Senior EU Fellow at St Antony’s. He is also currently Head of Unit for eHealth and Well Being, DG CONNECT, European Commission.

Finnish decision to allow same-sex marriage “shows the power of citizen initiatives” https://ensr.oii.ox.ac.uk/finnish-decision-to-allow-same-sex-marriage-shows-the-power-of-citizen-initiatives/ Fri, 28 Nov 2014 13:45:04 +0000 http://blogs.oii.ox.ac.uk/policy/?p=3024
November rainbows in front of the Finnish parliament house in Helsinki, one hour before the vote for same-sex marriage. Photo by Anni Sairio.

In a pivotal vote today, the Finnish parliament voted in favour of removing references to gender in the country’s marriage law, which will make it possible for same-sex couples to get married. It was predicted to be an extremely close vote, but in the end gender neutrality won with 105 votes to 92. Same-sex couples have been able to enter into registered partnerships in Finland since 2002, but this form of union lacks some of the legal and more notably symbolic privileges of marriage. Today’s decision is thus a historic milestone in the progress towards tolerance and equality before the law for all the people of Finland.

Today’s parliamentary decision is also a milestone for another reason: it is the first piece of “crowdsourced” legislation on its way to becoming law in Finland. A 2012 constitutional change made it possible for 50,000 citizens or more to propose a bill to the parliament, through a mechanism known as the citizen initiative. Citizens can develop bills on a website maintained by the Open Ministry, a government-supported citizen association. The Open Ministry aims to be the deliberative version of government ministries that do the background work for government bills. Once the text of a citizen bill is finalised, citizens can also endorse it on a website maintained by the Ministry of Justice. If a bill attracts more than 50,000 endorsements within six months, it is delivered to the parliament.

A significant reason behind the creation of the citizen initiative system was to increase citizen involvement in decision making and thus enhance the legitimacy of Finland’s political system: to make people feel that they can make a difference. Finland, like most Western democracies, is suffering from dwindling voter turnout rates (though in the last parliamentary elections, domestic voter turnout was a healthy 70.5 percent). However, here lies one of the potential pitfalls of the citizen initiative system. Of the six citizen bills delivered to the parliament so far, most have been rejected outright by parliamentarians. According to research presented by Christensen and his colleagues at our Internet, Politics & Policy conference in Oxford in September (and to be published in issue 7:1 of Policy and Internet, March 2015), there is a risk that the citizen initiative system ends up having an effect opposite to the one intended:

“[T]hose who supported [a crowdsourced bill rejected by the parliament] experienced a drop in political trust as a result of not achieving this outcome. This shows that political legitimacy may well decline when participants do not get the intended result (cf. Budge, 2012). Hence, if crowdsourcing legislation in Finland is to have a positive impact on political legitimacy, it is important that it can help produce popular Citizens’ initiatives that are subsequently adopted by Parliament.”

One reason why citizen initiatives have faced a rough time in the parliament is that they are a somewhat odd addition to the parliament’s existing ways of working. The Finnish parliament, like most parliaments in representative democracies, is used to working in a government-opposition arrangement, where the government proposes bills, and parliamentarians belonging to government parties are expected to support those bills and resist bills originating from the opposition. Conversely, opposition leaders expect their members to be loyal to their own initiatives. In this arrangement, citizen initiatives have fallen into a no-man’s land, endorsed by neither government nor opposition members. Thanks to the party whip system, their only hope of passing has been to be adopted by the government. But the whole point of citizen initiatives is to allow bills not proposed by the government to reach parliament, so relying on government adoption makes the exercise rather pointless.

The marriage equality citizen initiative was able to break this pattern not only because it enjoyed immense popular support, but also because many parliamentarians saw marriage equality as a matter of conscience, where the party whip system wouldn’t apply. Parliamentarians across party lines voted in support and against the initiative, in many cases ignoring their party leaders’ instructions.

Prime Minister Alexander Stubb commented immediately after the vote that the outcome “shows the power of citizen initiatives”, “citizen democracy and direct democracy”. Now that a precedent has been set, it is possible that subsequent citizen initiatives, too, get judged more on their merits than on who proposed them. Today’s decision on marriage equality may thus turn out to be historic not only for advancing equality and fairness, but also for helping to define crowdsourcing’s role in Finnish parliamentary decision making.


Vili Lehdonvirta is a Research Fellow and DPhil Programme Director at the Oxford Internet Institute, and an editor of the Policy & Internet journal. He is an economic sociologist who studies the social and economic dimensions of new information technologies around the world, with particular expertise in digital markets and crowdsourcing.

Investigating virtual production networks in Sub-Saharan Africa and Southeast Asia https://ensr.oii.ox.ac.uk/investigating-virtual-production-networks-in-sub-saharan-africa-southeast-asia/ Mon, 03 Nov 2014 14:19:04 +0000 http://blogs.oii.ox.ac.uk/policy/?p=2969 Ed: You are looking at the structures of ‘virtual production networks’ to understand the economic and social implications of online work. How are you doing this?

Mark: We are studying online freelancing. In other words this is digital or digitised work for which professional certification or formal training is usually not required. The work is monetised or monetisable, and can be mediated through an online marketplace.

Freelancing is a very old format of work. What is new is the fact that we have almost three billion people connected to a global network: many of those people are potential workers in virtual production networks. This mass connectivity has been one crucial ingredient for some significant changes in how work is organised, divided, outsourced, and rewarded. What we plan to do in this project is better map the contours of some of those changes and understand who wins and who doesn’t in this new world of work.

Ed: Are you able to define what comprises an individual contribution to a ‘virtual production network’ — or to find data on it? How do you define and measure value within these global flows and exchanges?

Mark: It is very far from easy. Much of what we are studying is immaterial and digitally-mediated work. We can find workers and we can find clients, but the links between them are often opaque and black-boxed. Some of the workers that we have spoken to operate under non-disclosure agreements, and many actually haven’t been told what their work is being used for.

But that is precisely why we felt the need to embark on this project. With a combination of quantitative transaction data from key platforms and qualitative interviews in which we attempt to piece together parts of the network, we want to understand who is (and isn’t) able to capture and create value within these networks.

Ed: You note that “within virtual production networks, are we seeing a shift in the boundaries of firms” — to what extent do you think we are seeing the emergence of new forms of organisation?

Mark: There has always been a certain spatial stickiness to some activities carried out by firms (or within firms). Some activities required the complex exchanges of knowledge that were difficult to digitally mediate. But digitisation and better connectivity in low-wage countries has now allowed many formerly ‘in-house’ business processes to be outsourced to third-parties. In an age of cloud computing, cheap connectivity, and easily accessible collaboration tools, geography has become less sticky. One task that we are engaged in is looking at the ways that some kinds of tacit knowledge that are difficult to transmit digitally offer some people and firms (in different places) competitive advantages and disadvantages.

This proliferation of digitally mediated work could also be seen as a new form of organisation. The organisations that control key work marketplaces (like oDesk) make decisions that shape both who buyers and sellers are able to connect with, and the ways in which they are able to transact.

Ed: Does ‘virtual work’ add social or economic value to individuals in low-income countries? ie are we really dealing with a disintermediated, level surface on a global playing field, or just a different form of old exploitation (ie a virtual rather than physical extraction industry)?

Mark: That is what we aim to find out. Many have pointed to the potentials of online freelancing to create jobs and bring income to workers in low-income countries. But many others have argued that such practices are creating ‘digital sweatshops’ and facilitating a race to the bottom.

We undoubtedly are not seeing a purely disintermediated market, or a global playing field. But what we want to understand is who exactly benefits from these new networks of work, and how.

Ed: Will you be doing any network analysis of the data you collect, ie of actual value-flows? And will they be geolocated networks?

Mark: Yes! I am actually preparing a post that contains a geographic network of all work conducted over the course of a month via oDesk (see the website of the OII’s Connectivity, Inclusion, and Inequality Group for more..).
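As a rough illustration of what a geolocated value-flow network involves, the sketch below aggregates country-to-country contract values into a weighted directed graph. The transaction records are invented for illustration (the real analysis uses anonymised oDesk transaction data), and the column layout is an assumption rather than the actual data format.

```python
import networkx as nx

# Invented illustrative records: (buyer_country, worker_country, contract_value_usd).
transactions = [
    ("US", "PH", 120.0),
    ("US", "IN", 300.0),
    ("GB", "PH", 80.0),
    ("AU", "BD", 45.0),
    ("US", "PH", 60.0),
]

# Aggregate value flows between countries into a weighted directed graph.
G = nx.DiGraph()
for buyer, worker, value in transactions:
    if G.has_edge(buyer, worker):
        G[buyer][worker]["weight"] += value
    else:
        G.add_edge(buyer, worker, weight=value)

# Which countries capture the most value as workers? (weighted in-degree)
captured = dict(G.in_degree(weight="weight"))
for country, total in sorted(captured.items(), key=lambda kv: -kv[1]):
    print(f"{country}: ${total:.2f} received")
```

Attaching coordinates to each country node then lets the same graph be drawn on a map, which is essentially what a geolocated network visualisation of these flows amounts to.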

Mark Graham was talking to blog editor David Sutcliffe.


Mark Graham is a Senior Research Fellow at the OII. His research focuses on Internet and information geographies, and the overlaps between ICTs and economic development.

What explains the worldwide patterns in user-generated geographical content? https://ensr.oii.ox.ac.uk/what-explains-the-worldwide-patterns-in-user-generated-geographical-content/ Mon, 08 Sep 2014 07:20:05 +0000 http://blogs.oii.ox.ac.uk/policy/?p=2908 The geographies of codified knowledge have always been uneven, affording some people and places greater voice and visibility than others. While the rise of the geosocial Web seemed to promise a greater diversity of voices, opinions, and narratives about places, many regions remain largely absent from the websites and services that represent them to the rest of the world. These highly uneven geographies of codified information matter because they shape what is known and what can be known. As geographic content and geospatial information becomes increasingly integral to our everyday lives, places that are left off the ‘map of knowledge’ will be absent from our understanding of, and interaction with, the world.

We know that Wikipedia is important to the construction of geographical imaginations of place, and that it has immense power to augment our spatial understandings and interactions (Graham et al. 2013). In other words, the presences and absences in Wikipedia matter. If a person’s primary free source of information about the world is the Persian or Arabic or Hebrew Wikipedia, then the world will look fundamentally different from the world presented through the lens of the English Wikipedia. The capacity to represent oneself to outsiders is especially important in those parts of the world that are characterized by highly uneven power relationships: Brunn and Wilson (2013) and Graham and Zook (2013) have already demonstrated how geospatial content can reinforce existing power relations in a South African township and in Jerusalem, respectively.

Until now, there has been no large-scale empirical analysis of the factors that explain information geographies at the global scale; this is something we have aimed to address in this research project on Mapping and measuring local knowledge production and representation in the Middle East and North Africa. Using regression models of geolocated Wikipedia data we have identified what are likely to be the necessary conditions for representation at the country level, and have also identified the outliers, i.e. those countries that fare considerably better or worse than expected. We found that a large part of the variation could be explained by just three factors: namely, (1) country population, (2) availability of broadband Internet, and (3) the number of edits originating in that country. [See the full paper for an explanation of the data and the regression models.]
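As a rough illustration of the modelling approach (not the paper's actual specification, variables or data), a log-log regression of geotagged article counts on these three factors might look like the sketch below; the country-level data frame is synthetic and the variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic country-level data standing in for the real dataset:
# population, broadband connections, locally originating edits, article counts.
rng = np.random.default_rng(0)
n = 130
df = pd.DataFrame({
    "population":  rng.lognormal(16, 1.5, n),
    "broadband":   rng.lognormal(13, 2.0, n),
    "local_edits": rng.lognormal(8, 2.0, n),
})
# An invented outcome loosely driven by the three predictors, for demonstration only.
df["articles"] = np.exp(
    0.3 * np.log(df["population"])
    + 0.4 * np.log(df["broadband"])
    + 0.5 * np.log(df["local_edits"])
    + rng.normal(0, 0.5, n)
)

# Log-log OLS: each coefficient reads as an elasticity of article counts
# with respect to that factor.
model = smf.ols(
    "np.log(articles) ~ np.log(population) + np.log(broadband) + np.log(local_edits)",
    data=df,
).fit()
print(model.summary())
```

Countries with large residuals in a model like this are the outliers: the cases that fare considerably better or worse than their population, connectivity and editing activity would predict.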

But how do we explain the significant inequalities in the geography of user-generated information that remain after adjusting for differing conditions using our regression model? While these three variables help to explain the sparse amount of content written about much of Sub-Saharan Africa, most of the Middle East and North Africa have quantities of geographic information below their expected values. For example, despite high levels of wealth and connectivity, Qatar and the United Arab Emirates have far fewer articles than we might expect from the model.

These three factors independently matter, but they will also be subject to a number of constraints. A country’s population will probably affect the number of human sites, activities, and practices of interest; ie the number of things one might want to write about. The size of the potential audience might also be influential, encouraging editors in more densely populated regions and those writing in major languages. However, societal attitudes towards learning and information sharing will probably also affect the propensity of people in some places to contribute content. Factors discouraging the number of edits to local content might include a lack of local Wikimedia chapters, the attractiveness of writing content about other (better-represented) places, or contentious disputes in local editing communities that divert time into edit wars and away from content generation.

We might also be seeing a principle of increasing informational poverty. Not only is a broader base of traditional source material (such as books, maps, and images) needed for the generation of any Wikipedia article, but it is likely that the very presence of content itself is a generative factor behind the production of further content. This makes information produced about information-sparse regions most useful for people in informational cores — who are used to integrating digital information into their everyday practices — rather than those in informational peripheries.

Various practices and procedures of Wikipedia editing likely amplify this effect. There are strict guidelines on how knowledge can be created and represented in Wikipedia, including a ban on original research, and the need to source key assertions. Editing incentives and constraints probably also encourage work around existing content (which is relatively straightforward to edit) rather than creation of entirely new material. In other words, the very policies and norms that govern the encyclopedia’s structure make it difficult to populate the white space with new geographic content. In addressing these patterns of increasing informational poverty, we need to recognize that no one of these three conditions can ever be sufficient for the generation of geographic knowledge. As well as highlighting the presences and absences in user-generated content, we also need to ask what factors encourage or limit production of that content.

In interpreting our model, we have come to a stark conclusion: increasing representation doesn’t occur in a linear fashion, but accelerates in a virtuous cycle, benefitting those with strong editing cultures in local languages. For example, Britain, Sweden, Japan and Germany are extensively georeferenced on Wikipedia, whereas much of the MENA region has not kept pace, even accounting for its levels of connectivity, population, and editors. Thus, while some countries are experiencing the virtuous cycle of more edits and broadband begetting more georeferenced content, those on the periphery of these information geographies might fail to reach a critical mass of editors, or even dismiss Wikipedia as a legitimate site for user-generated geographic content: a problem that will need to be addressed if Wikipedia is indeed to be considered the “sum of all human knowledge”.

Read the full paper: Graham, M., Hogan, B., Straumann, R.K., and Medhat, A. (2014) Uneven Geographies of User-Generated Information: Patterns of Increasing Informational Poverty. Annals of the Association of American Geographers.

References

Brunn, S. D., and M. W. Wilson (2013) Cape Town’s million plus black township of Khayelitsha: Terrae incognitae and the geographies and cartographies of silence. Habitat International 39: 284-294.

Graham, M., and M. Zook (2013) Augmented Realities and Uneven Geographies: Exploring the Geolinguistic Contours of the Web. Environment and Planning A 45(1): 77–99.

Graham, M., M. Zook, and A. Boulton (2013) Augmented Reality in the Urban Environment: Contested Content and the Duplicity of Code. Transactions of the Institute of British Geographers 38(3): 464-479.


Mark Graham is a Senior Research Fellow at the OII. His research focuses on Internet and information geographies, and the overlaps between ICTs and economic development.

What is stopping greater representation of the MENA region? https://ensr.oii.ox.ac.uk/what-is-stopping-greater-representation-of-the-mena-region/ Wed, 06 Aug 2014 08:35:52 +0000 http://blogs.oii.ox.ac.uk/policy/?p=2575
Negotiating the wider politics of Wikipedia can be a daunting task, particularly when it comes to content about the MENA region. Image of the Dome of the Rock (Qubbat As-Sakhrah), Jerusalem, by 1yen

Wikipedia has famously been described as a project that “works great in practice and terrible in theory”. One of the ways in which it succeeds is through its extensive consensus-based governance structure. While this has led to spectacular success — over 4.5 million articles in the English Wikipedia alone — the governance structure is neither obvious nor immediately accessible, and can present a barrier for those seeking entry. Editing Wikipedia can be a tough challenge – an often draining and frustrating task, involving heated disputes and arguments where it is often the most tenacious, belligerent, or connected editor who wins out in the end.

Broadband access and literacy are not the only pre-conditions for editing Wikipedia; ‘digital literacy’ is also crucial. This includes the ability to obtain and critically evaluate online sources, locate Wikipedia’s editorial and governance policies, master Wiki syntax, and confidently articulate and assert one’s views about an article or topic. Experienced editors know how to negotiate the rules, build a consensus with some editors to block others, and how to influence administrators during dispute resolution. This strict adherence to the word (if not the spirit) of Wikipedia’s ‘law’ can lead to marginalization or exclusion of particular content, particularly when editors are scared off by unruly mobs who ‘weaponize’ policies to fit a specific agenda.

Governing such a vast collaborative platform as Wikipedia obviously presents a difficult balancing act between being open enough to attract volume of contributions, and moderated enough to ensure their quality. Many editors consider Wikipedia’s governance structure (which varies significantly between the different language versions) essential to ensuring the quality of its content, even if it means that certain editors can (for example) arbitrarily ban other users, lock down certain articles, and exclude moderate points of view. One of the editors we spoke to noted that: “A number of articles I have edited with quality sources, have been subjected to editors cutting information that doesn’t fit their ideas […] I spend a lot of time going back to reinstate information. Today’s examples are in the ‘Battle of Nablus (1918)’ and the ‘Third Transjordan attack’ articles. Bullying does occur from time to time […] Having tried the disputes process I wouldn’t recommend it.” Community building might help support MENA editors faced with discouragement or direct opposition as they try to build content about the region, but easily locatable translations of governance materials would also help. Few of the extensive Wikipedia policy discussions have been translated into Arabic, leading to replication of discussions or ambiguity surrounding correct dispute resolution.

Beyond arguments with fractious editors over minutiae (something that comes with the platform), negotiating the wider politics of Wikipedia can be a daunting task, particularly when it comes to content about the MENA region. It would be an understatement to say that the Middle East is a politically sensitive region, with more than its fair share of apparently unresolvable disputes, competing ideologies (it’s the birthplace of three world religions…), repressive governments, and ongoing and bloody conflicts. Editors shared stories with us about meddling from state actors (eg Tunisia, Iran) and a lack of trust in a platform that is generally considered to be a foreign, and sometimes explicitly American, tool. Rumors abound that several states (eg Israel, Iran) have concerted efforts to work on Wikipedia content, creating a chilling effect for new editors who might feel that editing certain pages might prove dangerous, or simply frustrating or impossible. Some editors spoke of being asked by Syrian government officials for advice on how to remove critical content, or how to identify the editors responsible for putting it there. Again: the effect is chilling.

A lack of locally produced and edited content about the region clearly can’t be blamed entirely on ‘outsiders’. Many editors in the Arabic Wikipedia have felt snubbed by the creation of an explicitly “Egyptian Arabic” Wikipedia, which has not only forked the content and editorial effort, but also stymied any ‘pan-Arab’ identity on the platform. There is a culture of administrators deleting articles they do not think are locally appropriate, often relating to politically (or culturally) sensitive topics. Due to the Arabic Wikipedia’s often vicious edit wars, it is heavily moderated (unlike, for example, the English version), and anonymous edits do not appear instantly.

Some editors at the workshops noted other systemic and cultural issues, for example complaining of an education system that encourages rote learning, reinforcing the notion that only experts should edit (or moderate) a topic, rather than amateurs with local familiarity. Editors also pointed to the notable gender disparities on the site, a longstanding issue for other Wikipedia versions as well. None of these discouragements is helped by what some editors described as a larger ‘image problem’ with editing the Arabic Wikipedia, given that it will always be overshadowed by the dominant English Wikipedia; one editor commented that: “the English Wikipedia is vastly larger than its Arabic counterpart, so it is not unthinkable that there is more content, even about Arab-world subjects, in English. From my (unscientific) observation, many times, content in Arabic about a place or a tribe is not very encyclopedic, but promotional, and lacks citations”. Translating articles into Arabic might be seen as menial and unrewarding work, when the exciting debates about an article are happening elsewhere.

When we consider the coming-together of all of these barriers, it might be surprising that Wikipedia is actually as large as it is. However, the editors we spoke with were generally optimistic about the site, considering their work on it an important activity that serves the greater good. Wikipedia is without doubt one of the most significant cultural and political forces on the Internet. Wikipedians are remarkably generous with their time, and it’s their efforts that are helping to document, record, and represent much of the world – including places where documentation is scarce. Most of the editors at our workshop ultimately considered Wikipedia a path to a more just society, reached through consensus, voting, and an aspiration to record certain truths — seeing it not just as a site of conflict, but also as a site of regional (and local) pride. When asked why he writes geographic content, one editor simply replied: “It’s my own town”.


Mark Graham is a Senior Research Fellow at the OII. His research focuses on Internet and information geographies, and the overlaps between ICTs and economic development.

How well represented is the MENA region in Wikipedia? https://ensr.oii.ox.ac.uk/how-well-represented-is-the-mena-region-in-wikipedia/ Tue, 22 Jul 2014 08:13:02 +0000 http://blogs.oii.ox.ac.uk/policy/?p=2811
There are more Wikipedia articles in English than Arabic about almost every Arabic speaking country in the Middle East. Image of rock paintings in the Tadrart Acacus region of Libya by Luca Galuzzi.
Wikipedia is often seen to be both an enabler and an equalizer. Every day hundreds of thousands of people collaborate on an (encyclopaedic) range of topics; writing, editing and discussing articles, and uploading images and video content. This structural openness combined with Wikipedia’s tremendous visibility has led some commentators to highlight it as “a technology to equalize the opportunity that people have to access and participate in the construction of knowledge and culture, regardless of their geographic placing” (Lessig 2003). However, despite Wikipedia’s openness, there are also fears that the platform is simply reproducing worldviews and knowledge created in the Global North at the expense of Southern viewpoints (Graham 2011; Ford 2011). Indeed, there are indications that global coverage in the encyclopaedia is far from ‘equal’, with some parts of the world heavily represented on the platform, and others largely left out (Hecht and Gergle 2009; Graham 2011, 2013, 2014).

These second-generation digital divides are not merely divides of Internet access (much discussed in the late 1990s), but gaps in representation and participation (Hargittai and Walejko 2008). Whereas most Wikipedia articles about European and East Asian countries are written in their dominant languages, for much of the Global South we see a dominance of articles written in English. These geographic differences in the coverage of different language versions of Wikipedia matter, because fundamentally different narratives can be (and are) created about places and topics in different languages (Graham and Zook 2013; Graham 2014).

If we undertake a ‘global analysis’ of this pattern by examining the number of geocoded articles (ie about a specific place) across Wikipedia’s main language versions (Figure 1), the first thing we can observe is the incredible human effort that has gone into describing ‘place’ in Wikipedia. The second is the clear and highly uneven geography of information, with Europe and North America home to 84% of all geolocated articles. Almost all of Africa is poorly represented in the encyclopaedia — remarkably, there are more Wikipedia articles written about Antarctica (14,959) than about any single country in Africa, and more geotagged articles relating to Japan (94,022) than to the entire MENA region (88,342). In Figure 2 it is even more obvious that Europe and North America lead in terms of representation on Wikipedia.

Figure 1. Total number of geotagged Wikipedia articles across all 44 surveyed languages.
Figure 2. Number of regional geotagged articles and population.

Knowing how many articles describe a place only tells a part of the ‘representation story’. Figure 3 adds the linguistic element, showing the dominant language of Wikipedia articles per country. The broad pattern is that some countries largely define themselves in their own languages, and others appear to be largely defined from outside. For instance, almost all European countries have more articles about themselves in their dominant language; that is, most articles about the Czech Republic are written in Czech. Most articles about Germany are written in German (not English).

Figure 3. Language with the most geocoded articles by country (across 44 top languages on Wikipedia).
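A summary like Figure 3 can be derived with very little code once the counts exist. The sketch below is illustrative only: it assumes a table of geotagged article counts by country and language version, with invented numbers (the real analysis covers 44 language editions and every country).

```python
import pandas as pd

# Invented illustrative counts of geotagged articles per (country, language).
counts = pd.DataFrame([
    ("Germany",        "German",  80000), ("Germany",        "English", 45000),
    ("Czech Republic", "Czech",   30000), ("Czech Republic", "English", 12000),
    ("Egypt",          "Arabic",    450), ("Egypt",          "English",  4000),
], columns=["country", "language", "geotagged_articles"])

# For each country, keep the language edition with the most geotagged articles.
dominant = (
    counts.sort_values("geotagged_articles", ascending=False)
          .drop_duplicates("country")
          .set_index("country")["language"]
)
print(dominant)
```

Figure 3 is essentially a map of this country-to-language assignment, computed over the full set of geocoded articles; the analytical work lies in assembling reliable geotagged counts for every country and language pair.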

We do not see this pattern across much of the South, where English dominates across much of Africa, the Middle East, South and East Asia, and even parts of South and Central America. French dominates in five African countries, and German is dominant in one former German colony (Namibia) and a few other countries (e.g. Uruguay, Bolivia, East Timor).

The scale of these differences is striking. Not only are there more Wikipedia articles in English than Arabic about almost every Arabic speaking country in the Middle East, but there are more English articles about North Korea than there are Arabic articles about Saudi Arabia, Libya, and the UAE. Not only do we see most of the world’s content written about global cores, but it is largely dominated by a relatively few languages.

Figure 4 shows the total number of geotagged Wikipedia articles in English per country. The sheer density of this layer of information over some parts of the world is astounding (there are 928,542 articles about places in English); nonetheless, only 3.23% of these geotagged English articles are about Africa, and only 1.67% are about the MENA region.

Figure 4. Number of geotagged articles in the English Wikipedia by country.

We see a somewhat different pattern when looking at the global geography of the 22,548 geotagged articles of the Arabic Wikipedia (Figure 5). Algeria and Syria are both defined by a relatively high number of articles in Arabic (as are the US, Italy, Spain, Russia and Greece). These information densities are substantially greater than what we see for many other MENA countries in which Arabic is an official language (such as Egypt, Morocco, and Saudi Arabia). This is even more surprising when we realise that the Italian and Spanish populations are smaller than the Egyptian, but there are nonetheless far more geotagged articles in Arabic about Italy (2,428) and Spain (1,988) than about Egypt (433).

Figure 5. Total number of geotagged articles in the Arabic Wikipedia by country.

By mapping the geography of Wikipedia articles in both global and regional languages, we can begin to examine the layers of representation that ‘augment’ the world we live in. We have seen that, notable exceptions aside (e.g. ‘Iran’ in Farsi and ‘Israel’ in Hebrew), the MENA region tends to be massively underrepresented — not just in major world languages, but also in its own: Arabic. Clearly, much is being left unsaid about that part of the world. Although we entered the project anticipating that the MENA region would be under-represented in English, we did not anticipate the degree to which it is under-represented in Arabic.

References

Ford, H. (2011) The Missing Wikipedians. In Critical Point of View: A Wikipedia Reader, ed. G. Lovink and N. Tkacz, 258-268. Amsterdam: Institute of Network Cultures.

Graham, M. (2014) The Knowledge Based Economy and Digital Divisions of Labour. In Companion to Development Studies, 3rd edition, eds V. Desai and R. Potter. Hodder, pp. 189-195.

Graham, M. (2013) The Virtual Dimension. In Global City Challenges: Debating a Concept, Improving the Practice. Eds. Acuto, M. and Steele, W. London: Palgrave.

Graham, M. (2011) Wiki Space: Palimpsests and the Politics of Exclusion. In Critical Point of View: A Wikipedia Reader. Eds. Lovink, G. and Tkacz, N. Amsterdam: Institute of Network Cultures, pp. 269-282.

Graham M., and M. Zook (2013) Augmented Realities and Uneven Geographies: Exploring the Geolinguistic Contours of the Web. Environment and Planning A 45 (1) 77–99.

Hargittai, E. and G. Walejko (2008) The Participation Divide: Content Creation and Sharing in the Digital Age. Information, Communication and Society 11 (2) 239–256.

Hecht B., and D. Gergle (2009) Measuring self-focus bias in community-maintained knowledge repositories. In Proceedings of the 4th International Conference on Communities and Technologies, Penn State University, 2009, pp. 11–20. New York: ACM.

Lessig, L. (2003) An Information Society: Free or Feudal. Talk given at the World Summit on the Information Society, Geneva, 2003.


Mark Graham is a Senior Research Fellow at the OII. His research focuses on Internet and information geographies, and the overlaps between ICTs and economic development.

The sum of (some) human knowledge: Wikipedia and representation in the Arab World https://ensr.oii.ox.ac.uk/the-sum-of-some-human-knowledge-wikipedia-and-representation-in-the-arab-world/ Mon, 14 Jul 2014 09:00:14 +0000 http://blogs.oii.ox.ac.uk/policy/?p=2555
Arabic is one of the least represented major world languages on Wikipedia: few languages have more speakers and fewer articles than Arabic. Image of the Umayyad Mosque (Damascus) by Travel Aficionado

Wikipedia currently contains over 9 million articles in 272 languages, far surpassing any other publicly available information repository. Being the first point of contact for most general topics (and therefore an effective site for framing any subsequent representations), it is an important platform from which we can learn whether the Internet facilitates increased open participation across cultures — or reinforces existing global hierarchies and power dynamics. Because the underlying political, geographic and social structures of Wikipedia are hidden from users, and because there have not been any large-scale studies of the geography of these structures and their relationship to online participation, entire groups of people (and regions) may be marginalized without their knowledge.

This process is important to understand, for the simple reason that Wikipedia content has begun to form a central part of services offered elsewhere on the Internet. When you look for information about a place on Facebook, the description of that place (including its geographic coordinates) comes from Wikipedia. If you want to “check in” to a museum in Doha to signal to your friends that you were there, the place you check in to was created with Wikipedia data. When you Google “House of Saud” you are presented not only with a list of links (with Wikipedia at the top) but also with a special ‘card’ summarising the House. This data comes from Wikipedia. When you look for people or places, Google now has these terms inside its ‘knowledge graph’, a network of related concepts with data coming directly from Wikipedia. Similarly, on Google Maps, Wikipedia descriptions for landmarks are presented as part of the default information.

Ironically, Wikipedia editorship is actually on a slow and steady decline, even as its content and readership increases year on year. Since 2007 and the introduction of significant devolution of administrative powers to volunteers, Wikipedia has not been able to effectively retain newcomers, something which has been noted as a concern by many at the Wikimedia Foundation. Some think Wikipedia might be levelling off because there’s only so much to write about. This is extremely far from the truth; there are still substantial gaps in geographic content in English and overwhelming gaps in other languages. Wikipedia often brands itself as aspiring to contain “the sum of human knowledge”, but behind this mantra lie policy pitfalls, tedious editor debates and delicate sourcing issues that hamper greater representation of the region. Of course these challenges form part of Wikipedia’s continuing evolution as the de facto source for online reference information, but they also (disturbingly) act to entrench particular ways of “knowing” — and ways of validating what is known.

There are over 260,000 articles in Arabic, receiving 240,000 views per hour. This actually makes Arabic one of the least represented major world languages on Wikipedia: few languages have more speakers and fewer articles. This relative lack of MENA voice and representation means that the tone and content of this globally useful resource is, in many cases, being determined by outsiders who may misunderstand the significance of local events, sites of interest and historical figures. In an area that has seen substantial social conflict and political upheaval, greater participation from local actors would help to ensure balance in content about contentious issues. Unfortunately, most research on MENA’s Internet presence has so far been drawn from anecdotal evidence, and no comprehensive studies currently exist.

In this project we wanted to understand where place-based content comes from, to explain reasons for the relative lack of Wikipedia articles in Arabic and about the MENA region, and to understand which parts of the region are particularly underrepresented. We also wanted to understand the relationship between Wikipedia’s administrative structure and the treatment of new editors; in particular, we wanted to know whether editors from the MENA region have less of a voice than their counterparts from elsewhere, and whether the content they create is considered more or less legitimate, as measured through the number of reverts, i.e. the overriding of their work by other editors.
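Reverts can be identified in practice by checking when a page returns to a byte-identical earlier state. The sketch below is a minimal illustration of that idea, assuming a chronological list of revisions with the SHA-1 content checksums that the MediaWiki API exposes; the revision records themselves are invented, and this captures only exact ‘identity’ reverts rather than partial ones.

```python
# Identity-revert detection: a revision counts as a revert if it restores
# the exact content (same SHA-1 checksum) of an earlier revision of the page.
# The records below are invented for illustration.
revisions = [
    {"revid": 1, "user": "EditorA", "sha1": "aaa"},
    {"revid": 2, "user": "EditorB", "sha1": "bbb"},
    {"revid": 3, "user": "EditorC", "sha1": "aaa"},  # restores revid 1
    {"revid": 4, "user": "EditorB", "sha1": "ccc"},
]

def find_reverts(revs):
    """Return (reverting_revid, restored_revid) pairs for identity reverts."""
    first_seen = {}   # sha1 -> earliest revid with that content
    reverts = []
    for rev in revs:
        sha = rev["sha1"]
        if sha in first_seen:
            reverts.append((rev["revid"], first_seen[sha]))
        else:
            first_seen[sha] = rev["revid"]
    return reverts

print(find_reverts(revisions))   # -> [(3, 1)]
```

Counting how often each editor's work is subsequently undone in this sense gives a simple (if blunt) measure of how legitimate the community treats their contributions as being.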

Our practical objectives involved a consolidation of Middle Eastern Wikipedians through a number of workshops focusing on how to create more equitable and representative content, with the ultimate goal of making Wikipedia a more generative and productive site for reference information about the region. Capacity building among key Wikipedians can create greater understanding of barriers to participation and representation and offset much of the (often considerable) emotional labour required to sustain activity on the site in the face of intense arguments and ideological biases. Potential systematic structures of exclusion that could be a barrier to participation include such competitive practices as content deletion, indifference to content produced by MENA authors, and marginalization through bullying and dismissal.

However, a distinct lack of sources — owing both to a lack of legitimacy for MENA journalism and a paucity of open access government documents — is also inhibiting further growth of content about the region. When inclusion of a topic is contested by editors it is typically because there is not enough external source material about it to establish “notability”. As Ford (2011) has already discussed, notability is often culturally mediated. For example, a story in Al Jazeera would not have been considered a sufficient criterion of notability a couple of years ago. However, this has changed dramatically since its central role in reporting on the Arab Spring.

Unfortunately, notability can create a feedback loop. If an area of the world is underreported, there are no sources. If there are no sources, then journalists do not always have enough information to report about that part of the world. ‘Correct’ sourcing trumps personal experience on Wikipedia; even if an author is from a place, and is watching a building being destroyed, their Wikipedia edit will not be accepted by the community unless the event is discussed in another ‘official’ medium. Often the edit will either be branded with a ‘citation needed’ tag, eliminated, or discussed on the talk page. Particularly aggressive editors and administrators will nominate the page for ‘speedy deletion’ (ie deletion without discussion), a practice that makes it difficult for an author to respond.

Why does any of this matter in practical terms? For the simple reason that biases, absences and contestations on Wikipedia spill over into numerous other domains that are in regular and everyday use (Graham and Zook, 2013). If a place is not on Wikipedia, this might have a chilling effect on business and stifle journalism; if a place is represented poorly on Wikipedia this can lead to misunderstandings about the place. Wikipedia is not a legislative body. However, in the court of public opinion, Wikipedia represents one of the world’s strongest forces, as it quietly inserts itself into representations of place worldwide (Graham et al. 2013; Graham 2013).

Wikipedia is not merely a site of reference information, but is rapidly becoming the de facto site for representing the world to itself. We need to understand more about that representation.

Further Reading

Allagui, I., Graham, M., and Hogan, B. 2014. Wikipedia Arabe et la Construction Collective du Savoir. In Wikipedia, objet scientifique non identifie. eds. Barbe, L., and Merzeau, L. Paris: Presses Universitaires de Paris Ouest (in press).

Graham, M., Hogan, B., Straumann, R. K., and Medhat, A. 2014. Uneven Geographies of User-Generated Information: Patterns of Increasing Informational Poverty. Annals of the Association of American Geographers (forthcoming).

Graham, M. 2012. Die Welt in Der Wikipedia Als Politik der Exklusion: Palimpseste des Ortes und selective Darstellung. In Wikipedia. eds. S. Lampe, and P. Bäumer. Bundeszentrale für politische Bildung/bpb, Bonn.

Graham, M. 2011. Wiki Space: Palimpsests and the Politics of Exclusion. In Critical Point of View: A Wikipedia Reader. Eds. Lovink, G. and Tkacz, N. Amsterdam: Institute of Network Cultures, 269-282.

References

Ford, H. (2011) The Missing Wikipedians. In Geert Lovink and Nathaniel Tkacz (eds), Critical Point of View: A Wikipedia Reader. Amsterdam: Institute of Network Cultures. ISBN: 978-90-78146-13-1.

Graham, M., M. Zook, and A. Boulton. 2013. Augmented Reality in the Urban Environment: contested content and the duplicity of code. Transactions of the Institute of British Geographers. 38(3), 464-479.

Graham, M and M. Zook. 2013. Augmented Realities and Uneven Geographies: Exploring the Geo-linguistic Contours of the Web. Environment and Planning A 45(1) 77-99.

Graham, M. 2013. The Virtual Dimension. In Global City Challenges: debating a concept, improving the practice. eds. M. Acuto and W. Steele. London: Palgrave. 117-139.


Mark Graham is a Senior Research Fellow at the OII. His research focuses on Internet and information geographies, and the overlaps between ICTs and economic development.

Past and Emerging Themes in Policy and Internet Studies https://ensr.oii.ox.ac.uk/past-and-emerging-themes-in-policy-and-internet-studies/ Mon, 12 May 2014 09:24:59 +0000 http://blogs.oii.ox.ac.uk/policy/?p=2673
We can’t understand, analyze or make public policy without understanding the technological, social and economic shifts associated with the Internet. Image from the (post-PRISM) “Stop Watching Us” Berlin Demonstration (2013) by mw238.

In the journal’s inaugural issue, founding Editor-in-Chief Helen Margetts outlined what are essentially two central premises behind Policy & Internet’s launch. The first is that “we cannot understand, analyze or make public policy without understanding the technological, social and economic shifts associated with the Internet” (Margetts 2009, 1). It is simply not possible to consider public policy today without some regard for the intertwining of information technologies with everyday life and society. The second premise is that the rise of the Internet is associated with shifts in how policy itself is made. In particular, she proposed that impacts of Internet adoption would be felt in the tools through which policies are effected, and the values that policy processes embody.

The purpose of the Policy and Internet journal was to take up these two challenges: the public policy implications of Internet-related social change, and Internet-related changes in policy processes themselves. In recognition of the inherently multi-disciplinary nature of policy research, the journal is designed to act as a meeting place for all kinds of disciplinary and methodological approaches. Helen predicted that methodological approaches based on large-scale transactional data, network analysis, and experimentation would turn out to be particularly important for policy and Internet studies. Driving the advancement of these methods was therefore the journal’s third purpose. Today, the journal has reached a significant milestone: over one hundred high-quality peer-reviewed articles published. This seems an opportune moment to take stock of what kind of research we have published in practice, and see how it stacks up against the original vision.

At the most general level, the journal’s articles fall into three broad categories: the Internet and public policy (48 articles), the Internet and policy processes (51 articles), and discussion of novel methodologies (10 articles). The first of these categories, “the Internet and public policy,” can be further broken down into a number of subcategories. One of the most prominent of these streams is fundamental rights in a mediated society (11 articles), which focuses particularly on privacy and freedom of expression. Related streams are children and child protection (six articles), copyright and piracy (five articles), and general e-commerce regulation (six articles), including taxation. A recently emerged stream in the journal is hate speech and cybersecurity (four articles). Of course, an enduring research stream is Internet governance, or the regulation of technical infrastructures and economic institutions that constitute the material basis of the Internet (seven articles). In recent years, the research agenda in this stream has been influenced by national policy debates around broadband market competition and network neutrality (Hahn and Singer 2013). Another enduring stream deals with the Internet and public health (eight articles).

Looking specifically at the “Internet and policy processes” category, the largest stream is e-participation, or the role of the Internet in engaging citizens in national and local government policy processes, through methods such as online deliberation, petition platforms, and voting advice applications (18 articles). Two other streams are e-government, or the use of Internet technologies for government service provision (seven articles), and e-politics, or the use of the Internet in mainstream politics, such as election campaigning and communications of the political elite (nine articles). Another stream that has gained pace during recent years is online collective action, or the role of the Internet in activism, ‘clicktivism,’ and protest campaigns (16 articles). Last year the journal published a special issue on online collective action (Calderaro and Kavada 2013), and the forthcoming issue includes an invited article on digital civics by Ethan Zuckerman, director of MIT’s Center for Civic Media, with commentary from prominent scholars of Internet activism. A trajectory discernible in this stream over the years is a movement from discussing mere potentials towards analyzing real impacts—including critical analyses of the sometimes inflated expectations and “democracy bubbles” created by digital media (Shulman 2009; Karpf 2012; Bryer 2011).

The final category, discussion of novel methodologies, consists of articles that develop, analyze, and reflect critically on methodological innovations in policy and Internet studies. Empirical articles published in the journal have made use of a wide range of conventional and novel research methods, from interviews and surveys to automated content analysis and advanced network analysis methods. But of those articles where methodology is the topic rather than merely the tool, the majority deal with so-called “big data,” or the use of large-scale transactional data sources in research, commerce, and evidence-based public policy (nine articles). The journal recently devoted a special issue to the potentials and pitfalls of big data for public policy (Margetts and Sutcliffe 2013), based on selected contributions to the journal’s 2012 big data conference: Big Data, Big Challenges? In general, the notion of data science and public policy is a growing research theme.

This brief analysis suggests that research published in the journal over the last five years has indeed followed the broad contours of the original vision. The two challenges, namely policy implications of Internet-related social change and Internet-related changes in policy processes, have both been addressed. In particular, research has addressed the implications of the Internet’s increasing role in social and political life. The journal has also furthered the development of new methodologies, especially the use of online network analysis techniques and large-scale transactional data sources (aka ‘big data’).

As expected, authors from a wide range of disciplines have contributed their perspectives to the journal, and engaged with other disciplines, while retaining the rigor of their own specialisms. The geographic scope of the contributions has been truly global, with authors and research contexts from six continents. I am also pleased to note that a characteristic common to all the published articles is polish; this is no doubt in part due to the high level of editorial support that the journal is able to afford to authors, including copyediting. The justifications for the journal’s establishment five years ago have clearly been borne out, so that the journal now performs an important function in fostering and bringing together research on the public policy implications of an increasingly Internet-mediated society.

And what of my own research interests as an editor? In the inaugural editorial, Helen Margetts highlighted work, finance, exchange, and economic themes in general as being among the prominent areas of Internet-related social change that are likely to have significant future policy implications. I think for the most part, these implications remain to be addressed, and this is an area that the journal can encourage authors to tackle better. As an editor, I will work to direct attention to this opportunity, and welcome manuscript submissions on all aspects of Internet-enabled economic change and its policy implications. This work will be kickstarted by the journal’s 2014 conference (26-27 September), which this year focuses on crowdsourcing and online labor.

Our published articles will continue to be highlighted here on the journal’s blog. Launched last year, the blog will, we believe, help to expand the reach and impact of research published in Policy and Internet to the wider academic and practitioner communities, promote discussion, and increase authors’ citations. After all, publication is only the start of an article’s public life: we want people reading, debating, citing, and offering responses to the research that we, and our excellent reviewers, feel is important and worth publishing.

Read the full editorial:  Lehdonvirta, V. (2014) Past and Emerging Themes in Policy and Internet Studies. Policy & Internet 6(2): 109-114.

References

Bryer, T.A. (2011) Online Public Engagement in the Obama Administration: Building a Democracy Bubble? Policy & Internet 3 (4).

Calderaro, A. and Kavada, A. (2013) Challenges and Opportunities of Online Collective Action for Policy Change. Policy & Internet 5 (1).

Hahn, R. and Singer, H. (2013) Is the U.S. Government’s Internet Policy Broken? Policy & Internet 5 (3) 340-363.

Karpf, D. (2012) Online Political Mobilization from the Advocacy Group’s Perspective: Looking Beyond Clicktivism. Policy & Internet 2 (4) 7-41.

Margetts, H. (2009) The Internet and Public Policy. Policy & Internet 1 (1).

Margetts, H. and Sutcliffe, D. (2013) Addressing the Policy Challenges and Opportunities of ‘Big Data.’ Policy & Internet 5 (2) 139-146.

Shulman, S.W. (2009) The Case Against Mass E-mails: Perverse Incentives and Low Quality Public Participation in U.S. Federal Rulemaking. Policy & Internet 1 (1) 23-53.

The social economies of networked cultural production (or, how to make a movie with complete strangers) https://ensr.oii.ox.ac.uk/the-social-economics-of-networked-cultural-production-or-how-to-make-a-movie-with-complete-strangers/ Mon, 28 Apr 2014 13:33:31 +0000 http://blogs.oii.ox.ac.uk/policy/?p=2643
Nomad, the perky-looking Mars rover from the crowdsourced documentary Solar System 3D (Wreckamovie).

Ed: You have been looking at “networked cultural production” — ie the creation of cultural goods like films through crowdsourcing platforms — specifically in the ‘Wreckamovie’ community. What is Wreckamovie?

Isis: Wreckamovie is an open online platform designed to facilitate collaborative film production. The main advantage of the platform is that it encourages a granular and modular approach to cultural production; this means that the whole process is broken down into small, specific tasks. In doing so, it allows a diverse range of geographically dispersed, self-selected members to contribute in accordance with their expertise, interests and skills. The platform was launched in 2008 by a group of young Finnish filmmakers who had successfully produced films with the aid of an online forum since the late 1990s. Officially, there are more than 11,000 Wreckamovie members, but the active core, the community, consists of fewer than 300 individuals.

Ed: You mentioned a tendency in the literature to regard production systems as being either ‘market driven’ (eg Hollywood) or ‘not market driven’ (eg open or crowdsourced things); is that a distinction you recognised in your research?

Isis: There’s been a lot of talk about the disruptive and transformative powers nested in networked technologies, and most often Wikipedia or open source software are highlighted as examples of new production models, denoting a discontinuity from established practices of the cultural industries. Typically, the production models are discriminated based on their relation to the market: are they market-driven or fuelled by virtues such as sharing and collaboration? This way of explaining differences in cultural production isn’t just present in contemporary literature dealing with networked phenomena, though. For example, the sociologist Bourdieu equally theorized cultural production by drawing this distinction between market and non-market production, portraying the irreconcilable differences in their underlying value systems, as proposed in his The Rules of Art. However, one of the key findings of my research is that the shaping force of these productions is constituted by the tensions that arise in an antagonistic interplay between the values of social networked production and the production models of the traditional film industry. That is to say, the production practices and trajectories are equally shaped by the values embedded in peer production virtues and the conventions and drivers of Hollywood.

Ed: There has also been a tendency to regard the participants of these platforms as being either ‘professional’ or ‘amateur’ — again, is this a useful distinction in practice?

Isis: I think it’s important we move away from these binaries in order to understand contemporary networked cultural production. The notion of the blurring of boundaries between amateurs and professionals, and associated concepts such as user-generated content, peer production, and co-creation, are fine for pointing to very broad trends and changes in the constellations of cultural production. But if we want to move beyond that, towards explanatory models, we need a more fine-tuned categorisation of cultural workers. Based on my ethnographic research in the Wreckamovie community, I have proposed a typology of crowdsourcing labour, consisting of five distinct orientations. Rather than a priori definitions, the orientations are defined based on the individual production members’ interaction patterns, motivations and interpretation of the conventions guiding the division of labour in cultural production.

Ed: You mentioned that the social capital of participants involved in crowdsourcing efforts is increasingly quantifiable, malleable, and convertible: can you elaborate on this?

Isis: A defining feature of the online environment, in particular social media platforms, is its quantification of participation in the form of lists of followers, view counts, likes and so on. Across the Wreckamovie films I researched, there was a pronounced implicit understanding amongst production leaders of the exchange value of social capital accrued across the extended production networks beyond the Wreckamovie platform (e.g. Facebook, Twitter, YouTube). The quantified nature of social capital in the socio-technical space of the information economy was experienced as a convertible currency; for example, when social capital was used to drive YouTube views (which in turn constituted symbolic capital when employed as a bargaining tool in negotiating distribution deals). For some productions, these conversion mechanisms enabled increased artistic autonomy.

Ed: You also noted that we need to understand exactly where value is generated on these platforms to understand if some systems of ‘open/crowd’ production might be exploitative. How do we determine what constitutes exploitation?

Isis: The question of exploitation in the context of voluntary cultural work is an extremely complex matter, and remains an unresolved debate. I argue that it must be determined partially by examining the flow of value across entire production networks, paying attention to nodes at both the micro and macro levels. Equally, we need to acknowledge the diverse forms of value that volunteers might gain, for example embodied cultural or symbolic capital, and assess how this corresponds to their motivation and work orientation. In other words, this isn’t a question about ownership or financial compensation alone.

Ed: There were many movie-failures on the platform; but movies are obviously tremendously costly and complicated undertakings, so we would probably expect that. Was there anything in common between them, or any lessons to be learned from the projects that didn’t succeed?

Isis: You’ll find that the majority of productions on Wreckamovie are virtual ghosts, created on a whim with the expectation that production members will flock to take part and contribute. The projects that succeeded in creating actual cultural goods (such as the 2010 movie Snowblind) were those led by engaged producers who actively promoted the building of genuine social relationships amongst members, and who provided feedback on submitted content in a constructive and supportive manner to facilitate learning. The production periods of the movies I researched spanned two to six years – it requires real dedication! Crowdsourcing does not make productions magically happen overnight.

Ed: Crowdsourcing is obviously pretty new and exciting, but are the economics (whether monetary, social or political) of these platforms really understood or properly theorised? ie is this an area where there genuinely does need to be ‘more work’?

Isis: The economies of networked cultural production are under-theorised; this is partially an outcome of the dichotomous framing of market vs. non-market led production. When conceptualized as divorced from market-oriented production, networked phenomena are most often approached through the scope of gift exchanges (in a somewhat uninformed manner). I believe Bourdieu’s concepts of alternative capital in their various guises can serve as an appropriate analytical lens for examining the dynamics and flows of the economics underpinning networked cultural production. However, this requires innovation within field theory. Specifically, the mechanisms of conversion of one form of capital to another must be examined in greater detail; something I have focused on in my thesis, and hope to develop further in the future.


Isis Hjorth was speaking to blog editor David Sutcliffe.

Isis Hjorth is a cultural sociologist focusing on emerging practices associated with networked technologies. She is currently researching microwork and virtual production networks in Sub-Saharan Africa and Southeast Asia.

Read more: Hjorth, I. (2014) Networked Cultural Production: Filmmaking in the Wreckamovie Community. PhD thesis. Oxford Internet Institute, University of Oxford, UK.

Edit wars! Measuring and mapping society’s most controversial topics https://ensr.oii.ox.ac.uk/edit-wars-measuring-mapping-societys-most-controversial-topics/ Tue, 03 Dec 2013 08:21:43 +0000 http://blogs.oii.ox.ac.uk/policy/?p=2339

Ed: How did you construct your quantitative measure of ‘conflict’? Did you go beyond just looking at content flagged by editors as controversial?

Taha: Yes we did … actually, we have shown that controversy measures based on “controversial” flags are not inclusive at all: although they might have high precision, they have very low recall. Instead, we constructed an automated algorithm to locate and quantify the editorial wars taking place on the Wikipedia platform. Our algorithm is based on reversions, ie when editors undo each other’s contributions. We focused specifically on mutual reverts between pairs of editors, and we assigned a maturity score to each editor based on the total volume of their previous contributions. While counting the mutual reverts, we gave more weight to those committed by or on editors with higher maturity scores, as a revert between two experienced editors indicates a more serious problem. We always validated our method and compared it with other methods, using human judgement on a random selection of articles.
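
To make the weighting idea concrete, here is a minimal sketch in Python (an illustration only, not the exact formula from the paper: the “maturity” measure is simplified to an editor’s total prior edit count, and each mutual-revert pair is weighted by the maturity of the less experienced editor):

```python
from itertools import combinations

def controversy_score(reverts, edit_counts):
    """Toy controversy measure for a single article.

    reverts     -- set of (reverter, reverted) editor pairs observed on the article
    edit_counts -- dict mapping editor -> total number of prior edits (a stand-in
                   for 'maturity')
    Only mutual reverts (A reverted B *and* B reverted A) contribute, and each
    mutual pair is weighted by the maturity of the less experienced editor, so a
    war between two veterans counts for much more than a reverted newcomer.
    """
    editors = {editor for pair in reverts for editor in pair}
    score = 0
    for a, b in combinations(editors, 2):
        if (a, b) in reverts and (b, a) in reverts:   # mutual revert pair
            score += min(edit_counts.get(a, 0), edit_counts.get(b, 0))
    return score

reverts = {("Alice", "Bob"), ("Bob", "Alice"), ("Carol", "Bob")}
edits = {"Alice": 5000, "Bob": 4000, "Carol": 10}
print(controversy_score(reverts, edits))  # 4000: only the Alice-Bob war counts
```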

Ed: Was there any discrepancy between the content deemed controversial by your own quantitative measure, and what the editors themselves had flagged?

Taha: We were able to capture all the flagged content, but not all the articles found to be controversial by our method are flagged. And when you check the editorial history of those articles, you soon realise that they are indeed controversial but for some reason have not been flagged. It’s worth mentioning that the flagging process is not very well implemented in smaller language editions of Wikipedia. Even if the controversy is detected and flagged in English Wikipedia, it might not be in the smaller language editions. Our model is of course independent of the size and editorial conventions of different language editions.

Ed: Were there any differences in the way conflicts arose / were resolved in the different language versions?

Taha: We found the main differences to be in the topics of controversial articles. Although some topics, like religion and politics, are debated globally, there are many topics which are controversial only in a single language edition. This reflects the local preferences and the importance assigned to topics by different editorial communities. The way editorial wars start, and more importantly fade to consensus, also differs between language editions. In some languages moderators intervene very early, while in others the war might go on for a long time without any moderation.

Ed: In general, what were the most controversial topics in each language? And overall?

Taha: Generally, religion, politics, and geographical places like countries and cities (sometimes even villages) are the topics of debates. But each language edition also has its own focus: for example, football in Spanish and Portuguese, animations and TV series in Chinese and Japanese, sex- and gender-related topics in Czech, and science and technology topics in French Wikipedia are very often behind editing wars.

Ed: What other quantitative studies of this sort of conflict (ie over knowledge and points of view) are there?

Taha: My favourite work is one by researchers from Barcelona Media Lab. In their paper Jointly They Edit: Examining the Impact of Community Identification on Political Interaction in Wikipedia they provide quantitative evidence that editors interested in political topics identify themselves more significantly as Wikipedians than as political activists, even though they try hard to reflect their opinions and political orientations in the articles they contribute to. And I think that’s the key issue here. While there are lots of debates and editorial wars between editors, at the end what really counts for most of them is Wikipedia as a whole project, and the concept of shared knowledge. It might explain how Wikipedia really works despite all the diversity among its editors.

Ed: How would you like to extend this work?

Taha: Of course some of the controversial topics change over time. While Jesus might stay a controversial figure for a long time, I’m sure the article on President (W) Bush will soon reach a consensus and most likely disappear from the list of the most controversial articles. In the current study we examined the aggregated data from the inception of each Wikipedia edition up to March 2010. One possible extension that we are working on now is to study the dynamics of these controversy lists and the positions of topics within them.

Read the full paper: Yasseri, T., Spoerri, A., Graham, M. and Kertész, J. (2014) The most controversial topics in Wikipedia: A multilingual and geographical analysis. In: P.Fichman and N.Hara (eds) Global Wikipedia: International and cross-cultural issues in online collaboration. Scarecrow Press.


Taha was talking to blog editor David Sutcliffe.

Taha Yasseri is the Big Data Research Officer at the OII. Prior to coming to the OII, he spent two years as a Postdoctoral Researcher at the Budapest University of Technology and Economics, working on the socio-physical aspects of the community of Wikipedia editors, focusing on conflict and editorial wars, along with Big Data analysis to understand human dynamics, language complexity, and popularity spread. He has interests in analysis of Big Data to understand human dynamics, government-society interactions, mass collaboration, and opinion dynamics.

The physics of social science: using big data for real-time predictive modelling https://ensr.oii.ox.ac.uk/physics-of-social-science-using-big-data-for-real-time-predictive-modelling/ Thu, 21 Nov 2013 09:49:27 +0000 http://blogs.oii.ox.ac.uk/policy/?p=2320

Ed: You are interested in analysis of big data to understand human dynamics; how much work is being done in terms of real-time predictive modelling using these data?

Taha: The socially generated transactional data that we call “big data” have been available only very recently; the amount of data we now produce about human activities in a year is comparable to the amount that used to be produced in decades (or centuries). And this is all due to recent advancements in ICTs. Despite the short period for which big data have been available, their use in different sectors, including academia and business, has been significant. However, in many cases the use of big data is limited to monitoring and post hoc analysis of patterns; predictive models have rarely been used in combination with big data. Nevertheless, there are very interesting examples of using big data to make predictions about disease outbreaks, movements in financial markets, social interactions based on human mobility patterns, election results, etc.

Ed: What were the advantages of using Wikipedia as a data source for your study — as opposed to Twitter, blogs, Facebook or traditional media, etc.?

Taha: Our results have shown that the predictive power of Wikipedia page view and edit data outperforms similar box-office prediction models based on Twitter data. This can partially be explained by considering the different nature of Wikipedia compared to social media sites. Wikipedia is now the number one source of online information, and Wikipedia article page view statistics show how much Internet users have been interested in knowing about a specific movie. And the edit counts — even more importantly — indicate the level of interest of the editors in sharing their knowledge about the movies with others. Both indicators are much stronger than what you could measure on Twitter, which is mainly the reaction of the users after watching or reading about the movie. The cost of participation in Wikipedia’s editorial process makes the activity data more revealing about the potential popularity of the movies.

Another advantage is the sheer availability of Wikipedia data. Twitter streams, by comparison, are limited in both size and time. Gathering Facebook data is also problematic, whereas all the Wikipedia editorial activities and page views are recorded in full detail — and made publicly available.

Ed: Could you briefly describe your method and model?

Taha: We retrieved two sets of data from Wikipedia, the editorial activity and the page views relating to our set of 312 movies. The former indicates the popularity of a movie among Wikipedia editors, the latter its popularity among Wikipedia readers. We then defined different measures based on these two data streams (eg number of edits, number of unique editors, etc.). In the next step we combined these data into a linear model that assumes that the more popular the movie is, the larger these parameters will be. However, this model needs both training and calibration. We calibrated the model using IMDb data on the financial success of a set of ‘training’ movies. After calibration, we applied the model to a set of “test” movies and (luckily) saw that it worked very well in predicting their financial success.
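
A minimal sketch of this calibrate-then-predict workflow is given below (illustrative only: the feature values, the use of ordinary least squares, and the choice of box-office revenue as the target are assumptions for the example, not the authors’ actual specification):

```python
import numpy as np

# Toy Wikipedia activity features per training movie:
# [page views, number of edits, number of unique editors]
X_train = np.array([[1.2e6, 340, 120],
                    [4.0e5,  90,  35],
                    [2.5e6, 610, 240],
                    [9.0e5, 210,  80]])
y_train = np.array([95e6, 12e6, 210e6, 60e6])   # box-office takings used for calibration

# Calibrate a linear model (least squares, with an intercept column)
A = np.hstack([X_train, np.ones((len(X_train), 1))])
coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)

# Apply the calibrated model to a "test" movie's activity data
x_test = np.array([[8.0e5, 150, 60, 1.0]])       # trailing 1.0 is the intercept term
print((x_test @ coef)[0])                        # predicted financial success
```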

Ed: What were the most significant variables in terms of predictive power; and did you use any content or sentiment analysis?

Taha: The nice thing about this method is that you don’t need to perform any content or sentiment analysis. We deal only with volumes of activities and their evolution over time. The parameter that correlated best with financial success (and which was therefore the best predictor) was the number of page views. I can easily imagine that these days if someone wants to go to watch a movie, they most likely turn to the Internet and make a quick search. Thanks to Google, Wikipedia is going to be among the top results and it’s very likely that the click will go to the Wikipedia article about the movie. I think that’s why the page views correlate with the box office takings so significantly.

Ed: Presumably people are picking up on signals, ie Wikipedia is acting like an aggregator and normaliser of disparate environmental signals — what do you think these signals might be, in terms of box office success? ie is it ultimately driven by the studio media machine?

Taha: This is a very difficult question to answer. There are numerous factors that make a movie (or a product in general) popular. Studio marketing strategies definitely play an important role, but the quality of the movie, the collective mood of the public, herding effects, and many other hidden variables are involved as well. I hope our research serves as a first step in studying popularity in a quantitative framework, letting us answer such questions. To fully understand a system the first thing you need is a tool to monitor and observe it very well quantitatively. In this research we have shown that (for example) Wikipedia is a nice window and useful tool to observe and measure popularity and its dynamics; hopefully leading to a deep understanding of the underlying mechanisms as well.

Ed: Is there similar work / approaches to what you have done in this study?

Taha: There have been other projects using socially generated data to make predictions on the popularity of movies or movement in financial markets; however, to the best of my knowledge, this was the first time that Wikipedia data had been used to feed such models. We were pleasantly surprised to find that these data have stronger predictive power than previously examined datasets.

Ed: If you have essentially shown that ‘interest on Wikipedia’ tracks ‘real-world interest’ (ie box office receipts), can this be applied to other things? eg attention to legislation, political scandal, environmental issues, humanitarian issues: ie Wikipedia as “public opinion monitor”?

Taha: I think so. Now I’m running two other projects using a similar approach; one to predict election outcomes and the other one to do opinion mining about the new policies implemented by governing bodies. In the case of elections, we have observed very strong correlations between changes in the information seeking rates of the general public and the number of ballots cast. And in the case of new policies, I think Wikipedia could be of great help in understanding the level of public interest in searching for accurate information about the policies, and how this interest is satisfied by the information provided online. And more interestingly, how this changes over time as the new policy is fully implemented.

Ed: Do you think there are / will be practical applications of using social media platforms for prediction, or is the data too variable?

Taha: Although the availability and popularity of social media are recent phenomena, I’m sure that social media data are already being used by different bodies for predictions in various areas. We have seen very nice examples of using these data to predict disease outbreaks or the arrival of earthquake waves. The future of this field is very promising, considering both the advancements in the methodologies and also the increase in popularity and use of social media worldwide.

Ed: How practical would it be to generate real-time processing of this data — rather than analysing databases post hoc?

Taha: Data collection and analysis could be done instantly. However the challenge would be the calibration. Human societies and social systems — similarly to most complex systems — are non-stationary. That means any statistical property of the system is subject to abrupt and dramatic changes. That makes it a bit challenging to use a stationary model to describe a continuously changing system. However, one could use a class of adaptive models or Bayesian models which could modify themselves as the system evolves and more data are available. All these could be done in real time, and that’s the exciting part of the method.
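
As a minimal sketch of that adaptive idea (a generic illustration, not a method from the study), one could refit the calibration on a rolling basis with exponentially decaying weights, so that recent observations dominate and the model tracks a non-stationary system:

```python
import numpy as np

def refit_with_forgetting(X, y, half_life=30):
    """Weighted least squares in which observation i (0 = oldest) gets weight
    0.5 ** ((n - 1 - i) / half_life). Old data are gradually 'forgotten', so the
    coefficients drift with the system instead of assuming stationarity."""
    n = len(y)
    w = 0.5 ** ((n - 1 - np.arange(n)) / half_life)
    A = np.hstack([X, np.ones((n, 1))])           # add an intercept column
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef                                   # refit as each new batch of data arrives
```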

Ed: As a physicist; what are you learning in a social science department? And what does physicist bring to social science and the study of human systems?

Taha: Looking at complicated phenomena in a simple way is the art of physics. As Einstein said, a physicist always tries to “make things as simple as possible, but not simpler”. And that works very well in describing natural phenomena, ranging from sub-atomic interactions all the way to cosmology. However, studying social systems with the tools of natural sciences can be very challenging, and sometimes too much simplification makes it very difficult to understand the real underlying mechanisms. Working with social scientists, I’m learning a lot about the importance of the individual attributes of (and variations between) the elements of the systems under study, outliers, self-awareness, ethical issues related to data, agency and self-adaptation, and many other details that are mostly overlooked when a physicist studies a social system.

At the same time, I try to contribute the methodological approaches and quantitative skills that physicists have gained during two centuries of studying complex systems. I think statistical physics is an amazing example where statistical techniques can be used to describe the macro-scale collective behaviour of billions and billions of atoms with a single formula. I should admit here that humans are way more complicated than atoms — but the dialogue between natural scientists and social scientists could eventually lead to multi-scale models which could help us to gain a quantitative understanding of social systems, thereby facilitating accurate predictions of social phenomena.

Ed: What database would you like access to, if you could access anything?

Taha: I have daydreams about the database of search queries from all Internet users worldwide at the individual level. These data are being collected continuously by search engines and could technically be accessed, but due to privacy policy issues it’s impossible to get hold of them, even if only for research purposes. This is another difference between social systems and natural systems. An atom never gets upset about being watched through a microscope all the time, but working on social systems and human-related data requires a lot of care with respect to privacy and ethics.

Read the full paper: Mestyán, M., Yasseri, T., and Kertész, J. (2013) Early Prediction of Movie Box Office Success based on Wikipedia Activity Big Data. PLoS ONE 8 (8) e71226.


Taha Yasseri was talking to blog editor David Sutcliffe.

Taha Yasseri is the Big Data Research Officer at the OII. Prior to coming to the OII, he spent two years as a Postdoctoral Researcher at the Budapest University of Technology and Economics, working on the socio-physical aspects of the community of Wikipedia editors, focusing on conflict and editorial wars, along with Big Data analysis to understand human dynamics, language complexity, and popularity spread. He has interests in analysis of Big Data to understand human dynamics, government-society interactions, mass collaboration, and opinion dynamics.

Verification of crowd-sourced information: is this ‘crowd wisdom’ or machine wisdom? https://ensr.oii.ox.ac.uk/verification-of-crowd-sourced-information-is-this-crowd-wisdom-or-machine-wisdom/ Tue, 19 Nov 2013 09:00:41 +0000 http://blogs.oii.ox.ac.uk/policy/?p=1528
‘Code’ or ‘law’? Image from an Ushahidi development meetup by afropicmusing.

In ‘Code and Other Laws of Cyberspace’, Lawrence Lessig (2006) writes that computer code (or what he calls ‘West Coast code’) can have the same regulatory effect as the laws and legal code developed in Washington D.C., so-called ‘East Coast code’. Computer code impacts on a person’s behaviour by virtue of its essentially restrictive architecture: on some websites you must enter a password before you gain access, in other places you can enter unidentified. The problem with computer code, Lessig argues, is that it is invisible, and that it makes it easy to regulate people’s behaviour directly and often without recourse.

For example, fair use provisions in US copyright law enable certain uses of copyrighted works, such as copying for research or teaching purposes. However the architecture of many online publishing systems heavily regulates what one can do with an e-book: how many times it can be transferred to another device, how many times it can be printed, whether it can be moved to a different format – activities that have been unregulated until now, or that are enabled by the law but effectively ‘closed off’ by code. In this case code works to reshape behaviour, upsetting the balance between the rights of copyright holders and the rights of the public to access works to support values like education and innovation.

Working as an ethnographic researcher for Ushahidi, the non-profit technology company that makes tools for people to crowdsource crisis information, has made me acutely aware of the many ways in which ‘code’ can become ‘law’. During my time at Ushahidi, I studied the practices that people were using to verify reports by people affected by a variety of events – from earthquakes to elections, from floods to bomb blasts. I then compared these processes with those followed by Wikipedians when editing articles about breaking news events. In order to understand how to best design architecture to enable particular behaviour, it becomes important to understand how such behaviour actually occurs in practice.

In addition to the impact of code on the behaviour of users, norms, the market and laws also play a role. By interviewing both the users and designers of crowdsourcing tools I soon realized that ‘human’ verification, a process of checking whether a particular report meets a group’s truth standards, is an acutely social process. It involves negotiation between different narratives of what happened and why; identifying the sources of information and assessing their reputation among groups who are considered important users of that information; and identifying gatekeeping and fact checking processes where the source is a group or institution, amongst other factors.

One disjuncture between verification ‘practice’ and the architecture of the verification code developed by Ushahidi for users was that verification categories were set as a default feature, whereas some users of the platform wanted the verification process to be invisible to external users. Items would show up as being ‘unverified’ unless they had been explicitly marked as ‘verified’, thus confusing users about whether the item was unverified because the team hadn’t yet verified it, or whether it was unverified because it had been found to be inaccurate. Some user groups wanted to be able to turn off such features when they could not take responsibility for data verification. In the case of the Christchurch Recovery Map, set up in the aftermath of the 2011 New Zealand earthquake, the government officials working with the volunteers who ran the Ushahidi instance wanted to be able to turn off such features: they were concerned that they could not ensure that reports were indeed verified, and that having the category show up (as ‘unverified’ until ‘verified’) implied they were engaged in some kind of verification process.

The existence of a default verification category impacted on the Christchurch Recovery Map group’s ability to gain support from multiple stakeholders, including the government, but this feature of the platform’s architecture did not have the same effect in other places and at other times. For other users like the original Ushahidi Kenya team who worked to collate instances of violence after the Kenyan elections in 2007/08, this detailed verification workflow was essential to counter the misinformation and rumour that dogged those events. As Ushahidi’s use cases have diversified – from reporting death and damage during natural disasters to political events including elections, civil war and revolutions, the architecture of Ushahidi’s code base has needed to expand. Ushahidi has recognised that code plays a defining role in the experience of verification practices, but also that code’s impact will not be the same at all times, and in all circumstances. This is why it invested in research about user diversity in a bid to understand the contexts in which code runs, and how these contexts result in a variety of different impacts.

A key question being asked in the design of future verification mechanisms is the extent to which verification work should be done by humans or non-humans (machines). Here, verification is not a binary categorisation, but rather there is a spectrum between human and non-human verification work, and indeed, projects like Ushahidi, Wikipedia and Galaxy Zoo have all developed different verification mechanisms. Wikipedia uses a set of policies and practices about how content should be added and reviewed, such as the use of ‘citation needed’ tags for information that sounds controversial and that should be backed up by a reliable source. Galaxy Zoo uses an algorithm to detect whether certain contributions are accurate by comparing them to the same work by other volunteers.
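
For illustration, a crude version of that kind of agreement check might look like the following (a sketch only; it is not Galaxy Zoo’s actual pipeline, and the 0.6 agreement threshold is an arbitrary assumption):

```python
from collections import Counter

def review_queue(classifications, threshold=0.6):
    """classifications: dict mapping object_id -> list of labels submitted by volunteers.
    For each object, take the majority label and its share of the votes; objects whose
    agreement falls below the threshold are flagged for further (human) review."""
    results = {}
    for obj, labels in classifications.items():
        label, count = Counter(labels).most_common(1)[0]
        agreement = count / len(labels)
        results[obj] = {"label": label,
                        "agreement": round(agreement, 2),
                        "needs_review": agreement < threshold}
    return results

votes = {"galaxy_42": ["spiral", "spiral", "elliptical", "spiral"],
         "galaxy_77": ["merger", "elliptical", "spiral"]}
print(review_queue(votes))
```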

Ushahidi leaves it up to individual deployers of their tools and platform to make decisions about verification policies and practices, and is going to be designing new defaults to accommodate this variety of use. In parallel, Veri.ly, a project by former Ushahidi staff member Patrick Meier with the organisations Masdar and QCRI, is responding to the large amounts of unverified and often contradictory information that appear on social media following natural disasters by enabling social media users to collectively evaluate the credibility of rapidly crowdsourced evidence. The project was inspired by MIT’s winning entry to DARPA’s ‘Red Balloon Challenge’, which was intended to highlight social networking’s potential to solve widely distributed, time-sensitive problems, in this case by correctly identifying the GPS coordinates of 10 balloons suspended at fixed, undisclosed locations across the US. The winning MIT team crowdsourced the problem by using a monetary incentive structure, promising $2,000 to the first person who submitted the correct coordinates for a single balloon, $1,000 to the person who invited that person to the challenge, $500 to the person who invited the inviter, and so on. The system quickly took root, spawning geographically broad, dense branches of connections. After eight hours and 52 minutes, the MIT team had identified the correct coordinates for all 10 balloons.
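
The arithmetic of that recursive incentive is a simple geometric series: each step up the referral chain receives half of the step below it, so however long the chain, the total payout per balloon stays below twice the finder’s reward (a small sketch assuming the halving continues indefinitely, as described above):

```python
def referral_payouts(finder_reward=2000.0, chain_length=6):
    """Reward for the finder, their inviter, the inviter's inviter, ... halving each step."""
    return [finder_reward / 2 ** k for k in range(chain_length)]

chain = referral_payouts()
print(chain)       # [2000.0, 1000.0, 500.0, 250.0, 125.0, 62.5]
print(sum(chain))  # 3937.5 -- always below 2 * 2000, however long the chain grows
```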

Veri.ly aims to apply MIT’s approach to the process of rapidly collecting and evaluating critical evidence during disasters: “Instead of looking for weather balloons across an entire country in less than 9 hours, we hope Veri.ly will facilitate the crowdsourced collection of multimedia evidence for individual disasters in under 9 minutes.” It is still unclear how (or whether) Veri.ly will be able to reproduce the same incentive structure, but a bigger question lies around the scale and spread of social media in the majority of countries where humanitarian assistance is needed. The majority of Ushahidi or Crowdmap installations are, for example, still “small data” projects, with many focused on areas that still require offline verification procedures (such as calling volunteers or paid staff who are stationed across a country, as was the case in Sudan [3]). In these cases – where the social media presence may be insignificant – a team’s ability to achieve a strong local presence will define the quality of verification practices, and consequently the level of trust accorded to their project.

If code is law and if other aspects in addition to code determine how we can act in the world, it is important to understand the context in which code is deployed. Verification is a practice that determines how we can trust information coming from a variety of sources. Only by illuminating such practices and the variety of impacts that code can have in different environments can we begin to understand how code regulates our actions in crowdsourcing environments.

For more on Ushahidi verification practices and the management of sources on Wikipedia during breaking news events, see:

[1] Ford, H. (2012) Wikipedia Sources: Managing Sources in Rapidly Evolving Global News Articles on the English Wikipedia. SSRN Electronic Journal. doi:10.2139/ssrn.2127204

[2] Ford, H. (2012) Crowd Wisdom. Index on Censorship 41(4), 33–39. doi:10.1177/0306422012465800

[3] Ford, H. (2011) Verifying information from the crowd. Ushahidi.


Heather Ford has worked as a researcher, activist, journalist, educator and strategist in the fields of online collaboration, intellectual property reform, information privacy and open source software in South Africa, the United Kingdom and the United States. She is currently a DPhil student at the OII, where she is studying how Wikipedia editors write history as it happens in a format that is unprecedented in the history of encyclopedias. Before this, she worked as an ethnographer for Ushahidi. Read Heather’s blog.

For more on the Christchurch Earthquake, and the role of digital humanities in preserving the digital record of its impact see: Preserving the digital record of major natural disasters: the CEISMIC Canterbury Earthquakes Digital Archive project on this blog.

Can Twitter provide an early warning function for the next pandemic? https://ensr.oii.ox.ac.uk/can-twitter-provide-an-early-warning-function-for-the-next-flu-pandemic/ Mon, 14 Oct 2013 08:00:41 +0000 http://blogs.oii.ox.ac.uk/policy/?p=1241
Communication of risk in any public health emergency is a complex task for healthcare agencies; a task made more challenging when citizens are bombarded with online information. Mexico City, 2009. Image by Eneas.


Ed: Could you briefly outline your study?

Patty: We investigated the role of Twitter during the 2009 swine flu pandemic from two perspectives. Firstly, we demonstrated the ability of the social network to detect an upcoming spike in an epidemic before the official surveillance systems – up to a week earlier in the UK and up to 2-3 weeks earlier in the US – by investigating users who “self-diagnosed” themselves, posting tweets such as “I have flu / swine flu”. Secondly, we illustrated how online resources reporting the WHO declaration of a “pandemic” on 11 June 2009 were propagated through Twitter during the 24 hours after the official announcement [1,2,3].

Ed: Disease control agencies already routinely follow media sources; are public health agencies aware of social media as another valuable source of information?

Patty: Social media are providing an invaluable real-time data signal complementing well-established epidemic intelligence (EI) systems monitoring online media, such as MedISys and GPHIN. While traditional surveillance systems will remain the pillars of public health, online media monitoring has added an important early-warning function, with social media bringing additional benefits to epidemic intelligence: virtually real-time information available in the public domain that is contributed by users themselves, thus not relying on the editorial policies of media agencies.

Public health agencies (such as the European Centre for Disease Prevention and Control) are interested in social media early warning systems, but more research is required to develop robust social media monitoring solutions that are ready to be integrated with agencies’ EI services.

Ed: How difficult is this data to process? Eg: is this a full sample, processed in real-time?

Patty: No, obtaining all Twitter search query results is not possible. In our 2009 pilot study we were accessing data from Twitter using a search API interface, querying the database every minute (the number of results was limited to 100 tweets). Currently, only 1% of the ‘Firehose’ (massive real-time stream of all public tweets) is made available using the streaming API. The searches have to be performed in real-time as historical Twitter data are normally available only through paid services. Twitter analytics methods are diverse; in our study, we used frequency calculations, developed algorithms for geo-location and automatic spam and duplication detection, and applied time series analysis and cross-correlation with surveillance data [1,2,3].
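
As a rough sketch of the cross-correlation step (illustrative only; the study’s actual preprocessing such as spam filtering, de-duplication and geo-location is omitted, and the synthetic data below are made up), one can shift the daily tweet counts against the official surveillance counts and look for the lag at which the correlation peaks:

```python
import numpy as np

def lagged_correlation(tweets, cases, max_lag=21):
    """Pearson correlation between daily tweet counts and surveillance case counts,
    with the tweet series leading by 0..max_lag days. A peak at a positive lag
    suggests the Twitter signal anticipates the official data by that many days."""
    return {lag: np.corrcoef(tweets[:len(tweets) - lag], cases[lag:])[0, 1]
            for lag in range(max_lag + 1)}

# Synthetic example: a tweet series constructed to lead the case counts by ~5 days
rng = np.random.default_rng(0)
cases = np.convolve(rng.poisson(5, 120), np.ones(7), mode="same")
tweets = np.roll(cases, -5) + rng.normal(0, 2, 120)
lags = lagged_correlation(tweets, cases)
print(max(lags, key=lags.get))   # typically close to 5
```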

Ed: What’s the relationship between traditional and social media in terms of diffusion of health information? Do you have a sense that one may be driving the other?

Patty: This is a fundamental question. “Does media coverage of a certain topic cause buzz on social media, or does social media discussion cause a media frenzy?” This was particularly important to investigate for the 2009 swine flu pandemic, which attracted unprecedented media interest. While it could be assumed that disease cases preceded media coverage, or that media discussion sparked public interest and caused Twitter debate, neither proved to be the case in our experiment. On some days media coverage of flu was higher, and on others Twitter discussion was higher; but the peaks seemed synchronized – happening on the same days.

Ed: In terms of communicating accurate information, does the Internet make the job easier or more difficult for health authorities?

Patty: The communication of risk in any public health emergency is a complex task for government and healthcare agencies; this task is made more challenging when citizens are bombarded with online information from a variety of sources that vary in accuracy. It has become even more challenging with the increase in users accessing health-related information on their mobile phones (17% in 2010 and 31% in 2012, according to the US Pew Internet study).

Our findings from analyzing Twitter reaction to online media coverage of the WHO declaration of swine flu as a “pandemic” (phase 6) on 11 June 2009, which unquestionably was the most media-covered event during the 2009 epidemic, indicated that Twitter does favour reputable sources (such as the BBC, which was by far the most popular) but also that bogus information can still leak into the network.

Ed: What differences do you see between traditional and social media, in terms of eg bias / error rate of public health-related information?

Patty: Fully understanding the quality of media coverage of health topics such as the 2009 swine flu pandemic in terms of bias and medical accuracy would require a qualitative study (for example, one conducted by Duncan in the EU [4]). However, the main role of social media, in particular Twitter due to the 140 character limit, is to disseminate media coverage by propagating links rather than creating primary health information about a particular event. In our study around 65% of the tweets analysed contained a link.

Ed: Google Flu Trends (which monitors user search terms to estimate worldwide flu activity) has been around for a couple of years: where is that going? And how useful is it?

Patty: Search companies such as Google have demonstrated that online search queries for keywords relating to flu and its symptoms can serve as a proxy for the number of individuals who are sick (Google Flu Trends); however, in 2013 the system “drastically overestimated peak flu levels”, as reported by Nature. Most importantly, unlike Twitter, Google search queries remain proprietary and are therefore not useful for research or the construction of non-commercial applications.

Ed: What are implications of social media monitoring for countries that may want to suppress information about potential pandemics?

Patty: Event-based surveillance and the monitoring of social media for epidemic intelligence are of particular importance in countries with sub-optimal surveillance systems and those lacking the capacity for outbreak preparedness and response. Secondly, user-generated information on social media is of particular importance in countries with limited freedom of the press or those that actively try to suppress information about potential outbreaks.

Ed: Would it be possible with this data to follow spread geographically, ie from point sources, or is population movement too complex to allow this sort of modelling?

Patty: Spatio-temporal modelling is technically possible, as tweets are time-stamped and there is support for geo-tagging. However, not all tweets can be precisely located; early warning systems will improve in accuracy as geo-tagging of user-generated content becomes widespread. Mathematical modelling of the spread of diseases and of population movements are very topical research challenges (undertaken, for example, by Colizza et al. [5]), but modelling social media user behaviour during health emergencies to provide a robust baseline for early disease detection remains a challenge.

Ed: A strength of monitoring social media is that it follows what people do already (eg search / Tweet / update statuses). Are there any mobile / SNS apps to support collection of epidemic health data? eg a sort of ‘how are you feeling now’ app?

Patty: The strength of early warning systems using social media lies exactly in the ability to piggy-back on existing user behaviour rather than having to recruit participants. However, there is a growing number of participatory surveillance systems that ask users to report their symptoms (web-based ones such as Flusurvey in the UK, and "Flu Near You" in the US, which also exists as a mobile app). While interest in self-reporting systems is growing, challenges include their reliability, user recruitment and long-term retention, and integration with public health services; these remain open research questions. There is also potential for public health services to use social media in both directions – providing information over the networks as well as collecting user-generated content. Social media could be used to deliver evidence-based advice and personalised health information directly to affected citizens where and when they need it, thus effectively engaging them in the active management of their health.

References

[1] Szomszor, M., Kostkova, P., St Louis, C. (2011). Twitter Informatics: Tracking and Understanding Public Reaction during the 2009 Swine Flu Pandemics. IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology (WI-IAT 2011), Vol. 1, pp. 320–323.

[2] Szomszor, M., Kostkova, P., de Quincey, E. (2011). #swineflu: Twitter Predicts Swine Flu Outbreak in 2009. In M. Szomszor, P. Kostkova (Eds.): eHealth 2010, Springer Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST) 69, pp. 18–26.

[3] de Quincey, E., Kostkova, P. (2010). Early Warning and Outbreak Detection Using Social Networking Websites: The Potential of Twitter. In P. Kostkova (Ed.): eHealth 2009, Springer Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST) 27, pp. 21–24.

[4] Duncan, B. (2009). How the Media Reported the First Day of the Pandemic (H1N1) 2009: Results of EU-wide Media Analysis. Eurosurveillance, Vol. 14, Issue 30, July 2009.

[5] Colizza, V., Barrat, A., Barthelemy, M., Valleron, A.J., Vespignani, A. (2007). Modeling the Worldwide Spread of Pandemic Influenza: Baseline Case and Containment Interventions. PLoS Med 4(1): e13. doi:10.1371/journal.pmed.0040013

Further information on this project and related activities can be found at: BMJ-funded scientific film: http://www.youtube.com/watch?v=_JNogEk-pnM ; Can Twitter predict disease outbreaks? http://www.bmj.com/content/344/bmj.e2353 ; 1st International Workshop on Public Health in the Digital Age: Social Media, Crowdsourcing and Participatory Systems (PHDA 2013): http://www.digitalhealth.ws/ ; Social networks and big data meet public health @ WWW 2013: http://www2013.org/2013/04/25/social-networks-and-big-data-meet-public-health/


Patty Kostkova was talking to blog editor David Sutcliffe.

Dr Patty Kostkova is a Principal Research Associate in eHealth at the Department of Computer Science, University College London (UCL) and held a Research Scientist post at the ISI Foundation in Italy. Until 2012, she was the Head of the City eHealth Research Centre (CeRC) at City University, London, a thriving multidisciplinary research centre with expertise in computer science, information science and public health. In recent years, she was appointed a consultant at WHO responsible for the design and development of information systems for international surveillance.

Researchers who were instrumental in this project include Ed de Quincey, Martin Szomszor and Connie St Louis.

Who represents the Arab world online? https://ensr.oii.ox.ac.uk/arab-world/ Tue, 01 Oct 2013 07:09:58 +0000 http://blogs.oii.ox.ac.uk/policy/?p=2190
Editors from all over the world have played some part in writing about Egypt; in fact, only 13% of all edits actually originate in the country (38% are from the US). More: Who edits Wikipedia? by Mark Graham.

Ed: In basic terms, what patterns of ‘information geography’ are you seeing in the region?

Mark: The first pattern that we see is that the Middle East and North Africa are relatively under-represented in Wikipedia. Even after accounting for factors like population, Internet access, and literacy, we still see less content than would be expected. Second, of the content that exists, a lot of it is in English and French rather than in Arabic (or Farsi or Hebrew). In other words, there is even less in local languages.

And finally, if we look at contributions (or edits), not only do we see a relatively small number of edits originating in the region, but many of those edits are being used to write about other parts of the world rather than the editors' own region. What this broadly seems to suggest is that the participatory potentials of Wikipedia aren't yet being harnessed in order to even out the differences between the world's informational cores and peripheries.

Ed: How closely do these online patterns in representation correlate with regional (offline) patterns in income, education, language, access to technology (etc.)? Can you map one to the other?

Mark: Population and broadband availability alone explain a lot of the variance that we see. Other factors like income and education also play a role, but it is population and broadband that have the greatest explanatory power here. Interestingly, most countries in the MENA region fail to fit these predictors well.

Ed: How much do you think these patterns result from the systematic imposition of a particular view point – such as official editorial policies – as opposed to the (emergent) outcome of lots of users and editors acting independently?

Mark: Particular modes of governance in Wikipedia are likely a factor here. The Arabic Wikipedia, for instance, has a feature to combat vandalism whereby changes to articles need to be reviewed before being made public. This alone seems to put off some potential contributors. Guidelines around sourcing in places where there are few secondary sources also likely play a role.

Ed: How much discussion (in the region) is there around this issue? Is this even acknowledged as a fact or problem?

Mark: I think it certainly is recognised as an issue now. But there are few viable alternatives to Wikipedia. Our goal is hopefully to identify problems that lead to solutions, rather than simply discouraging people from even using the platform.

Ed: This work has been covered by the Guardian, Wired, the Huffington Post (etc.) How much interest has there been from the non-Western press or bloggers in the region?

Mark: There has been a lot of coverage from the non-Western press, particularly in Latin America and Asia. However, I haven’t actually seen that much coverage from the MENA region.

Ed: As an academic, do you feel at all personally invested in this, or do you see your role to be simply about the objective documentation and analysis of these patterns?

Mark: I don't believe there is any such thing as 'objective documentation.' All research has particular effects in and on the world, and I think it is important to be aware of the debates, processes, and practices surrounding any research project. Personally, I think Wikipedia is one of humanity's greatest achievements. No previous single platform or repository of knowledge has ever even come close to Wikipedia in terms of its scale or reach. However, that is all the more reason to critically investigate what exactly is, and isn't, contained within this fantastic resource. By revealing some of the biases and imbalances in Wikipedia, I hope that we're doing our bit to improve it.

Ed: What factors do you think would lead to greater representation in the region? For example: is this a matter of voices being actively (or indirectly) excluded, or are they maybe just not all that bothered?

Mark: This is certainly a complicated question. I think the most important step would be to encourage participation from the region, rather than just representation of the region. Some of this involves strengthening the enabling factors that are prerequisites for participation: increasing broadband access, increasing literacy, and encouraging more participation from women and minority groups.

Some of it is then changing perceptions around Wikipedia. For instance, many people that we spoke to in the region framed Wikipedia as an American or outside project rather than something that is locally created. Unfortunately we seem to be currently stuck in a vicious cycle in which few people from the region participate, thereby reinforcing the very perception that puts others off participating. There is also the issue of sources. Not only does Wikipedia require all assertions to be properly sourced, but secondary sources themselves can be a great source of raw informational material for Wikipedia articles. However, if few sources about a place exist, then it adds an additional burden to creating content about that place. Again, a vicious cycle of geographic representation.

My hope is that by both working on some of the necessary conditions to participation, and engaging in a diverse range of initiatives to encourage content generation, we can start to break out of some of these vicious cycles.

Ed: The final moonshot question: How would you like to extend this work; time and money being no object?

Mark: Ideally, I'd like us to better understand the geographies of representation and participation outside of just the MENA region. This would involve mixed-methods work (large-scale big data approaches combined with in-depth qualitative studies) focusing on multiple parts of the world. More broadly, I'm trying to build a research program that maintains a focus on a wide range of Internet and information geographies. The goal here is to understand participation and representation through a diverse range of online and offline platforms and practices, and to share that work through a range of publicly accessible media: for instance the 'Atlas of the Internet' that we're putting together.


Mark Graham was talking to blog editor David Sutcliffe.

Mark Graham is a Senior Research Fellow at the OII. His research focuses on Internet and information geographies, and the overlaps between ICTs and economic development.

Harnessing ‘generative friction’: can conflict actually improve quality in open systems? https://ensr.oii.ox.ac.uk/harnessing-generative-friction-can-conflict-improve-quality-in-open-systems/ Wed, 14 Aug 2013 12:18:35 +0000 http://blogs.oii.ox.ac.uk/policy/?p=2111
Image from “The Iraq War: A Historiography of Wikipedia Changelogs“, a twelve-volume set of all changes to the Wikipedia article on the Iraq War (totalling over 12,000 changes and almost 7,000 pages), by STML.

Ed: I really like the way that, contrary to many current studies on conflict and Wikipedia, you focus on how conflict can actually be quite productive. How did this insight emerge?

Kim: I was initially looking for instances of collaboration in Wikipedia to see how popular debates about peer production played out in reality. What I found was that conflict was significantly more prevalent than I had assumed. It struck me as interesting, as most of the popular debates at the time framed conflict as hindering the collaborative editorial process. After several stages of coding, I found that the conversations that involved even a minor degree of conflict were fascinating. A pattern emerged where disagreements about the editorial process resulted in community members taking positive actions to solve the discord and achieve consensus. This was especially prominent in early discussions prior to 2005 before many of the policies that regulate content production in the encyclopaedia were formulated. The more that differing points of view and differing evaluative frames came into contact, the more the community worked together to generate rules and norms to regulate and improve the production of articles.

Ed: You use David Stark’s concept of generative friction to describe how conflict is ‘central to the editorial processes of Wikipedia’. Can you explain why this is important?

Kim: Having different points of view come into contact is the premise of Wikipedia's collaborative editing model. When these views meet, Stark maintains there is an overlap of individuals' evaluative frames, or worldviews, and it is in this overlap that creative solutions to problems can occur. People come across solutions they may not otherwise have encountered in the typical homogeneous, hierarchical system that is traditionally the standard for institutions trying to maximize efficiency. In this respect, conflict is central to the process as it is about the struggle to negotiate meaning and achieve a consensus among editors with differing opinions and perspectives. Conflict can therefore be framed as generative, given it can result in innovative solutions to problems identified in the editorial process. In Wikipedia's case this can be seen in the creation of policies to regulate the editorial process, the development of technical tools to automate repetitive editing tasks, and the like. When thinking about large collaborative systems where more views come into contact, this research points to the fact that opening up processes that have traditionally been closed, like encyclopaedic print production, or indeed government or institutional processes, can result in creative and innovative solutions to problems.

Ed: This ‘generative friction’ is different from what you sometimes see on Wikipedia articles, where conflict degenerates into personal attacks. Did you find any evidence of this in your case study? Can this type of conflict ‘poison’ the others?

Kim: I actually found relatively few discussions where competing evaluative frames resulted in editors engaging in personal attacks. I was initially quite surprised by this finding as I was familiar with Wikipedia's early edit wars. On further examination of the conversations, I found that editors often referred to Wikipedia's policies as a way to manage debate and keep conflict to a minimum. For example, editors referred to policies on civility to keep behaviour within community norms, or to policies on verifiability to explain why some content sources aren't acceptable, and as a result relatively few instances of conflict devolved into personal attacks.

I do, however, feel that it is really important to further examine the role that conflict plays in the editorial process. At what point does conflict stop being productive and actually start to impede the production of quality content? What role does conflict play in the participation pattern of different social groups? There is still considerable research to be done on the role of conflict in Wikipedia, especially if we are to have a more nuanced understanding of how the encyclopaedia actually works.

Similarly, if we are to apply this to the concept of open government and politics, or to transparency in public policy and public institutions, then these forums will need to know whether they are providing truly open and inclusive spaces, or simply reflecting the most dominant voices.

Ed: You refer in your paper to how Wikipedia has changed over time. Can you talk a bit more about this and whether there are good longitudinal studies that you referred to?

Kim: Tracing conversations about an article over time has provided a snapshot not only of how the topic has been viewed and constructed in that time period, but also of how Wikipedia has been constructed as both a platform and an encyclopaedia. When Wikipedia's Australia article (which my case study was based on) was a new entry, editors worked together to discuss and talk out larger structural and ideological issues about the article. Who would be reading the article? Where should the infobox go? Should there be a standardised format across the encyclopaedia? How should articles be organised? As the article matured and the editorial community grew, discussions on the article talk page tended to be more content specific.

This finding should be taken in light of the study by Viégas et al. (2007) who found that active editors’ involvement with Wikipedia changes over time, from initially having a local (article) focus, to being more involved with issues of quality and the overall health of the community. This may account for how early active contributors to the “Australia” article were not present in more recent discussions on the talk page of the article. Indeed there have been a number of excellent studies and accounts of how the behaviour of editors has changed over time, including Suh et al. (2009) who found participation in Wikipedia to be declining, attributable in part to the conflict between existing active editors and new contributors, along with increased costs for managing the community as a whole.

These studies, and others like them, are really important for contributing to a wider understanding of Wikipedia and how it works, as it is only with more research about open collaboration and how it is played out, that we can apply the lessons learned to other situations.

Ed: What do you think is the relevance of this research to other avenues?

Kim: Societies are becoming more aware of the importance of active citizenship and involving diverse sections of the community in public consultation, and much of this activity can be carried out over the Internet. I would hope that this research adds to scholarship about participation in online spaces, be they social, political, cultural or civic. While it is about Wikipedia in particular, I hope that it adds to a growing knowledge base from which we can start to draw similarities and differences about how a variety of online communities operate, and the role of conflict in these spaces. So rather than relying on discourses about the conflict that results when many voices and views meet in an open space, we can start, as researchers, to investigate how friction and debate play out in reality. Because I do think that it is important to recognise the constructive role that conflict can play in a community like Wikipedia.

I also feel it’s really important to conduct more research on the role of conflict in online communities, as we don’t really know yet at what point the conflict stops being generative and starts to hinder the processes of a particular community. For instance, how does it affect the participation of conflict-avoiding cultures in different Wikipedias? How does it affect the participation of women? We know from the Wikimedia Foundation’s own research that these groups are significantly under-represented in the editorial community. So while conflict can play a positive role in content creation and production and this needs to be acknowledged, further research on conflict needs to consider how it affects participation in open spaces.

References

Stark, D. 2011. The sense of dissonance: Accounts of worth in economic life. Princeton, New Jersey: Princeton University Press.

Suh, B., Convertino, G., Chi, E. H. & Pirolli, P. 2009. The singularity is not near: The slowing growth of Wikipedia. Proceedings from WikiSym’09, 2009 International Symposium on Wikis, Orlando, Florida, U.S.A, October 25–27, 2009, Article 8. doi: 10.1145/1641309.1641322.

Viégas, F. B., Wattenberg, M., Kriss, J., van Ham, F. 2007. Talk before you type: Coordination in Wikipedia. In 40th Annual Hawaii International Conference on System Sciences, Hawaii, USA, January 3-6, 2007, 78. New York: ACM.


Read the full paper: Osman, K. (2013) The role of conflict in determining consensus on quality in Wikipedia articles. Presented at WikiSym ’13, 5-7 August 2013, Hong Kong, China.

Kim Osman is a PhD candidate at the ARC Centre of Excellence for Creative Industries and Innovation at the Queensland University of Technology. She is currently investigating the history of Wikipedia as a new media institution. Kim’s research interests include regulation and diversity in open environments, the social construction of technologies, and controversies in the history of technology.

Kim Osman was talking to blog editor Heather Ford.

Online crowd-sourcing of scientific data could document the worldwide loss of glaciers to climate change https://ensr.oii.ox.ac.uk/online-crowd-sourcing-of-scientific-data-could-document-the-worldwide-loss-of-glaciers-to-climate-change/ Tue, 14 May 2013 09:12:33 +0000 http://blogs.oii.ox.ac.uk/policy/?p=1021 Ed: Project Pressure has created a platform for crowdsourcing glacier imagery, often photographs taken by climbers and trekkers. Why are scientists interested in these images? And what’s the scientific value of the data set that’s being gathered by the platform?

Klaus: Comparative photography using historical images allows year-on-year comparisons that document glacier change. The platform aims to create long-lasting scientific value with minimal technical entry barriers — it is valuable to have a global resource that combines photographs generated by Project Pressure in less documented areas with crowdsourced images taken, for example, by climbers and trekkers, alongside archival pictures. The platform is future-focused and will hopefully allow an up-to-date view of glaciers across the planet.

The other ways for scientists to monitor glaciers take a lot of time and effort; direct measurement of snowfall is a complicated, resource-intensive and time-consuming process. And while glacier outlines can be traced from satellite imagery, this still needs to be done manually. Also, you can't measure ice thickness from satellite images, images can be obscured by debris and cloud cover, and some areas just don't have very many satellite fly-bys.

Ed: There are estimates that the glaciers of Montana’s Glacier National Park will likely be gone by 2020 and the Ugandan glaciers by 2025, and the Alps are rapidly turning into a region of lakes. These are the famous and very visible examples of glacier loss — what’s the scale of the missing data globally?

Klaus: There’s a lot of great research being conducted in this area; however, there are approximately 300,000 glaciers worldwide, with huge data gaps in South America and the Himalayas, for instance. Sharing of Himalayan data between Indian and Chinese scientists has been a sensitive issue, given glacier meltwater is an important strategic resource in the region. But this is a popular trekking route, and it is relatively easy to gather open-source data from the public. Furthermore, there are also numerous national and scientific archives with images lying around that don’t have a central home.

Ed: What metadata are being collected for the crowdsourced images?

Klaus: The public can upload their own photos embedded with GPS position, compass direction, and date. This data is aggregated into a single managed platform. With GPS becoming standard in cameras, it’s very simple to contribute to the project — taking photos with embedded GPS data is almost foolproof. The public can also contribute by uploading archival images and adding GPS data to old photographs.
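
As a rough illustration of that workflow, the sketch below shows how a platform might read the embedded capture date, GPS position and compass direction from an uploaded photo using the Pillow library. It assumes a reasonably recent Pillow release and a JPEG that actually carries EXIF data; the file name is hypothetical rather than anything from Project Pressure.

from PIL import Image, ExifTags

def to_decimal(dms, ref):
    """Convert EXIF degrees/minutes/seconds to a signed decimal coordinate."""
    value = float(dms[0]) + float(dms[1]) / 60 + float(dms[2]) / 3600
    return -value if ref in ("S", "W") else value

def read_photo_metadata(path):
    image = Image.open(path)
    raw = image._getexif() or {}  # EXIF tag ids -> values (JPEG/TIFF only)
    named = {ExifTags.TAGS.get(k, k): v for k, v in raw.items()}
    gps = {ExifTags.GPSTAGS.get(k, k): v for k, v in named.get("GPSInfo", {}).items()}
    return {
        "taken": named.get("DateTimeOriginal"),
        "lat": to_decimal(gps["GPSLatitude"], gps["GPSLatitudeRef"]) if "GPSLatitude" in gps else None,
        "lon": to_decimal(gps["GPSLongitude"], gps["GPSLongitudeRef"]) if "GPSLongitude" in gps else None,
        "bearing": float(gps["GPSImgDirection"]) if "GPSImgDirection" in gps else None,
    }

print(read_photo_metadata("glacier_photo.jpg"))  # hypothetical file name

Records like these are enough to place each image on a map and to line up photographs of the same glacier taken from roughly the same position in different years.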

Ed: So you are crowd sourcing the gathering of this data; are there any plans to crowd-source the actual analysis?

Klaus: It’s important to note that accuracy is very important in a database, and the automated (or semi-automated) process of data generation should result in good data. And while the analytical side should be done by professionals, we are making the data open source so it can be used in education, for instance. We need to harness what crowds are good at, and know what the limitations are.

Ed: You mentioned in your talk that the sheer amount of climate data — and also the way it is communicated — means that the public has become disconnected from the reality and urgency of climate change: how is the project working to address this? What are the future plans?

Klaus: Recent studies have demonstrated a disconnect between scientific information regarding climate change and the public. The problem is not access to scientific information, but the fact that it can be overwhelming. Project Pressure is working to reconnect the public with the urgency of the problem by inspiring people to action and participation, and to engage with climate change. Project Pressure is very scalable in terms of the scientific knowledge required to use the platform: from kids to scientists. On the interface one can navigate the world and find the locations and directions of photographs, and once funding permits we will also add the time dimension.

Ed: Project Pressure has deliberately taken a non-political stance on climate change: can you explain why?

Klaus: Climate change has unfortunately become a political subject, but we want to preserve our integrity by not taking a political stance. It’s important that everyone can engage with Project Pressure regardless of their political views. We want to be an independent, objective partner.

Ed: Finally .. what’s your own background? How did you get involved?

Klaus: I’m the founder, and my background is in communication and photography. Input on how to strengthen the conceptualisation has come from a range of very smart people; in particular, Dr M. Zemph from the World Glacier Monitoring Service has been very valuable.


Klaus Thymann was talking at the OII on 18 March 2013; he talked later to blog editor David Sutcliffe.

Crowdsourcing translation during crisis situations: are ‘real voices’ being excluded from the decisions and policies it supports? https://ensr.oii.ox.ac.uk/crowdsourcing-translation-during-crisis-situations-are-real-voices-being-excluded-from-the-decisions-and-policies-it-supports/ Tue, 07 May 2013 08:58:47 +0000 http://blogs.oii.ox.ac.uk/policy/?p=957 As revolution spread across North Africa and the Middle East in 2011, participants and observers of the events were keen to engage via social media. However, saturation by Arabic-language content demanded a new translation strategy for those outside the region to follow the information flows — and for those inside to reach beyond their domestic audience. Crowdsourcing was seen as the most efficient strategy in terms of cost and time to meet the demand, and translation applications that harnessed volunteers across the internet were integrated with nearly every type of ICT project. For example, as Steve Stottlemyre has already mentioned on this blog, translation played a part in tools like the Libya Crisis Map, and was essential for harnessing tweets from the region’s ‘voices on the ground.’

If you have ever worried about media bias then you should really worry about the impact of translation. Before the revolutions, the translation software for Egyptian Arabic was almost non-existent. Few translation applications were able to handle the different Arabic dialects or supply coding labor and capital to build something that could contend with internet blackouts. Google’s Speak to Tweet became the dominant application used in the Egyptian uprisings, delivering one homogenized source of information that fed the other sources. In 2011, this collaboration helped circumvent the problem of Internet connectivity in Egypt by allowing cellphone users to call their tweet into a voicemail to be transcribed and translated. A crowd of volunteers working for Twitter enhanced translation of Egyptian Arabic after the Tweets were first transcribed by a Mechanical Turk application trained from an initial 10 hours of speech.

The unintended consequence of these crowdsourcing applications was that when the material crossed the language barrier into English, it often became inaccessible to the original contributors. Individuals on the ground essentially ceded authorship to crowds of untrained volunteer translators who stripped the information of context, and then plotted it in categories and on maps without feedback from original sources. Controlling the application meant controlling the information flow, the lens through which the revolutions were conveyed to the outside world.

This flawed system prevented the original sources (e.g. in Libya) from interacting with the information that directly related to their own life-threatening situation, while the information became an unsound basis for decision-making by international actors. As Stottlemyre describes, ceding authorship was sometimes an intentional strategy, but also one imposed by the nature of the language/power imbalance and the failure of the translation applications and the associated projects to incorporate feedback loops or more two-way communication.

The after action report for the Libya Crisis Map project commissioned by the UN OCHA offers some insight into the disenfranchisement of sources from the decision-making process once they had provided information for the end product: the crisis map. In the final ‘best practices’ section reviewing the outcomes, the Standby Task Force, which created the map, described decision-makers and sources, but did not consider or mention the sources’ access to decision-making, to the map, or to a mechanism by which they could feed back into the decision-making chain. In essence, Libyans were not seen as part of the user group of the product they helped create.

How exactly do translation and crowdsourcing shape our understanding of complex developing crises, or influence subsequent policy decisions? The SMS polling initiative launched by Al Jazeera English in collaboration with Ushahidi, a prominent crowdsourcing platform, illustrates the most common process of visualizing crisis information: translation, categorization, and mapping. In December 2011, Al Jazeera launched Somalia Speaks, with the aim of giving a voice to the people of Somalia and sharing a picture of how violence was impacting everyday lives. The two have since repeated this project in Mali, to share opinions about the military intervention in the north. While Al Jazeera is a news organization, not a research institute or a government actor, it plays an important role in informing electorates who can put political pressure on governments involved in the conflict. Furthermore, this same type of technology is being used on the ground to gather information in crisis situations at the governmental and UN levels.

A call for translators in the diaspora, particularly Somali student groups, was issued online, and phones were distributed on the ground throughout Somalia so multiple users could participate. The volunteers translated the SMSs and categorized the content as either political, social, or economic. The results were color-coded and aggregated on a map.


The stated goal of the project was to give a voice to the Somali people, but the Somalis who participated had no say in how their voices were categorized or depicted on the map. The SMS poll asked an open question:

How has the Somalia conflict affected your life?

In one response example:

The Bosaso Market fire has affected me. It happened on Saturday.

The response was categorized as ‘social.’ But why didn’t the fact that violence happened in a market, an economic centre, denote ‘economic’ categorization? There was no guidance for maintaining consistency among the translators, nor any indication of how the information would be used later. It was these categories chosen by the translators, represented as bright colorful circles on the map, which were speaking to the world, not the Somalis — whose voices had been lost through a crowdsourcing application that was designed with a language barrier. The primary sources could not suggest another category that better suited the intentions of their responses, nor did they understand the role categories would play in representing and visualizing their responses to the English language audience.

Somalia Crisis Map

An 8 December 2011 comment on the Ushahidi blog described in compelling terms how language and control over information flow impact the power balance during a conflict:

A—-, My friend received the message from you on his phone. The question says “tell us how is conflict affecting your life” and “include your name of location”. You did not tell him that his name will be told to the world. People in Somalia understand that sms is between just two people. Many people do not even understand the internet. The warlords have money and many contacts. They understand the internet. They will look at this and they will look at who is complaining. Can you protect them? I think this project is not for the people of Somalia. It is for the media like Al Jazeera and Ushahidi. You are not from here. You are not helping. It is better that you stay out.

Ushahidi director Patrick Meier, responded to the comment:

Patrick: Dear A—-, I completely share your concern and already mentioned this exact issue to Al Jazeera a few hours ago. I’m sure they’ll fix the issue as soon as they get my message. Note that the question that was sent out does *not* request people to share their names, only the name of their general location. Al Jazeera is careful to map the general location and *not* the exact location. Finally, Al Jazeera has full editorial control over this project, not Ushahidi.

As of 14 January 2012, there were still names featured on the Al Jazeera English website.

The danger is that these categories — economic, political, social — become the framework for aid donations and policy endeavors; the application frames the discussion rather than the words of the Somalis. The simplistic categories become the entry point for policy-makers and citizens alike to understand and become involved with translated material. But decisions and policies developed from the translated information are less connected to ‘real voices’ than we would like to believe.

Developing technologies so that Somalis or Libyans — or any group sharing information via translation — are themselves directing the information flow about the future of their country should be the goal, rather than perpetual simplification into the client / victim that is waiting to be given a voice.

Did Libyan crisis mapping create usable military intelligence? https://ensr.oii.ox.ac.uk/did-libyan-crisis-mapping-create-usable-military-intelligence/ Thu, 14 Mar 2013 10:45:22 +0000 http://blogs.oii.ox.ac.uk/policy/?p=817 The Middle East has recently witnessed a series of popular uprisings against autocratic rulers. In mid-January 2011, Tunisian President Zine El Abidine Ben Ali fled his country, and just four weeks later, protesters overthrew the regime of Egyptian President Hosni Mubarak. Yemen’s government was also overthrown in 2011, and Morocco, Jordan, and Oman saw significant governmental reforms leading, if only modestly, toward the implementation of additional civil liberties.

Protesters in Libya called for their own ‘day of rage’ on February 17, 2011, marked by violent protests in several major cities, including the capital Tripoli. As they transformed from ‘protestors’ to ‘Opposition forces’ they began pushing information onto Twitter, Facebook, and YouTube, reporting their firsthand experiences of what had turned into a civil war virtually overnight. The evolving humanitarian crisis prompted the United Nations to request the creation of the Libya Crisis Map, which was made public on March 6, 2011. Other, more focused crisis maps followed, and were widely distributed on Twitter.

While the map was initially populated with humanitarian information pulled from the media and online social networks, as the imposition of an internationally enforced No Fly Zone (NFZ) over Libya became imminent, information began to appear on it that appeared to be of a tactical military nature. While many people continued to contribute conventional humanitarian information to the map, the sudden shift toward information that could aid international military intervention was unmistakable.

How useful was this information, though? Agencies in the U.S. Intelligence Community convert raw data into usable information (incorporated into finished intelligence) by following some form of the Intelligence Process. As outlined in the U.S. military’s joint intelligence manual, this consists of six interrelated steps, all centered on a specific mission. It is interesting that many Twitter users, though perhaps unaware of the intelligence process, replicated each step during the Libyan civil war, producing finished intelligence adequate for consumption by NATO commanders and rebel leadership.

It was clear from the beginning of the Libyan civil war that very few people knew exactly what was happening on the ground. Even NATO, according to one of the organization’s spokesmen, lacked the ground-level informants necessary to get a full picture of the situation in Libya. There is no public information about the extent to which military commanders used information from crisis maps during the Libyan civil war. According to one NATO official, “Any military campaign relies on something that we call ‘fused information’. So we will take information from every source we can… We’ll get information from open source on the internet, we’ll get Twitter, you name any source of media and our fusion centre will deliver all of that into useable intelligence.”

The data in these crisis maps came from a variety of sources, including journalists, official press releases, and civilians on the ground who updated blogs and/or maintained telephone contact. The @feb17voices Twitter feed (translated into English and used to support the creation of The Guardian’s and the UN’s Libya Crisis Map) included accounts of live phone calls from people on the ground in areas where the Internet was blocked, and where there was little or no media coverage. Twitter users began compiling data and information; they tweeted and retweeted data they collected, information they filtered and processed, and their own requests for specific data and clarifications.

Information from various Twitter feeds was then published in detailed maps of major events that contained information pertinent to military and humanitarian operations. For example, as fighting intensified, @LibyaMap’s updates began to provide a general picture of the battlefield, including specific, sourced intelligence about the progress of fighting, humanitarian and supply needs, and the success of some NATO missions. Although it did not explicitly state its purpose as spreading mission-relevant intelligence, the nature of the information renders alternative motivations highly unlikely.

Interestingly, the Twitter users featured in a June 2011 article by the Guardian had already explicitly expressed their intention of affecting military outcomes in Libya by providing NATO forces with specific geographical coordinates to target Qadhafi regime forces. We could speculate at this point about the extent to which the Intelligence Community might have guided Twitter users to participate in the intelligence process; while NATO and the Libyan Opposition issued no explicit intelligence requirements to the public, they tweeted stories about social network users trying to help NATO, likely leading their online supporters to draw their own conclusions.

It appears from similar maps created during the ongoing uprisings in Syria that the creation of finished intelligence products by crisis mappers may become a regular occurrence. Future study should focus on determining the motivations of mappers for collecting, processing, and distributing intelligence, particularly as a better understanding of their motivations could inform research on the ethics of crisis mapping. It is reasonable to believe that some (or possibly many) crisis mappers would be averse to their efforts being used by military commanders to target “enemy” forces and infrastructure.

Indeed, some are already questioning the direction of crisis mapping in the absence of professional oversight (Global Brief 2011): “[If] crisis mappers do not develop a set of best practices and shared ethical standards, they will not only lose the trust of the populations that they seek to serve and the policymakers that they seek to influence, but (…) they could unwittingly increase the number of civilians being hurt, arrested or even killed without knowing that they are in fact doing so.”


Read the full paper: Stottlemyre, S., and Stottlemyre, S. (2012) Crisis Mapping Intelligence Information During the Libyan Civil War: An Exploratory Case Study. Policy and Internet 4 (3-4).

Preserving the digital record of major natural disasters: the CEISMIC Canterbury Earthquakes Digital Archive project https://ensr.oii.ox.ac.uk/preserving-the-digital-record-of-major-natural-disasters-the-ceismic-canterbury-earthquakes-digital-archive-project/ Fri, 29 Jun 2012 09:57:55 +0000 http://blogs.oii.ox.ac.uk/policy/?p=277 The 6.2 magnitude earthquake that struck the centre of Christchurch on 22 February 2011 claimed 185 lives, damaged 80% of the central city beyond repair, and forced the abandonment of 6000 homes. It was the third costliest insurance event in history. The CEISMIC archive developed at the University of Canterbury will soon have collected almost 100,000 digital objects documenting the experiences of the people and communities affected by the earthquake, all of it available for study.

The Internet can be hugely useful to coordinate disaster relief efforts, or to help rebuild affected communities. Paul Millar came to the OII on 21 May 2012 to discuss the CEISMIC archive project and the role of digital humanities after a major disaster (below). We talked to him afterwards.

Ed: You have collected a huge amount of information about the earthquake and people’s experiences that would otherwise have been lost: how do you think it will be used?

Paul: From the beginning I was determined to avoid being prescriptive about eventual uses. The secret of our success has been to stick to the principles of open data, open access and collaboration — the more content we can collect, the better chance future generations have to understand and draw conclusions from our experiences, behaviour and decisions. We have already assisted a number of research projects in public health and the social and physical sciences; even accounting. One of my colleagues reads balance sheets the way I read novels, and discovers all sorts of earthquake-related signs of cause and effect in them. I’d never have envisaged such a use for the archive. We have made our ontology as detailed and flexible as possible in order to help with re-purposing of primary material: we currently use three layers of metadata — machine generated, human curated and crowd sourced. We also intend to work more seriously on our GIS capabilities.
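
This is not the project's actual schema, but a minimal sketch of what keeping those three metadata layers separate might look like in practice; every field name and value below is an assumption made purely for illustration.

from dataclasses import dataclass, field

@dataclass
class ArchiveItem:
    """One digital object with its three metadata layers kept apart."""
    item_id: str
    machine: dict = field(default_factory=dict)  # e.g. file format, camera timestamp, GPS
    curated: dict = field(default_factory=dict)  # e.g. archivist-assigned subjects and rights
    crowd: dict = field(default_factory=dict)    # e.g. public tags, corrections, personal stories

item = ArchiveItem(
    item_id="item-000123",  # hypothetical identifier
    machine={"format": "image/jpeg", "captured": "2011-02-22T12:55:00+13:00"},
    curated={"subjects": ["central city damage", "heritage buildings"]},
    crowd={"tags": ["Cathedral Square"], "story": "Taken minutes after the first shock."},
)

# Keeping the layers separate makes re-purposing easier: a GIS view can read just
# the machine layer, while an oral-history project draws mainly on the crowd layer.
print(item.machine["captured"], item.crowd["tags"])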

Ed: How do you go about preserving this information during a period of tremendous stress and chaos? Was it difficult to convince people of the importance of this longer-term view?

Paul: There was no difficulty convincing people of the importance of what we were doing: everyone got it immediately. However, the scope of this disaster is difficult to comprehend, even for those of us who live with it every day. We’ve lost a lot of material already, and we’re losing more every day. Our major telecommunications provider recently switched off its CDMA network — all those redundant phones are gone, and with them any earthquake pictures or texts that might have been stored. One of the things I’d encourage every community to do now is make an effort to preserve key information against a day of disaster. If we’d digitised all our architectural plans of heritage buildings and linked them electronically to building reports and engineering assessments, we might have saved more.

Ed: It seems obvious in hindsight that the Internet can (and should be) be tremendously useful in the event of this sort of disaster: how do we ensure that best use is made?

Paul: The first thing is to be prepared, even in a low-key way, for whatever might happen. Good decision-making during a disaster requires accurate, accessible, and comprehensive data: digitisation and data linking are key activities in the creation of such a resource — and robust processes to ensure that information is of high quality are vital. One of the reasons CEISMIC works is because it is a federated archive — an ideal model for this sort of event — and we were able to roll it out extremely quickly. We could also harness online expert communities, crowd-sourcing efforts, open sourcing of planning processes, and robust vetting of information and auditing of outcomes. A lot of this needs to be done before a disaster strikes, though. For years I’ve encountered the mantra ‘we support research but we don’t fund databases’. We had to build CEISMIC because there was no equivalent, off-the-shelf product — but that development process lost us a year at least.

Ed: What equivalent efforts are there to preserve information about major disasters?

Paul: The obvious ones are the world-leading projects out of the Center for History and New Media at George Mason University, including their 9/11 Digital Archive. One problem for any archive of this nature is that information doesn’t exist in a free and unmediated space. For example, the only full record of the pre-quake Christchurch cityscape is historic Google Street View; one of the most immediate sources of quake information was Twitter; many people communicated with the world via Facebook, and so on. It’s a question we’re all engaging with: who owns that information? How will it be preserved and accessed? We’ve had a lot of interest in what we are doing, and plenty of consultation and discussion with groups who see our model as being of some relevance to them. The UC CEISMIC project is essentially a proof of concept — versions of it could be rolled out around the world and left to tick over in the background, quietly accumulating material in the event that it is needed one day. That’s a small cost alongside losing a community’s heritage.

Ed: What difficulties have you encountered in setting up the archive?

Paul: Where do I start? There were the personal difficulties — my home damaged, my family traumatised, the university damaged, staff and students all struggling in different ways to cope: it’s not the ideal environment to try and introduce a major IT project. But I felt I had to do something, partly as a therapeutic response. I saw my engineering and geosciences colleagues at the front of the disaster, explaining what was happening, helping to provide context and even reassurance. For quite a while I wondered what on earth a professor of literature could do. It was James Smithies – now CEISMIC’s Project Manager – who reminded me of the 9/11 Archive. The difficulties we’ve encountered since have been those that beset most under-resourced projects — trying to build a million dollar project on a much smaller budget. A lot of the future development will be funding dependent, so much of my job will be getting the word out and looking for sponsors, supporters and partners. But although we’re understaffed, over-worked and living in a shaky city, the resilience, courage, humanity and good will of so many people never ceases to amaze and hearten me.

Ed: Your own research area is English Literature: has that had any influence on the sorts of content that have been collected, or your own personal responses to it?

Paul: My interest in digital archiving started when teaching New Zealand Literature at Victoria University of Wellington. In a country this small most books have a single print run of a few hundred, and even our best writers are lucky to have a text make it to a second edition. I therefore encountered the problem that many of the texts I wanted to prescribe were out of print: digitisation seemed like a good solution. In New Zealand the digital age has negated distance — the biggest factor preventing us from immediate and meaningful engagement with the rest of the world. CEISMIC actually started life as an acronym (the Canterbury Earthquakes Images, Stories and Media Integrated Collection), and the fact that ‘stories’ sits centrally certainly represents my own interest in the way we use narratives to make sense of experience. Everyone who went through the earthquakes has a story, and every story is different. I’m fascinated by the way a collective catastrophe becomes so much more meaningful when it is broken down into individual narratives. Ironically, despite the importance of this project to me, I find the earthquakes extremely difficult to write about in any personal or creative way. I haven’t written my own earthquake story yet.


Paul Millar was talking to blog editor David Sutcliffe.
