Uncovering the patterns and practice of censorship in Chinese news sites

In January 2010 Google announced that in response to a Chinese-originated hacking attack they would stop censoring searches in China and would pull out of the country if necessary. Image by Cory M. Grenier.

Ed: How much work has been done on censorship of online news in China? What are the methodological challenges and important questions associated with this line of enquiry?

Sonya: Recent research is paying much attention to social media, aiming to quantify their censorial practices and to discern common patterns in them. Among these empirical studies, Bamman et al.’s (2012) work claimed to be “the first large-scale analysis of political content censorship” that investigates messages deleted from Sina Weibo, a Chinese equivalent to Twitter. On an even larger scale, King et al. (2013) collected data from nearly 1,400 Chinese social media platforms and analyzed the deleted messages. Most studies on news censorship, however, are devoted to narratives of special cases, such as the closure of Freezing Point, an outspoken news and opinion journal, and the blocking of the New York Times after it disclosed the wealth possessed by the family of former Chinese premier Wen Jiabao.

The shortage of news censorship research could be attributed to several methodological challenges. First, it is tricky to detect censorship to begin with, given the word ‘censorship’ is one of the first to be censored. Also, news websites will not simply let their readers hit a glaring “404 page not found”. Instead, they will use a “soft 404”, which returns a “success” code for a request of a deleted web page and takes readers to a (different) existing web page. While humans may be able to detect these soft 404s, it will be harder for computer programs (eg run by researchers) to do so. Moreover, because different websites employ varying soft 404 techniques, much labor is required to survey them and to incorporate the acquired knowledge into a generic monitoring tool.
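To make the soft 404 problem concrete, here is a minimal sketch of how a monitoring program might flag one. It is not the tool used in the study: the `Response` type, the `classify` function, the two heuristics, and the 0.9 similarity threshold are all assumptions chosen for illustration. The idea is to compare each fetched page against the page the same site returns for a URL that certainly never existed.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from urllib.parse import urlsplit


@dataclass
class Response:
    """Minimal stand-in for an HTTP response (hypothetical type for this sketch)."""
    requested_url: str
    final_url: str  # URL after following any redirects
    status: int
    body: str


def classify(resp: Response, junk_body: str, threshold: float = 0.9) -> str:
    """Return 'ok', 'hard_404', 'soft_404', or 'error' for one fetched article.

    junk_body is the page the same site serves for a URL that never existed
    (for example a random path). threshold is an assumed similarity cutoff.
    """
    if resp.status in (404, 410):
        return "hard_404"
    if resp.status != 200:
        return "error"  # timeouts, 5xx and so on need separate handling
    # Heuristic 1: a "success" that landed on a different path, e.g. the home page.
    if urlsplit(resp.final_url).path != urlsplit(resp.requested_url).path:
        return "soft_404"
    # Heuristic 2: the body is nearly identical to the site's page for a junk URL.
    if SequenceMatcher(None, resp.body, junk_body).ratio() >= threshold:
        return "soft_404"
    return "ok"


junk = "<html>Sorry, the page you requested does not exist.</html>"
url = "http://news.example.com/a/1.html"
live = Response(url, url, 200, "<html>Full article text about a local event ...</html>")
moved = Response(url, "http://news.example.com/", 200, "<html>Home</html>")
masked = Response(url, url, 200, junk)
gone = Response(url, url, 404, "")

for name, resp in [("live", live), ("moved", moved), ("masked", masked), ("gone", gone)]:
    print(name, classify(resp, junk))
```

Both heuristics can misfire (a legitimate redirect would be flagged, for instance), which is one reason each website's soft 404 behavior has to be surveyed individually.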

Second, high computing power and bandwidth are required to handle the large amount of news publications and the slow network access to Chinese websites. For instance, NetEase alone publishes 8,000 – 10,000 news articles every day. Meanwhile, the Internet connection between the Chinese cyberspace and the outer world is fairly slow and it takes more than a second to check one link because the Great Firewall checks both incoming and outgoing Internet traffic. These two factors translate to 2-3 hours for a single program to check one day’s news publications of NetEase alone. If we fire up too many programs to accelerate the progress, the database system and/or the network connection may be challenged. In my case, even though I am using high performance computers at Michigan State University to conduct this research, they are overwhelmed every now and then.
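The 2-3 hour figure follows from simple arithmetic, sketched below. The function name `hours_to_check` and the `workers` parameter are illustrative assumptions; the article counts and the one-second-per-link figure come from the paragraph above, and since a check takes "more than a second", the results are lower bounds.

```python
def hours_to_check(articles: int, seconds_per_link: float = 1.0, workers: int = 1) -> float:
    """Hours needed to check every link once, assuming workers split the load evenly."""
    return articles * seconds_per_link / workers / 3600


for n in (8_000, 10_000):
    print(f"{n} articles: at least {hours_to_check(n):.1f} h with one program")
```

Adding workers divides the time proportionally in this idealized model, but in practice the database and network become the bottleneck, as noted above.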

Despite all the difficulties, I believe it is of great importance to reveal censored news stories to the public, especially to the audience inside China who do not enjoy a free flow of information. Censored news is a special type of information, as it is too inconvenient to exist in authorities’ eyes and it is deemed important to citizens’ everyday lives. For example, the outbreak of SARS had been censored from Chinese media presumably to avoid spoiling the harmonious atmosphere created for the 16th National Congress of the Communist Party. This allowed the virus to develop into a worldwide epidemic. Like SARS, a variety of censored issues are not only inconvenient but also crucial, because the authorities would not allocate substantial resources to monitor or eliminate them if they were merely trivial. Therefore, after censored news is detected, it is vital to seek effective and efficient channels to disclose it to the public so as to counterbalance potential damage that censorship may entail.

Ed: You found that party organs, ie news organizations tightly affiliated with the Chinese Communist Party, published a considerable amount of deleted news. Was this surprising?

Sonya: Yes, I was surprised when looking at the results the first time. To be exact, our finding is that commercial media experience a higher deletion rate, but party organs contribute the most deleted news by sheer volume, reflecting the fact that party organs possess more resources allocated by the central and local governments and therefore have the capacity to produce more news. Consequently, party organs have a higher chance of publishing controversial information that may be deleted in the future, especially when a news story becomes sensitive for some reason that is hard to foresee. For example, investigations of some government officials started when netizens recognized them in the news with different luxury watches and other expensive accessories. As such, even though party organs are obliged to write odes to the party, they may eventually backfire on the cadres if the beautiful words are discovered to be too far from reality.

Ed: How sensitive are citizens to the fact that some topics are actively avoided in the news media? And how easy is it for people to keep abreast of these topics (eg the “three Ts” of Tibet, Taiwan, and Tiananmen) from other information sources?

Sonya: This question highlights the distinction between pre-censorship and post-censorship. Our study looked at post-censorship, ie information that is published but subsequently deleted. By contrast, the topics that are “actively avoided” fall under the category of pre-censorship. I am fairly convinced that the current pre- and post-censorship practice is effective in terms of keeping the public from learning inconvenient facts and from mobilizing for collective action. If certain topics are consistently wiped from the mass media, how will citizens ever get to know about them?

The Tiananmen Square protest, for instance, has never been covered by Chinese mass media, leaving an entire generation growing up since 1989 that is ignorant of this historical event. As such, if younger Chinese citizens have never heard of the Tiananmen Square protest, how could they possibly start an inquiry into this incident? Or, if they have heard of it and attempt to learn about it from the Internet, what they will soon realize is that domestic search engines, social media, and news media all fail their requests and foreign ones are blocked. Certainly, they could use circumvention tools to bypass the Great Firewall, but the sad truth is that probably under 1% of them have ever made such an effort, according to the Harvard Berkman Center’s report in 2011.

Ed: Is censorship of domestic news (such as food scares) more geared towards “avoiding panics and maintaining social order”, or just avoiding political embarrassment? For example, do you see censorship of environmental issues and (avoidable) disasters?

Sonya: The government certainly tries to avoid political embarrassment in the case of food scares by manipulating news coverage, but it is also their priority to maintain social order or so-called “social harmony”. Exactly for this reason, Zhao Lianhai, the most outspoken parent of a toxic milk powder victim, was charged with “inciting social disorder” and sentenced to two and a half years in prison. Frustrated by Chinese milk powder, Chinese tourists are aggressively stocking up on milk powder from elsewhere, such as in Hong Kong and New Zealand, causing panics over milk powder shortages in those places.

After the earthquake in Sichuan, another group of grieving parents were arrested on similar charges when they questioned why their children were buried under crumbled schools whereas older buildings remained standing. The high death toll of this earthquake was among the avoidable disasters that the government attempts to mask and force the public to forget. Environmental issues, along with land acquisition, social unrest, and labor exploitation, are other frequently censored topics in the name of “stability maintenance”.

Ed: You plotted a map to show the geographic distribution of news deletion: what does the pattern show?

Sonya: We see an apparent geographic pattern in news deletion, with news about neighboring countries being more likely to be deleted than news about distant ones. Border disputes between China and its neighbors may be one cause; for example with Japan over the Diaoyu-Senkaku Islands, with the Philippines over the Huangyan Island-Scarborough Shoal, and with India over South Tibet. Another reason may be a concern over maintaining allies. Burma had the highest deletion rate among all the countries, with the deleted news mostly covering its curb on censorship. Watching this shift, China might worry that media reform in Burma could lead to copycat attempts inside China.

On the other hand, China has given Burma diplomatic cover, considering it as the “second coast” to the Indian Ocean and importing its natural resources (Howe & Knight, 2012). For these reasons, China may be compelled to censor Burma more than other countries, even though they don’t share a border. Nonetheless, although oceans apart, the US topped the list by sheer number of news deletions, reflecting the bittersweet relation between the two nations.

Ed: What do you think explains the much higher levels of censorship reported by others for social media than for news media? How does geographic distribution of deletion differ between the two?

Sonya: The deletion rates of online news are apparently far lower than those of Sina Weibo posts. The overall deletion rates on NetEase and Sina Beijing were 0.05% and 0.17%, compared to 16.25% on the social media platform (Bamman et al., 2012). Several reasons may help explain this gap. First, social media confronts enduring spam that has to be cleaned up constantly, whereas spam is not a problem at all for professional news aggregators. Second, self-censorship practiced by news media plays an important role, because Chinese journalists are more obliged and prepared to self-censor sensitive information, compared to ordinary Chinese citizens. Consequently, news media rarely mention “crown prince party” or “democracy movement”, which were among the most frequently deleted terms on Sina Weibo.

Geographically, the deletion rates across China have distinct patterns on news media and social media. Regarding Sina Weibo, deletion rates increase when the messages are published near the fringe or in the west where the economy is less developed. Regarding news websites, the deletion rates rise as they approach the center and east, where the economy is better developed. In addition, the provinces surrounding Beijing also have more news deleted, meaning that political concerns are a driving force behind content control.

Ed: Can you tell if the censorship process mostly relies on searching for sensitive keywords, or on more semantic analysis of the actual content? ie can you (or the censors..) distinguish sensitive “opinions” as well as sensitive topics?

Sonya: First, topics that are too sensitive, such as the Tiananmen Square protest, will never survive pre-censorship or be published on news websites, although they may sneak in on social media with deliberate typos or other circumvention techniques. However, it is clear that censors use keywords to locate articles on sensitive topics. For instance, after the Fukushima earthquake in 2011, rumors spread in Chinese cyberspace that radiation was rising from the Japanese nuclear plant and that iodine would help protect against its harmful effects; this was followed by panic-buying of iodized salt. During this period, “nuclear defense”, “iodized salt” and “radioactive iodine” (among other generally neutral terms) became politically charged overnight, and were censored in the Chinese web sphere. The taboo list of post-censorship keywords evolves continuously to handle breaking news. Beyond keywords, party organs and other online media are trying to automate sentiment analysis and discern more subtle context. People’s Daily, for instance, has been working with elite Chinese universities in this field and has already developed a generic product for other institutes to monitor “public sentiment”.

Another way to sort out sensitive information is to keep an eye on the most popular stories, because a popular story would represent a greater “threat” to the existing political and social order. In our study, about 47% of the deleted stories were listed among the top 100 most read/discussed at some point. This indicates that the more readership a story gains, the more attention it draws from censors.

Although news websites self-censor (therefore experiencing under 1% post-censorship), they are also required to monitor and “clean” comments following each news article. According to my very conservative estimate, which assumes that a censor processes 100 comments per minute and works eight hours per day, reviewing comments on Sina Beijing from 11-16 September 2012 would have required 336 censors working full time. In fact, Charles Cao, CEO of Sina, mentioned to Forbes that at least 100 censors were “devoted to monitoring content 24 hours a day”. As new sensitive issues emerge and new circumvention techniques are developed continuously, it is an ongoing battle between the collective intelligence of Chinese netizens and the mechanical work conducted (and artificial intelligence implemented) by a small group of censors.
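The workload estimate above rests on a simple capacity calculation, sketched here. Only the two rates (100 comments per minute, eight hours per day) come from the interview; the function names are illustrative, and the one-million-comment volume in the usage example is a purely hypothetical input, since the actual comment count is not reported here.

```python
import math

COMMENTS_PER_MINUTE = 100  # rate assumed in the interview's estimate
HOURS_PER_DAY = 8          # working day assumed in the interview's estimate


def daily_capacity(rate: int = COMMENTS_PER_MINUTE, hours: int = HOURS_PER_DAY) -> int:
    """Comments one censor can review in a working day."""
    return rate * 60 * hours


def censors_needed(comments_per_day: int) -> int:
    """Full-time censors required to review a given daily comment volume."""
    return math.ceil(comments_per_day / daily_capacity())


print(daily_capacity())           # one censor's daily capacity
print(censors_needed(1_000_000))  # hypothetical volume, for illustration only
```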

Ed: It must be a cause of considerable anxiety for journalists and editors to have their material removed. Does censorship lead to sanctions? Or is the censorship more of an annoyance that must be negotiated?

Sonya: Censorship does indeed lead to sanctions. However, I don’t think “anxiety” would be the right word to describe their feelings, because if they are really anxious they could always choose self-censorship and avoid embarrassing the authorities. Considering it is fairly easy to predict whether a news report will please or irritate officials, I believe what drives the whistleblowers when they disclose inconvenient facts is a strong sense of justice and tremendous audacity. Moreover, I could barely discern any “negotiation” in the process of censorship. Negotiation is at least a two-way communication, whereas censorship follows continual orders sent from the authorities to the mass media, and similarly propaganda is a one-way communication from the authorities to the masses via the media. As such, it is common to see disobedient journalists threatened or punished for “defying” censorial orders.

Southern Metropolis Daily is one of China’s most aggressive and most punished newspapers. In 2003, the newspaper broke the story of the SARS epidemic that local officials had wished to hide from the public. Soon after this report, it covered the case of a university graduate beaten to death in police custody because he carried no proper residency papers. Both cases received enormous attention from Chinese authorities and the international community, seriously embarrassing local officials. It is alleged and widely believed that some local officials demanded harsh penalties for the Daily; the director and the deputy editor were sentenced to 11 and 12 years in jail for “taking bribes” and “misappropriating state-owned assets”, and the chief editor was dismissed.

Not only professional journalists but also (broadly defined) citizen journalists could face similar penalties. For instance, Xu Zhiyong, a lawyer who defended journalists on trial, and Ai Weiwei, an artist who tried to investigate collapsed schools after the Sichuan earthquake, have faced fines for tax evasion, physical attacks, house arrest, and secret detention. These are exactly the same censorship tactics that states carried out before the advent of the Internet, as described in Ilan Peleg’s (1993) book Patterns of Censorship Around the World.

Ed: What do you think explains the lack of censorship in the overseas portal? (Could there be a certain value for the government in having some news items accessible to an external audience, but unavailable to the internal one?)

Sonya: It is more costly to control content by searching for and deleting individual news stories than by simply blocking a whole website. For this reason, when a website outside the Great Firewall carries content embarrassing to the Chinese government, Chinese censors will simply block the whole website rather than request deletions. Overseas branches of Chinese media may comply, but foreign media may simply ignore such a deletion request.

Given online users’ behavior, it is effective and efficient to strictly control domestic content. In general, there are two types of Chinese online users, those who only visit Chinese websites operating inside China and those who also consume content from outside the country. Regarding this second type, it is really hard to prescribe what they do and don’t read, because they may be well equipped with circumvention tools and often obtain access to Chinese media published in Hong Kong and Taiwan but blocked in China. In addition, some Western media, such as the BBC, the New York Times, and Deutsche Welle, make media consumption easy for Chinese readers by publishing in Chinese. Of course, this type of Chinese user may be well educated and able to read English and other foreign languages directly. Facing these people, Chinese authorities would see their efforts in vain if they tried to censor overseas branches of Chinese media, because, outside the Great Firewall, there are too many sources for information that lie beyond the reach of Chinese censors.

Chinese authorities are in fact strategically wise in putting their efforts into controlling domestic online media, because this first type of Chinese user accounts for 99.9% of the whole online population, according to Google’s 2010 estimate. In his 2013 book Rewire, Ethan Zuckerman summarizes this phenomenon: “none of the top ten nations [in terms of online population] looks at more than 7 percent international content in its fifty most popular news sites” (p. 56). Since the majority of the Chinese populace perceives the domestic Internet as “the entire cyberspace”, manipulating the content published inside the Great Firewall means that (according to Chinese censors) many of the time bombs will have been defused.

Read the full paper: Sonya Yan Song, Fei Shen, Mike Z. Yao, Steven S. Wildman (2013) Unmasking News in Cyberspace: Examining Censorship Patterns of News Portal Sites in China. Presented at “China and the New Internet World”, International Communication Association (ICA) Preconference, Oxford Internet Institute, University of Oxford, June 2013.

Sonya Y. Song led this study as a Google Policy Fellow in 2012. Currently, she is a Knight-Mozilla OpenNews Fellow and a Ph.D. candidate in media and information studies at Michigan State University. Sonya holds a bachelor’s and master’s degree in computer science from Tsinghua University in Beijing and master of philosophy in journalism from the University of Hong Kong. She is also an avid photographer, a devotee of literature, and a film buff.

Sonya Yan Song was talking to blog editor David Sutcliffe.