China's Twitter Censorship Speed Measured By Computer Scientists

Computer scientists at Rice University in Houston, Texas, have conducted a study into just how quickly Chinese authorities can censor messages posted on Weibo, China's version of Twitter. With the data collected, they've been able to determine censorship patterns, techniques, and the size of the workforce needed to carry out the post-deleting operation.

Weibo, which works much like Twitter (140-character messages with @usernames and #hashtags), currently has 300 million users who send 100 million messages per day, or roughly 70,000 every minute. With numbers like these, the question isn't so much "how fast?" as simply "how?"

For the experiment, Dan Wallach of Rice University and his team collected posts from a set of 3,500 users once every minute, then tracked whether and when those posts later became unavailable. Of the posts made by these users over the 12-day period, around 4,500 were deleted every day; 30% of those were gone within 5-30 minutes, and nearly 90% within 24 hours.
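The polling approach described above can be sketched in a few lines. This is only an illustration, not the researchers' actual tooling: the `snapshots` structure, the function names, and the sample data are all assumptions made up for this example. Each snapshot records which post IDs were visible at a given minute, and a post counts as deleted the first time it is missing from a later snapshot.

```python
def deletion_delays(snapshots):
    """Map each deleted post to the minutes between first sighting and first absence.

    snapshots: list of (minute, set_of_visible_post_ids) in time order.
    This is a hypothetical input format chosen for the sketch.
    """
    first_seen = {}
    delays = {}
    for minute, visible in snapshots:
        # Remember when each post was first observed.
        for post_id in visible:
            first_seen.setdefault(post_id, minute)
        # Any previously seen post that is now missing is treated as deleted.
        for post_id, seen_at in first_seen.items():
            if post_id not in visible and post_id not in delays:
                delays[post_id] = minute - seen_at
    return delays


def share_within(delays, limit_minutes):
    """Fraction of deleted posts that vanished within limit_minutes."""
    if not delays:
        return 0.0
    hits = sum(1 for d in delays.values() if d <= limit_minutes)
    return hits / len(delays)


# Made-up sample data, not figures from the study.
snapshots = [
    (0, {"a", "b", "c"}),
    (1, {"a", "b", "c", "d"}),
    (10, {"a", "c", "d"}),   # "b" has disappeared
    (600, {"a", "d"}),       # "c" has disappeared
]
delays = deletion_delays(snapshots)
print(delays)
print(share_within(delays, 30))
```

A real crawler would also have to tell deletions by censors apart from deletions by the users themselves, which this sketch does not attempt.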

Commonly deleted posts included phrases such as “support Syrian rebels”, “Lying of gov. (Jixiang)”, “One-Child policy abuse” and “group sex”.

This shows an extremely robust government team of internet monitors, and the data also reveals their working schedule! To maintain full strength at key times, they don't follow an 8-to-5 working day, but rather work more sporadic shifts that cover the periods of highest activity. The researchers observed the pace of censoring slow down at night, probably because fewer people are working, leaving a backlog of posts each morning. “They catch up by late morning or early afternoon,” conclude Wallach and co.

What all of this means is that the censorship operation is happening in real time, since the highest volume of deletions occurs within 10 minutes. Crunching the numbers further, if an average censor can scan around 50 posts every minute, it would take 1,400 censors working at any one time to handle the 70,000 posts published every minute (70,000 / 50 = 1,400). And if they work 8-hour shifts, three shifts are needed to cover a full day, so the whole undertaking would require 4,200 people on the payroll!
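The staffing estimate above is simple enough to check by hand, but a short script makes the assumptions explicit and easy to vary. The function and constant names here are made up for this sketch; the three input figures are the ones quoted in the article.

```python
import math

POSTS_PER_MINUTE = 70_000        # posting rate quoted in the article
POSTS_PER_CENSOR_MINUTE = 50     # assumed scanning speed of one censor
SHIFT_HOURS = 8                  # assumed shift length


def censors_on_duty(posts_per_minute, posts_per_censor_minute):
    """Censors needed at any one moment to keep up with the posting rate."""
    return math.ceil(posts_per_minute / posts_per_censor_minute)


def total_payroll(on_duty, shift_hours):
    """Total headcount if every shift is staffed at full strength around the clock."""
    shifts_per_day = math.ceil(24 / shift_hours)
    return on_duty * shifts_per_day


on_duty = censors_on_duty(POSTS_PER_MINUTE, POSTS_PER_CENSOR_MINUTE)
print(on_duty)                              # 1400
print(total_payroll(on_duty, SHIFT_HOURS))  # 4200
```

Note that this treats staffing as constant around the clock, whereas the observed night-time slowdown suggests fewer censors work overnight, so the real figure could be lower.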

The researchers also put forward some hypotheses about the technological aid the censors receive. The first is keyword alerting, which flags a post for review whenever a keyword appears. But written Chinese is notoriously difficult to filter this way, and the abbreviated language used on Weibo makes it harder still.
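To illustrate the keyword-alerting idea and why it is fragile, here is a minimal sketch. It is not how Weibo's system is known to work: the `flag_for_review` function and the keyword list are assumptions for this example, with English phrases from the article standing in for real watch-list entries.

```python
def flag_for_review(post, keywords):
    """Return the keywords that appear in the post, ignoring case."""
    text = post.casefold()
    return [kw for kw in keywords if kw.casefold() in text]


# Hypothetical watch list, borrowed from the phrases quoted in the article.
KEYWORDS = ["support Syrian rebels", "group sex"]

print(flag_for_review("We should SUPPORT Syrian rebels now", KEYWORDS))
print(flag_for_review("gr0up s3x", KEYWORDS))
```

The first post is flagged, while the second slips through with a trivial character swap. Abbreviations and deliberate misspellings defeat plain substring matching in the same way, which fits the article's point that this approach is hard to rely on.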

The other, more realistic hypothesis is that the authorities target users with a history of deletions. Wallach and co. took the same approach for this experiment, and they noted that users with higher deletion frequencies tended to see their posts censored faster.
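One way to test a claim like this is to split users by how often they have been censored and compare typical deletion delays between the groups. The sketch below is an assumption about how such a comparison could be done, not the researchers' method; the function name, the threshold, and the sample records are all made up.

```python
from collections import defaultdict
from statistics import median


def median_delay_by_history(deletions, threshold):
    """Compare median deletion delay for frequently vs. occasionally censored users.

    deletions: list of (user, delay_minutes) records (hypothetical format).
    Users with at least `threshold` deletions count as "frequent".
    """
    per_user = defaultdict(list)
    for user, delay in deletions:
        per_user[user].append(delay)

    groups = {"frequent": [], "occasional": []}
    for delays in per_user.values():
        key = "frequent" if len(delays) >= threshold else "occasional"
        groups[key].extend(delays)

    # Skip empty groups, since median() raises on an empty list.
    return {name: median(values) for name, values in groups.items() if values}


# Made-up sample records, not data from the study.
sample = [("u1", 5), ("u1", 8), ("u1", 12), ("u2", 90), ("u3", 240)]
print(median_delay_by_history(sample, threshold=3))
```

If the hypothesis holds, the "frequent" group should show a noticeably lower median delay than the "occasional" group, as it does in this toy data.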

A really eye-opening study into how Chinese authorities regulate censorship. It raises a few questions: why allow a post to be public in the first place, when users could be made to submit content for approval? How does Weibo prioritize content for deletion? While the stats speak for themselves, the hypotheses may be realistic, but they are still hypotheses.

Crazy.

Source: arXiv