When does a group of online strangers interacting with the same online system become a community? Using network analysis to understand the shape, form and existence of personal relationships on Reddit.com is not only a worthwhile coding exercise but also yields interesting network analysis results. Launched in 2005, the website is currently one of the largest so-called virtual communities; it has an Alexa ranking of 134 and is among the 50 most visited websites in the United States. Reddit.com is a social news website that allows both the submission of external links (websites, images, videos, etc.) and the creation of written posts. Both types of submission can be commented on in the form of a threaded conversation: not only the original submission but also individual comments can receive replies. In addition, every form of interaction can be voted up or down by other members of the community.
I developed a Python script that extracts the JSON data of a single Reddit submission and transforms it into a GraphML file, which can be analysed with popular network analysis software such as NodeXL, igraph or Pajek. The script is available for free to anyone willing to keep the copyright notice intact (please see the script itself for the full copyright notice).
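To give a rough idea of what such a transformation involves, here is a minimal sketch. It is not the script itself, only an illustration under a few assumptions: the input is the already-parsed JSON of one submission (a list of two Listings, the post and its comment tree), an edge means "this user replied to that user", and the function names `reply_edges` and `submission_to_graphml` as well as the sample data are made up for this example.

```python
import xml.etree.ElementTree as ET


def reply_edges(listing, parent_author):
    """Yield (replier, replied_to) pairs from a comment Listing (hypothetical helper)."""
    for child in listing["data"]["children"]:
        if child["kind"] != "t1":  # skip "more" placeholders, keep comments only
            continue
        data = child["data"]
        author = data["author"]
        yield author, parent_author
        replies = data.get("replies")
        if isinstance(replies, dict):  # an empty string means "no replies"
            yield from reply_edges(replies, author)


def submission_to_graphml(submission_json):
    """Build a directed reply graph and return it as a GraphML string."""
    post_listing, comment_listing = submission_json
    poster = post_listing["data"]["children"][0]["data"]["author"]
    edges = list(reply_edges(comment_listing, poster))

    root = ET.Element("graphml", xmlns="http://graphml.graphdrawing.org/xmlns")
    graph = ET.SubElement(root, "graph", id="G", edgedefault="directed")
    users = {poster} | {user for edge in edges for user in edge}
    for user in sorted(users):
        ET.SubElement(graph, "node", id=user)
    for source, target in edges:
        ET.SubElement(graph, "edge", source=source, target=target)
    return ET.tostring(root, encoding="unicode")


# Made-up sample: alice posts, bob comments, alice answers bob.
sample = [
    {"kind": "Listing", "data": {"children": [
        {"kind": "t3", "data": {"author": "alice"}}]}},
    {"kind": "Listing", "data": {"children": [
        {"kind": "t1", "data": {"author": "bob", "replies": {
            "kind": "Listing", "data": {"children": [
                {"kind": "t1", "data": {"author": "alice", "replies": ""}}]}}}},
        {"kind": "more", "data": {"children": []}}]}},
]
graphml = submission_to_graphml(sample)
```

Treating users as nodes and replies as directed edges is only one possible modelling choice; comments could just as well be the nodes. Vote counts, timestamps or comment IDs could be attached as GraphML attributes, which this sketch leaves out.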
More information is available on Felix’s website.