One of the key ways that I try to contribute to social science is by applying novel interfaces and techniques to research design. This goes back to graduate school, where I was especially entranced with ways to make social networks more accessible. While I still actively participate in the generation and presentation of social networks, I think that there is much to be done outside the explicit articulation of relationships as network structures. Below are some projects that I’ve either coded or helped to design. Most recently, I’ve taken a back seat to coding while I’ve watched the work of my D.Phil student Joshua Melville bloom. My GitHub page is here (user: berniehogan).
People are not always great at finding out where to go for advice in their network. We might have a high degree of social capital, but if we forget who is available or why they might be useful, then we cannot make use of them. CollegeConnect is an extension of NameGenWeb that shows a personal network using a custom layout and some extra features that we think are helpful for the college-going process. At present, CollegeConnect is being reworked so that it can be used for a very specific study in collaboration with the University of Michigan and Michigan State University.
CollegeConnect was designed by me, and the backend and initial layouts were coded by Steve McKellar. Steve is a great guy, though I hear he’s taken a break from coding as of late. Working with Steve, I was able to learn a huge number of new technologies and watch them interlock in ways I had never considered before. CollegeConnect uses neo4j, mongo, redis, less, node, backbone, jquery, sigma, grunt, and a slew of other relatively new technologies. Josh Melville has taken over day-to-day maintenance, though I also check in code from time to time.
NameGenWeb (2011 – present)
Perhaps my most successful application is a relatively simple one that is likely to be retired in April 2015, through no fault of my own. NameGenWeb is a Facebook application that lets you download your Facebook network. It’s neither the first, the last, nor the most extensive. However, one great thing about the current version is that it provides a live visualisation of a Facebook network within a reasonably short time frame. Earlier versions required us to download the network and then open it in a program such as Gephi. This is my second foray into Sigma, and was almost entirely coded by my student Josh Melville.
Mapping Wikipedia (2012)
While working on data on Wikipedia, Mark Graham and I were approached by Gavin Bailey of TraceMedia. Gavin wanted to find a high-dimensional data set that he could use for a new point-based mapping approach that works on top of Google Maps. You have likely seen a Google map with a few points on it, or even with a few hundred. But using the current technologies, when you add more than a few thousand points, the Google Maps API starts to creak pretty badly. Gavin used the nascent OpenLayers framework on top of Google Maps to be able to display hundreds of thousands of points in a browser based on a query. Some of the queries that we were able to produce from this were incredibly insightful. Especially in the timeline version of the maps, one can see some of the unusual quirks of article creation on Wikipedia. My personal favourite is to watch how Arabic articles emerge in the US in 2007-2008. It would appear that someone has a bot that they periodically feed with latitude and longitude coordinates by state. Every few days an entire block of articles appears instantly.
This app has some cool tricks under the hood, such as the way queries can be saved. It was a featured technology at WikiSym 2012 and appeared in the Guardian on April 4, 2012.
Since being at the Oxford Internet Institute, I have focused most of my development time on a program for capturing and analyzing Facebook networks. The simple version can be found at Facebook Apps. The more sophisticated version may never leave the lab due to licensing issues. However, you can see screenshots here.
When I was at the University of Toronto, I was one of the early folks involved in the computational analysis of egocentered networks. Many people involved in egocentered analysis are not particularly technical. So having a means by which one can load a series of networks and then calculate a number of key stats on all of them at once seemed like a big step forward. With an aspiring young coder, Wojciech Gryc, I began work on Egotistics. Years later, I’m still researching egocentered networks, and Wojciech has started his own software company for cloud services, Canopy Labs.
The program runs in Java. With it, you can load a series of GraphML files, and then apply a statistic to either the entire ego net or to all of the alters in all of the networks. Part of Wojciech’s interest was in being able to code and test a series of network algorithms. One in particular that we had a lot of fun toying with was categorical assortativity, a measure of how strongly ties in the network fall within categories of nodes rather than between them. To this day, I’m not certain there are many other programs that do this.
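To make the idea of categorical assortativity concrete, here is a minimal Python sketch of one common definition, Newman's assortativity coefficient for a categorical node attribute. This is not the Egotistics code (which is Java), and it may not match the exact variant Egotistics computes; the function name and the edge-list input format are assumptions for illustration only.

```python
from collections import Counter


def categorical_assortativity(edges, category):
    """Newman's assortativity coefficient for an undirected graph.

    edges:    iterable of (u, v) pairs (hypothetical input format)
    category: dict mapping each node to its category label
    Returns a value in [-1, 1]; 1 means every tie is within a category.
    """
    mixing = Counter()
    for u, v in edges:
        cu, cv = category[u], category[v]
        # Count each undirected tie in both directions so the
        # mixing matrix is symmetric.
        mixing[(cu, cv)] += 1
        mixing[(cv, cu)] += 1

    total = sum(mixing.values())
    if total == 0:
        raise ValueError("graph has no edges")

    cats = {c for pair in mixing for c in pair}
    # Fraction of tie ends that connect nodes of the same category.
    trace = sum(mixing[(c, c)] for c in cats) / total
    # Fraction of tie ends attached to each category.
    share = {c: sum(mixing[(c, d)] for d in cats) / total for c in cats}
    # Same-category fraction expected if ties ignored category.
    expected = sum(share[c] ** 2 for c in cats)
    if expected == 1:
        raise ValueError("undefined when all ties involve one category")
    return (trace - expected) / (1 - expected)


# Made-up ego network: two kin and two friends.
roles = {"a": "kin", "b": "kin", "c": "friend", "d": "friend"}
within = categorical_assortativity([("a", "b"), ("c", "d")], roles)
between = categorical_assortativity([("a", "c"), ("b", "d")], roles)
```

In this toy example, `within` describes a network whose ties stay inside categories and `between` one whose ties only cross categories, so the two values sit at opposite ends of the scale.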
Egotistics is more than a batch processing suite, as it builds a dataset of metrics as you go along. This dataset can then be used for multi-level analysis, or just used in your favorite stats package. It imports GraphML, raw text files and Pajek files.
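The "dataset of metrics" idea can be sketched roughly as follows: compute the same statistics for every ego network and collect one row per network, ready for a stats package. This is only an illustration in Python under stated assumptions. The real program is written in Java and reads files, whereas this sketch uses in-memory alter lists and tie lists, and all function and column names here are hypothetical.

```python
import csv
import io


def ego_metrics(name, alters, ties):
    """Return one row of simple statistics for a single ego network."""
    n = len(alters)
    possible = n * (n - 1) / 2  # possible alter-alter ties, undirected
    density = len(ties) / possible if possible else 0.0
    return {"network": name, "alters": n, "ties": len(ties), "density": density}


def build_dataset(networks):
    """Apply the same statistics to every network, one row each."""
    return [ego_metrics(name, alters, ties)
            for name, (alters, ties) in networks.items()]


def to_csv(rows):
    """Serialize the rows as CSV text for use in a stats package."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["network", "alters", "ties", "density"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up example data: two small ego networks.
networks = {
    "ego1": (["a", "b", "c"], [("a", "b")]),
    "ego2": (["x", "y"], [("x", "y")]),
}
rows = build_dataset(networks)
csv_text = to_csv(rows)
```

Because each row is keyed by network, the resulting table can be joined to respondent-level data, which is what makes the multi-level analysis mentioned above possible.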
NameGen is a program for entering data. I built the first incarnation of NameGen, used in the Connected Lives project. The second generation of this software was coded primarily by Jeffrey Wong, and to a much lesser extent, myself. The new version is far more generic and can be used as an all-purpose network data entry tool. It presently exports to GraphML.
This is really just a simple script to be used in the GUESS graph visualization program.
It creates a new Java tab that allows the user to interact with graphs using smart buttons rather than Python code. It is meant to help bring the power and elegance of GUESS to the masses. One of my favorite features of this script is its ability to export changes to a file that you can then reapply to other similar graphs (or to this graph if you start adding new data).
This is a program to allow customizable mailings to posters in CraigsList. It is currently in use and not for distribution until the project is completed. Feel free to email me for more info.