Sampling the Social Graph using Facebook Graph API
Sunday, April 25th, 2010
The recently introduced Facebook Graph API is an interesting source of data with a nice, easy-to-appreciate context (everyone loves social). To motivate some of the examples on this blog, I have written a simple quick-and-dirty Graph API client in Java that provides a trivial-to-use interface for graph data processing:
SimpleGraphAPIClient client = new SimpleGraphAPIClient(fbToken);
LinkedList<FacebookUser> friends = client.getFriends();
for (FacebookUser friend : friends) {
    LinkedList<LikedEntry> likes = client.getLikes(friend.getID());
    LinkedList<PhotoEntry> photos = client.getPhotos(friend.getID());
    LinkedList<GroupEntry> groups = client.getGroups(friend.getID());
    // arbitrary dataset creation logic
}
From a pure tool perspective (without actually having an active Facebook app), the dataset that can be generated is quite limited: it is bounded to the “neighborhood” of a single user. Even so, a lot of interesting “play” data can be derived. At minimum, we can get a (num_likes, num_photos, num_groups) triple for every “friend” user, and to that we can add “derived” metrics such as average group size, photo age, etc. Modeling this data alone can motivate some very interesting problems.
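As a rough illustration of the "dataset creation logic" step, here is a minimal sketch of how one row per friend could be assembled. The FriendFeatures class, its fields, and the CSV layout are hypothetical and not part of the client above; it also assumes that a member count per group is available from somewhere, which the post does not confirm.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not part of the client described in the post.
final class FriendFeatures {
    final String friendId;
    final int numLikes;
    final int numPhotos;
    final int numGroups;
    final double avgGroupSize; // derived metric; NaN when the friend has no groups

    // likes and photos are whatever lists were fetched for one friend;
    // groupSizes holds one member count per group (assumed to be available).
    FriendFeatures(String friendId, List<?> likes, List<?> photos, List<Integer> groupSizes) {
        this.friendId = friendId;
        this.numLikes = likes.size();
        this.numPhotos = photos.size();
        this.numGroups = groupSizes.size();
        // The average is undefined for an empty list, so fall back to NaN.
        this.avgGroupSize = groupSizes.stream()
                .mapToInt(Integer::intValue)
                .average()
                .orElse(Double.NaN);
    }

    // One CSV line: id,num_likes,num_photos,num_groups,avg_group_size
    String toCsvRow() {
        return String.format(Locale.ROOT, "%s,%d,%d,%d,%.2f",
                friendId, numLikes, numPhotos, numGroups, avgGroupSize);
    }
}
```

Inside the loop above, one such object would be built per friend and its row appended to a file. To keep the users anonymous, the ID could be replaced with a running index before writing.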
Here is a simple plot of the (num_likes, num_photos, num_groups) dataset for 170 anonymous users, obtained in this manner:

(Note: some of the data that the Graph API returns occasionally doesn’t match the actual state on the site, so some outliers might just be missing data on the Facebook side. However, this systematic bias is what might make the dataset especially interesting.)


