Following last year’s humble attempt to provide some insight from the Twitter conversations around #APSA2012 (especially considering the last-minute cancellation of the conference) – and given that other duties kept me from attending APSA this year 🙁 – I will be collecting and displaying some data from this year’s conversations. There will be more updates throughout the conference. If you want to follow the reports in chronological order, start at the bottom and read upwards.
Short methods note: Edges are created by mentions, replies or re-tweets. Nodes are coloured according to the component they belong to, and their size is scaled according to eigenvector centrality. Isolates (i.e. people not talking to anyone but using the hashtag #APSA2013) are not included.
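For readers who want to see what the methods note means in practice, here is a minimal sketch of the edge-building step in plain Python. It is not the code used for the graphs in this post. The tweet dictionaries and their field names (`user`, `mentions`, `reply_to`, `retweet_of`) are hypothetical stand-ins for whatever the collection script stores, and counting each tweet at most once per pair of accounts is an assumption.

```python
from collections import Counter


def build_edges(tweets):
    """Map (source, target) -> number of tweets in which source addressed target."""
    edges = Counter()
    for tweet in tweets:
        source = tweet["user"].lower()
        # A mention, a reply and a re-tweet all create the same kind of edge.
        targets = {name.lower() for name in tweet.get("mentions", [])}
        for key in ("reply_to", "retweet_of"):
            if tweet.get(key):
                targets.add(tweet[key].lower())
        targets.discard(source)  # ignore people mentioning themselves
        for target in targets:
            edges[(source, target)] += 1
    return edges


def connected_accounts(edges):
    """Accounts that appear in at least one edge, so isolates drop out."""
    return {name for pair in edges for name in pair}


# Hypothetical sample data, only to show the shape of the input.
tweets = [
    {"user": "alice", "mentions": ["bob"], "reply_to": "bob"},
    {"user": "bob", "retweet_of": "carol"},
    {"user": "dave"},  # used the hashtag but talked to no one: an isolate
    {"user": "carol", "mentions": ["alice", "bob"]},
]

edges = build_edges(tweets)
nodes = connected_accounts(edges)
```

Because only accounts that appear in an edge are kept, `dave` never enters the node set. That is the sense in which isolates are excluded.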
1. DATA: Someone asked me for the data I used to produce this post, and I strongly believe in the importance of replication. Here is a list of all the tweet IDs I used. Sorry, but that’s the only way I can share it without violating Twitter’s TOS –> DATA
2. I plotted all the geotagged tweets against the map of Chicago. This gives a better sense of where the tweets were concentrated around the city.
UPDATE 10 (AND FINAL): A few comments before I introduce the data. This exercise had two purposes. First, I wanted to freshen up my skills in Twitter data collection and analysis. After spending part of the summer learning a lot about Python, R and SNA (mainly thanks to the International Summer School 2013 “Social Network Analysis: Internet Research”), I decided that an extension of last year’s analysis of the APSA tweets would be a good opportunity. In total honesty, I hope you enjoyed it too. Second, my research agenda makes extensive use of this type of social media data to draw inferences about political behaviour. Although this particular exercise was extremely self-centred, since I’m focusing on the interactions at a Political Science conference, it provides some insight into what social media data can tell us, and how we can use it to make sense of bigger issues. That’s why I decided to write this other post on Obama’s speech this week, to show some “real life” examples. Also, I realise that I’m not the first in this field, and that there are amazing people who have been working on these issues for a long time (most of them with much more sophisticated analyses than mine). I believe in building community, so I tried to attribute their work where appropriate and link to their own websites and Twitter accounts. I strongly recommend that you follow them and their research. Finally, this post will eventually become a longer, paper-like post, with more descriptive data and some interesting questions to test. I can’t promise when, but it will come.
Ok, now let’s go to the data analysis. Joshua Tucker (NYU) tweeted today about his excitement at being in the “top ten vertices” list from a Twitter SNA made by Marc Smith, using NodeXL. I’ve used NodeXL in the past (and I believe it is an amazing off-the-shelf tool for Windows users), but its reliance on the Search API made me realise that I could get better results by downloading the data via the Streaming API for the full duration of the conference. It requires more time and resources, but the results are much more informative. Then, I decided to create my own top ten, but using eigenvector centrality instead of betweenness centrality (as in the NodeXL list). The reason is simple: the former relies on the relative importance of the connections of a node. That is, if the people I interact with are more “important” (or central) in the network, I become more important too. Betweenness centrality, on the other hand, focuses on which nodes act as bridges between the others, that is, who is best placed to connect the rest. Although that is usually an important question in network analysis (actually, I co-authored a paper with Jorge Fábrega where we use it extensively), in substantive terms eigenvector centrality seems more appropriate for the type of network we have here. With that info in mind, here are the winners:
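To make the difference between the two measures concrete, here is a small sketch on a toy network: two tightly knit groups of four accounts joined by a single “bridge” account. Everything in it is illustrative. The graph is invented, it is treated as undirected for simplicity (the conference network is directed), and the centrality functions are plain-Python approximations written for this example (power iteration for eigenvector centrality, Brandes’ algorithm for betweenness), not the routines used to produce the table.

```python
from collections import deque
from itertools import combinations


def undirected(edge_list):
    """Adjacency sets for an undirected graph."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def eigenvector_centrality(adj, iterations=200):
    """Power iteration: a node's score is fed by its neighbours' scores."""
    score = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # Adding the node's own score (A + I) avoids oscillation and keeps the ranking.
        nxt = {v: score[v] + sum(score[w] for w in adj[v]) for v in adj}
        norm = sum(val * val for val in nxt.values()) ** 0.5
        score = {v: val / norm for v, val in nxt.items()}
    return score


def betweenness_centrality(adj):
    """Brandes' algorithm: share of shortest paths passing through each node."""
    total = {v: 0.0 for v in adj}
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        paths = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        paths[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:  # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        dep = {v: 0.0 for v in adj}
        for w in reversed(order):  # accumulate dependencies back towards s
            for v in preds[w]:
                dep[v] += paths[v] / paths[w] * (1 + dep[w])
            if w != s:
                total[w] += dep[w]
    return {v: val / 2 for v, val in total.items()}  # each pair was counted twice


# Toy network: cliques {a,b,c,d} and {f,g,h,i}, joined only through "e".
toy_edges = list(combinations("abcd", 2)) + list(combinations("fghi", 2))
toy_edges += [("d", "e"), ("e", "f")]
toy = undirected(toy_edges)

eig = eigenvector_centrality(toy)
btw = betweenness_centrality(toy)
top_bridge = max(btw, key=btw.get)
top_connected = max(eig, key=eig.get)
```

In this toy network the bridge account `e` has the highest betweenness, because every path between the two groups runs through it, yet it has the lowest eigenvector score, because it has only two connections. The highest eigenvector scores go to `d` and `f`, which sit inside a dense group and also touch the bridge.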
Table: Top 10 accounts according to their Eigenvector Centrality.
| RANK | DAY 1 | DAY 2 | DAY 3 | DAY 4 | OVERALL |
In terms of volume, day 4 was the smallest one. With only 114 nodes and 141 edges, the conversations were less frequent. A possible explanation is that most of the delegates had already gone by then, and only those who had panels on that day were staying around the conference venues. The clusters are a bit more institutional, with high prominence from APSA’s official accounts, along with some blogs and websites (such as @monkeycageblog and @insidehighered). A new addition is Larry Sabato, from U. Virginia.
The cumulative network does not show many differences from yesterday. This is not surprising, because most of the activity took place earlier, and most of the communications were between people who had already tweeted at each other. The new interactions might have added some weight to the already existing edges, but not much more. In any case, here is the final network of the APSA 2013 Annual Meeting:
UPDATE 9: Day 3 was clearly quieter than the previous two. A bit of it might be the classical effect of people leaving after they present, or simply wandering around Chicago. It might also be that the panels are becoming increasingly interesting, and people prefer to pay attention to the presentations instead of tweeting ;). In any case, with all the fuss around President Obama’s speech on Syria (Hint: I recently published a quick report on that), I was expecting that the IR crowd attending the conference would be very active. Well, just by simple observation of their accounts, they were, but they did not necessarily use the #apsa2013 hashtag to express their views. That said, @dandrezner and @ezraklein are some of the “stars” of today’s network, with a high level of eigenvector centrality. The Political Communications cluster remains active with @andrew_chadwick, @rasmus_kleiss, and @25lettori leading the way (clearly a clique around Royal Holloway’s New Political Communication Unit).
Moving on to the cumulative graph, the network is not becoming much bigger (832 nodes in total). This reflects the lower number of conversations from day 3, but also that some ties are already established and some people keep talking to each other. The APSA team is doing really well in driving the conversation, with @APSAtweets and @APSAmeetings as really central nodes in the network. As expected, those who were central yesterday remain so today, so no news in that regard. All in all, the network seems to be coming to a point of “convergence” or “stability”, with conversations taking place among the same members and with no significant cliques outside the big group. The question of inter-field dialogue remains open, as some relevant nodes in the network belong to different components (such as @ezraklein in comparison with the rest of the bigger component).
(QUICK) UPDATE 8: Using Pablo Barberá’s streamR package (along with ggplot2), I mapped the tweets that had location data in them (only 19 out of 1321). Not surprisingly, most of them are highly concentrated in Chicago, but a couple appear to be somewhere else in the US. This speaks to the question of whether people not attending the conference are getting any benefit by tweeting about it. There were no geolocated tweets outside the US, in case you were wondering.
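The map itself was made in R with streamR and ggplot2. As a rough illustration of the filtering step only, here is a Python sketch that keeps the tweets carrying a point location and checks which ones fall near Chicago. It assumes each tweet is a dictionary whose `coordinates` field is a GeoJSON point in `[longitude, latitude]` order, as in Twitter’s tweet JSON. The bounding box is an approximate guess at the city’s extent, and the sample tweets are invented.

```python
# Approximate box around Chicago: (lat_min, lat_max, lon_min, lon_max).
# These limits are an assumption for illustration, not an official boundary.
CHICAGO_BOX = (41.6, 42.1, -88.0, -87.5)


def geotagged_points(tweets):
    """Return (latitude, longitude) pairs for tweets that carry a point location."""
    points = []
    for tweet in tweets:
        geo = tweet.get("coordinates")
        if geo and geo.get("type") == "Point":
            lon, lat = geo["coordinates"]  # GeoJSON order is longitude first
            points.append((lat, lon))
    return points


def in_box(point, box=CHICAGO_BOX):
    """True if a (latitude, longitude) pair falls inside the bounding box."""
    lat, lon = point
    lat_min, lat_max, lon_min, lon_max = box
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max


# Invented examples: one tweet in central Chicago, one in Washington DC,
# and one without any location data.
sample = [
    {"id": 1, "coordinates": {"type": "Point", "coordinates": [-87.63, 41.88]}},
    {"id": 2, "coordinates": {"type": "Point", "coordinates": [-77.04, 38.90]}},
    {"id": 3, "coordinates": None},
]

points = geotagged_points(sample)
in_chicago = [p for p in points if in_box(p)]
```

Most tweets have no location data at all (only 19 out of 1321 in this collection), so the first function does most of the filtering before anything is plotted.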
UPDATE 7: This is the final summary of day 2. For the next two days I aim to produce just one daily report, so you’ll have to bear with me. Again, I present two graphs. The first one is the full network for all the days of the conference (including pre-conference events). The second one contains all tweets captured on day 2 until 7.30pm.
The cumulative network shows again a big component in pink, but the network is becoming much more diverse than in previous iterations. More clusters appear, while others that were disconnected (such as the one led by @funglode) are now connected to the bigger network. The usual suspects remain as key actors in the network, and depending on the volume of tweets over the weekend, they will probably remain in that position. Some well-known IR scholars do not belong to the bigger component, which is an interesting phenomenon. If we look at @ezraklein or @SlaughterAM, they are connected to the big network, but form clusters around them (perhaps the cross-field conversations are not as clear as I thought). The Political Communications group is highly active, especially @andrew_chadwick, @zizip and @davekarpf (who also shared a widely tweeted panel today, which might also account for their relevance in the network).
An important caveat is that this exercise is, in some way, a performative process. As I publish these networks, some people become aware of their own position and of the people they interact with. That is always something to take into consideration when doing the analysis, and it brings some epistemological discussions to the table (this is like Schrödinger’s cat reporting on its own experiment).
UPDATE 6: This time I’m bringing two graphs. The first one corresponds to the cumulative network. That is, the Twitter conversations from the pre-conference events until the last update. The second graph corresponds only to the conversations taking place during day 2 until 1pm CT. As you can see, there are similarities between the networks, such as the existence of a big component in the middle (the cumulative network uses strongly connected components to colour the nodes). However, the central actors vary a bit. There are some accounts that remain relevant and central to the network, such as @apsatweets, @dandrezner, @texasinafrica, @raulpacheco, @ezraklein and @j_a_tucker. However, we can observe some new actors coming onto the scene, such as @heathbrown and the institutional account for @insidehighered. Also, there is an interesting cluster around @funglode and @anniavaldez, formed mainly by Spanish-speaking users.
The field boundaries seem more diffuse now, which raises questions about whether conferences actually create the opportunity for cross-field dialogue. There are several panels trying to analyse the overall role of Political Science, and how we can communicate better with our audiences. Maybe that is driving a lot of the conversations. That’s an interesting hypothesis to test. Another interesting fact is that some central nodes are people who are not attending APSA this year (such as myself :)). This also raises a question about who benefits from the conference, and whether it is necessary to attend to obtain some basic returns from it. Obviously, we need to get data from other sources outside Twitter to find that out. In the meantime, this has become more than a simple exercise of mapping APSA.
UPDATE 5: This graph is a lot bigger than the previous one, as it brings together the data from the pre-conference events plus day 1 (29 August). Thanks again to @jorgefabrega for the help using the Search API to retrieve that data (I know, the Search API might not be the best option to get an accurate picture, but it’s the only one I had available. If you want a thorough discussion of the representativeness of the different Twitter APIs – mainly the Streaming API – I would definitely encourage you to look at Morstatter et al. 2013).
Back to business. I made some small changes to the visualisation this time. I used strongly connected components instead of weakly connected components. First, it made more sense since the network is directed. Second, with weakly connected components we got a big group in the middle where almost everyone appeared to be connected, which is not true. Also, one of my goals is to analyse the networks and try to make a comparison by section/field affiliation (if anyone is interested in helping with that, please let me know in the comments section!). This time we have 479 nodes and 823 edges.
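The difference between the two kinds of component is easy to show on a tiny directed network. The sketch below is only an illustration in plain Python, with invented accounts and a deliberately simple (not efficient) algorithm: a weak component ignores the direction of the edges, while a strong component only groups accounts that can reach each other in both directions.

```python
def reachable(start, adj):
    """All nodes that can be reached from start by following adj."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def components(edge_list, strong):
    """Strongly (strong=True) or weakly (strong=False) connected components."""
    forward, backward, nodes = {}, {}, set()
    for source, target in edge_list:
        forward.setdefault(source, set()).add(target)
        backward.setdefault(target, set()).add(source)
        nodes.update((source, target))
    if not strong:
        # Weak components: treat every edge as if it ran both ways.
        both = {n: forward.get(n, set()) | backward.get(n, set()) for n in nodes}
        forward = backward = both
    found, assigned = [], set()
    for node in sorted(nodes):
        if node in assigned:
            continue
        # Keep the nodes that node can reach AND that can reach node.
        group = reachable(node, forward) & reachable(node, backward)
        found.append(group)
        assigned |= group
    return found


# Invented example: a and b talk to each other; c mentions a but gets no
# answer; d mentions c but gets no answer either.
mentions = [("a", "b"), ("b", "a"), ("c", "a"), ("d", "c")]

weak = components(mentions, strong=False)
strong = components(mentions, strong=True)
```

Here the weak version returns one group containing all four accounts, while the strong version returns `a` and `b` together and leaves `c` and `d` on their own. That is the “big group in the middle” effect described above: one-way mentions are enough to pull an account into a weak component.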
I’m currently collecting data from today’s sessions, and will provide a daily graph and an accumulated one. Let’s see how that works. As usual, feedback is more than welcome.
UPDATE 4: Last graph of the day (it’s pretty late here in London). This corresponds to an accumulated network of the entire first day of #APSA2013 until 7pm, Chicago time. Now the network is much bigger than the previous one (it seems that conversations take some time to build up) with 327 nodes and 489 edges. The clusters we saw in the previous graph are much more diffuse now. We can observe a big central component (in green) that connects most members of the network. However, it is possible to observe some patterns in the conversation that can be attributed to different fields or to the type of Twitter account (oddly enough, publishers’ accounts tend to mention and re-tweet each other).
Tomorrow morning I will aim to produce a larger accumulated network with info from the pre-conference events (thanks to Jorge Fábrega for his help on getting that data). Also, I aim to produce the accumulated version and a daily one. Let’s see if we can get something from these dynamic networks. I hope you are enjoying the conference, stay tuned!
UPDATE 3: At 3pm Chicago time, things got much more complex and ‘networked’ (pun intended). At this point there are 167 nodes (i.e. Twitter accounts) and 261 edges (defined by mentions, replies or re-tweets). We can observe a big cluster in the middle (in dark orange) where the APSA official account (@APSAtweets) sits alongside some well-recognised political science tweeters, such as @dandrezner, @raulpacheco, and @monkeycageblog, recently acquired by the WaPo. Another recognisable cluster (in pink) is the one formed by the Political Communications scholars, such as @zizip, @andrew_chadwick, @davekarpf, and @abuaardvark.
UPDATE 2: This is the network at 12pm. As you can see, the groups are getting bigger and tighter as the conference evolves.
UPDATE: At 9am in Chicago, this is what the network looks like.
Note: Thanks to Alex Hanna for his small – yet crucial – advice on how to build the networks.