Facebook network visualization of more than 1,000 friends
After the initial focus on the Somali-dominated economy and social structure of Eastleigh, I hoped to contribute additional material to the project by mapping out the online social networks of Somalis (or at least of as many as I could convince to grant me access to their network data). I hoped such material would add a quantitative element to the oral reports of geographically distributed social and economic trust networks prevalent among Somalis in Eastleigh, and give us a way to create interesting, simplified visualizations of complex data sets.
I intended to find out whether online social networks serve as relatively close proxies for real social networks and, if so, whether comparative analysis of several ethnic groups might reveal significant differences in the geographic diversity of Somali social networks relative to other groups. The loose hypothesis was that a wider geographical distribution of the social networks of Somalis living in Eastleigh, compared to those of their non-Somali neighbors, could perhaps explain some of their relative economic success through wider access to information, economic and trade opportunities, and sources of funding.
Brief note on methodology:
Facebook network data was collected using Netvizz, with the resulting GDF file fed into Gephi for visualization and analysis. This was done with the subject present, after a short briefing on what data would be collected and a short demonstration of what can be seen in the resulting data. Subjects were selected from a range of socio-economic backgrounds, though demographic diversity was limited to available Facebook users, who tended to be in the 18-34 year-old range. Samples were taken from the Somali diaspora living abroad in countries like the US and Sweden, Somalis who fled Somalia directly for Kenya and settled in Eastleigh, and Kenyan Somalis who were born and raised in Kenya.
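As an illustration of the collection step described above, here is a minimal Python sketch of reading a GDF export into a node table and an edge list before any visualization. It is not the tool chain used in this study (that was Netvizz and Gephi). The sample rows, the `locale` column, and the helper names `parse_gdf` and `locale_counts` are assumptions made for the example. It relies only on the general GDF layout: a `nodedef>` header followed by node rows, then an `edgedef>` header followed by edge rows.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in GDF layout; column names and rows are invented.
SAMPLE_GDF = """nodedef>name VARCHAR,label VARCHAR,locale VARCHAR
1,Friend A,en_US
2,Friend B,sv_SE
3,Friend C,en_GB
4,Friend D,en_US
edgedef>node1 VARCHAR,node2 VARCHAR
1,2
2,3
"""


def parse_gdf(text):
    """Split GDF text into a dict of node records and a list of edges."""
    nodes, edges = {}, []
    section, columns = None, []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        first = row[0]
        if first.startswith(("nodedef>", "edgedef>")):
            # Header row: remember the section and strip the type suffixes.
            section, _, rest = first.partition(">")
            row[0] = rest
            columns = [cell.strip().split(" ")[0] for cell in row]
            continue
        record = dict(zip(columns, (cell.strip() for cell in row)))
        if section == "nodedef":
            nodes[record["name"]] = record
        elif section == "edgedef":
            edges.append((record["node1"], record["node2"]))
    return nodes, edges


def locale_counts(nodes):
    """Count friends per locale, a rough stand-in for geographic spread."""
    return Counter(record.get("locale", "") for record in nodes.values())


nodes, edges = parse_gdf(SAMPLE_GDF)
print(len(nodes), "friends,", len(edges), "friend-to-friend links")
print(locale_counts(nodes).most_common())
```

A per-locale tally like this is only a crude proxy for geographic distribution, since a profile's locale setting need not match where the person actually lives.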
No subjects declined access to their network data or refused permission for the data set to be retained, and all were quite interested in the resulting visualizations. Many reported that they were impressed with the visual structure of their network and surprised to find unexpected friend-to-friend linkages during the analysis. Most requested a copy of the resulting visualizations, often so they could share them back on Facebook. There seemed to be a certain pride in seeing their friend networks, with comments such as “this is my universe” or “my galaxy of friends” common.
Collecting online social network data was seen as a way to avoid the weak and unreliable data commonly associated with self-reporting of social network connections. Some newer methods of tracing out personal social networks promise significant improvements, but at the moment they demand considerable time, space and materials, and are relatively impractical for a brief exploratory study such as this one.
Early assessment of the methodological approach
Although the evidence collected thus far has been quite interesting, I have strong doubts about whether such an approach would yield quantitatively significant evidence to support the original hypothesis.
The first problem comes from selection biases. Are the groups that are significant users of online social networks likely to represent a broad enough cross section of the population here in Eastleigh to tell us anything applicable to the broader community? I would tentatively conclude no, for three reasons:
1) The surprising diversity of the Somali community as a whole.
- Somalis came to Eastleigh in multiple waves over a period of more than 100 years, and efforts to define what actually constitutes a diaspora are problematic. Cleanly dividing Somali Somalis from Kenyan Somalis and from Isaaq Somalis, much less along clan divisions such as Darood and Hawiye, will require much more in-depth research and survey work before the various groups’ economic impact can be examined. Simply put, there is too much diversity within the community to make simplistic assumptions about a ‘Somali diaspora’ whose communal and wider social ties can be meaningfully aggregated into a single group and measured against another Kenyan ethnic group.
- There are also deep questions about geographic dispersion that need to be better understood before any generalizations can be made about the community as a whole. Are Somalis living in the United States more or less significant sources of funds and opportunities than, for example, Somalis who have a much longer history in Dubai? How could we weight the economic impact of those groups, much less that of the dozens of other Somali diaspora communities spread across the globe?
2) The limited sample of the community that are active Facebook users.
- Facebook usage seems surprisingly widespread, yet only 3% of the total Kenyan population uses Facebook. Would this be a realistic basis for broader assumptions about the community as a whole?
- Although the demographic profile of Facebook users in Kenya is perhaps good for understanding youth and young professional networks (62% of Facebook users are between the ages of 18 and 34), are these mostly the young and urbanized? There are also questions of gender representation. With only 37% of Facebook users in Kenya being female, how does this compare with rates of female participation in the Eastleigh economy?
3) The wide diversity of usage behaviors even within the community of active Facebook users.
- The sheer range of Facebook network sizes encountered has been surprising. Having discovered last year that Somali and Kenyan Facebook users tended to have much larger average network sizes than Europeans, I was expecting to find a greater degree of fidelity in those networks in terms of identifiable clusters of context (sub-groups within an individual's network that represent distinct groups of friend interconnections, often denoting geographic separation).
- The meaningful identification of network clusters corresponding to real-life social contexts or connections, however, seems to depend on a combination of network size and the behavior associated with link establishment. For example, on the low end, typical German social networks are often in the range of 50 to 200 friends. From informal surveys of small networks, I would conclude that little meaningful information about larger social structures, past experiences, deeply connected communities, or social behavior can be generated from networks of fewer than perhaps 200 connections. There just isn’t enough community density to tell much about real-world social contexts.
A network of 129 friends. Although sizing nodes relative to betweenness centrality revealed significant friends, modularity partitioning reveals little discernible contextual structure.
- The mid-range of, say, 300-600 friends seems to generate a good deal of clearly discernible structure, depending on friendship behavior (are the connections people the user actually knows and deems at least somewhat significant in real life?). Such networks are common, but may be representative of little more than individual idiosyncrasies in friend selection behavior.
Network of 340 friends with clear community clusters and several significant friends identified by betweenness centrality weighting of node size
- Networks of 1,000 and above seem to lose this clarity of structure. Dense central clusters and high betweenness centrality (typical indicators of context and closeness, respectively, in the mid-sized networks) often reflect highly permissive friend acceptance behaviors rather than meaningful social ties.
Facebook network of more than 1,100 friends, with a large number of unconnected isolates indicating very weak social ties. Even the densely connected central cluster consisted largely of unknown persons using pseudonyms, and high betweenness centrality was unconnected with actual closeness of social ties
Another network of more than 1,000 friends. It shows more distinguishable community clusters, but the densely interconnected central cluster does not represent a homogeneous context, and high betweenness centrality does not correlate with actual close friendships. Creative use of filtering would be needed here to draw meaningful conclusions about the ties within this social network
- Even significantly larger networks are reportedly common. Just as an illustration of the diversity of social network ‘friending’ behavior, in a focus group run with a cross section of young Somalis, within a sample of only ten people, network sizes ranged from 50 friends to more than 14,000 friends spread out over three personal accounts.
A focus group of 10 relatively homogeneous Somalis. Even so, the Facebook network sizes of the participants ranged from 50 friends, for one user who was only trying to find an old girlfriend, to more than 14,000 friends spread out over three accounts, for a Somali journalist who uses Facebook to distribute news and collect source information
- It appears very common to establish Facebook connections almost as a casual form of business card exchange: less formal than a phone number exchange, but offering the possibility for social discovery and continued contact.
- Multiple accounts specifically tailored for individual audiences are common. For Somali girls especially, one account may be created with a family audience in mind, and another for purely ‘social’ use. The latter would be a venue for forms of interaction that might be unacceptable to the family (photo sharing, humorous wall posts and exchanges, flirting, etc.). One informant reported that it’s common for some young Somali girls to create many more accounts: “One for a boyfriend, one for girlfriends, one for the family, one for a cousin who’s a notorious snitch, and so on.”
A small network of 90 friends, constructed by a young Somali woman for family use
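The structural measures referred to in the observations and captions above (unconnected isolates, and betweenness centrality used to size nodes) can be sketched in a few lines of Python. This is a simplified illustration, not the procedure used in the study, where Gephi computed these statistics. The toy network, the node names, and the helper functions are all invented for the example. Betweenness is computed here with Brandes' algorithm for unweighted, undirected graphs and left unnormalized.

```python
from collections import deque

# Invented toy network: two friend triangles joined by one broker, plus an isolate.
NODES = ["A1", "A2", "A3", "X", "B1", "B2", "B3", "Z"]
EDGES = [("A1", "A2"), ("A1", "A3"), ("A2", "A3"), ("A3", "X"),
         ("X", "B1"), ("B1", "B2"), ("B1", "B3"), ("B2", "B3")]


def build_adjacency(nodes, edges):
    """Map every node to the set of its friends (undirected ties)."""
    adjacency = {node: set() for node in nodes}
    for left, right in edges:
        adjacency[left].add(right)
        adjacency[right].add(left)
    return adjacency


def find_isolates(adjacency):
    """Friends who share no mutual friends with anyone else in the network."""
    return sorted(node for node, friends in adjacency.items() if not friends)


def betweenness_centrality(adjacency):
    """Unnormalized betweenness (Brandes' algorithm, unweighted graph)."""
    scores = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search, counting shortest paths from the source.
        stack = []
        predecessors = {node: [] for node in adjacency}
        path_counts = {node: 0 for node in adjacency}
        path_counts[source] = 1
        distance = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            stack.append(current)
            for neighbor in adjacency[current]:
                if neighbor not in distance:
                    distance[neighbor] = distance[current] + 1
                    queue.append(neighbor)
                if distance[neighbor] == distance[current] + 1:
                    path_counts[neighbor] += path_counts[current]
                    predecessors[neighbor].append(current)
        # Walk back from the farthest nodes, accumulating dependencies.
        dependency = {node: 0.0 for node in adjacency}
        while stack:
            node = stack.pop()
            for pred in predecessors[node]:
                share = path_counts[pred] / path_counts[node]
                dependency[pred] += share * (1.0 + dependency[node])
            if node != source:
                scores[node] += dependency[node]
    # Each undirected pair was counted once from each endpoint, so halve.
    return {node: score / 2.0 for node, score in scores.items()}


adjacency = build_adjacency(NODES, EDGES)
isolates = find_isolates(adjacency)
scores = betweenness_centrality(adjacency)
ranked = sorted(scores, key=scores.get, reverse=True)
print("isolates:", isolates)
print("highest betweenness:", ranked[0], scores[ranked[0]])
```

In this toy graph the broker `X` has the highest betweenness even though it has only two ties. The same effect appears in the large networks above: a high score shows that someone bridges clusters, not that the tie is a close one.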
Recommendations for further research:
This diversity, which makes many generalizations impossible, also reveals quite a lot about the identities of Somalis negotiating lives within multiple cultures, and yields rich ethnographic and sociological material on their views about subjects such as appropriate forms of socialization in the digital world, the use of digital information flows, and perceptions of online security.
Although these are all topics that warrant separate discussion, one interesting anecdotal insight to emerge is that perceptions of physical versus cyber security threats are nearly the inverse of those commonly encountered in Germany. In Germany, the physical world is often viewed as a relatively low-risk environment and the digital world as full of dangers (threats to privacy, fraud, hacking, identity theft, reputational dangers, potential for the misuse of personal data, etc.), while nearly the opposite is true in Eastleigh. Here the physical world presents a host of dangers to guard against, while the digital world is often viewed as relatively low-risk.
Analyzing how social networks evolve over time could prove insightful (network dynamics is currently a hot area in Social Network Analysis), as methods such as those used here offer merely a snapshot of network structure, which may be conditioned by the factors listed above. Observing the evolution and development of these networks over time could help researchers develop more incisive questions about social behavior and the dynamics of knowledge flows in online social networks than is practicable with current methods.
We may be able to learn quite a bit about individual Somalis through the analysis of their online social networks, but for the moment this is probably best done as an adjunct to the more traditional methods of direct oral engagement. In simple terms, when it comes to the bigger questions of global connections, I doubt Facebook network analysis in its current form can tell us much that we don’t already know, or couldn’t find out simply by talking to people. Thus I would conclude that these sorts of instruments should serve as a complement to, rather than a replacement for, more traditional methodologies such as interviews, self-reporting of data, surveys and participant observation.
As a final disclaimer, none of this is meant to be an authoritative summary of research findings, but merely some early observations to set up as road markers of a sort, and ideas to be shared with others who might see different weaknesses or opportunities with this methodology.
As always, please feel free to share your thoughts in the comments below.
A brief list of references for further study:
Gephi resources page of SNA study examples from the community of users
An excellent non-technical introduction to the science and theory of networks is: Six Degrees: The Science of a Connected Age, by Duncan Watts
Analyzing Social Media Networks with NodeXL: Insights from a Connected World
Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites
Lewis, Kaufman, Gonzalez, Wimmer and Christakis, Tastes, ties, and time: A new social network dataset using Facebook.com, Social Networks, 2008
Bernie Hogan, A comparison of on and offline networks through the Facebook API, December 1, 2008, Oxford Internet Institute, 2008
A good methodology for the development of paper-based sociograms is: Hogan, Carrasco and Wellman, Visualizing Personal Networks: Working with Participant-Aided Sociograms, Field Methods 2008
For large-scale social network information collected from mobile phone data, see Eagle, Pentland and Lazer, Inferring friendship network structure using mobile phone data, Proceedings of the National Academy of Sciences, 2009
A sense of the general debate over these new methods can be found in this recent New York Times article on analyzing historical court records, with one critic claiming that for much of data mining, “as yet it’s all method and no results.”