Social-networking open source visualization aids

Use Graphviz, the Google Chart API, and CAIDA's plot-latlong tool to analyze your social networks' attributes

Nathan Harrington, Programmer, IBM
Nathan Harrington is a programmer working with Linux at IBM. You can find more information about him at nathanharrington.info.

Summary:  Social-networking data analysis can help you understand content, connections, and opportunities for your personal and business associations. This article presents tools and code that use the Twitter API to extract key components of your social network, then chart, geolocate, and visualize that data.

Date:  06 Jan 2009
Level:  Intermediate

This article is a proof-of-concept that shows how to build applications to visualize your interconnections and influence. Graph common subject-matter keywords in your discussions and create geographical maps of your friends' locations. The code presented here relies on Perl, Graphviz, the Cooperative Association for Internet Data Analysis (CAIDA) plot-latlong, and the Google Chart API to create helpful visualizations to analyze your social networks.

Hardware and software requirements

Any PC manufactured after 2000 should provide plenty of horsepower for compiling and running the code here. As of this writing, CAIDA's plot-latlong tool requires a UNIX®-like operating system for geographical map creation. The other visualizations are made using curl and Graphviz, which are available for a wider variety of platforms.

You need Perl and the XML::Simple, Geo::Coder::Yahoo, and GD Perl modules, which process the social-networking data. A good image viewer, such as feh, is also recommended. To manipulate user images into a standard PNG format, the "convert" component of ImageMagick is required. See Resources for information on where to find these programs.

To install these applications on a Debian-based distribution of Linux®, such as Ubuntu, enter the following command in a terminal window:

sudo apt-get install perl feh imagemagick curl graphviz

You need to download plot-latlong manually. After unpacking the plot-latlong archive, copy the .mapimages directory and the .mapinfo file to your ${HOME} directory.

Although this article demonstrates the code on Linux, the data gathering and processing code can be adapted easily to work on any platform that supports Perl, such as Microsoft® Windows®.


Extracting social-network data using the Twitter API

Twitter's RESTful interface and clear API documentation provide excellent methods for you to access social-networking attributes. See Resources for more information about the Twitter API. Listing 1 shows the initial buildViz.pl program setup.


Listing 1. buildViz.pl, Part 1

#!/usr/bin/perl -w
# buildViz.pl create social networking visualizations
use strict;
use XML::Simple;

die "specify searchUser, username, password, mode " unless @ARGV == 4;
my( $search, $user, $pass, $mode ) = @ARGV;

my $cmd = "mkdir -p xml img";
system( $cmd ) unless( -d "xml" && -d "img" );

# get user's profile data
$cmd  = qq{ curl -u $user:$pass "http://twitter.com/users/show/$user.xml" };
$cmd .= qq{ > xml/$user.xml };

system( $cmd ) unless( -e "xml/$user.xml" );

# get profile image
my $xmlImg = XMLin( "xml/$user.xml" );
my $imgUrl = $xmlImg->{profile_image_url};
$cmd  = qq{ curl "$imgUrl" > img/$user.png ; };
$cmd .= qq{ convert -format png img/$user.png img/$user.png };
system( $cmd ) unless( -e "img/$user.png" );

# get user's friends (people that user is following)
$cmd  = qq{ curl -u $user:$pass "http://twitter.com/statuses/friends.xml" };
$cmd .= qq{ > xml/$user.friends.xml };
system( $cmd ) unless( -e "xml/$user.friends.xml" );

After the required modules are loaded and the command-line arguments (including the Twitter API credentials) are read, the xml and img directories are created and the XML for the specified user is retrieved. Note that you can create visualizations for any Twitter user who does not protect their updates. Good form requires that each XML file be retrieved only once, so a file is downloaded only if it does not already exist on the local filesystem. You'll need to delete these files manually if you want the most recent data.

Next, the image for the specified user is downloaded, along with a list of that user's friends. In concert with the Twitter API documentation, this article uses the terms "friends" and "people you are following" interchangeably. Listing 2 continues by retrieving the friends of each of the specified user's friends.


Listing 2. buildViz.pl, Part 2

my $xmlFriend = XMLin( "xml/$user.friends.xml" );

for my $name ( keys %{ $xmlFriend->{user} } )
{
  my $userFr = $xmlFriend->{user}->{$name}->{screen_name};

  # get friends' friends
  $cmd  = qq{ curl -u $user:$pass "http://twitter.com/statuses/friends/};
  $cmd .= qq{$userFr.xml?page=1" > xml/$userFr.friends.xml};
  system( $cmd ) unless( -e "xml/$userFr.friends.xml" );

  # get each friend's most recent 200 tweets
  $cmd  = qq{ curl -u $user:$pass "http://twitter.com/statuses/user_timeline/};
  $cmd .= qq{$userFr.xml?count=200" > xml/$userFr.user_timeline.xml};
  system( $cmd ) unless( -e "xml/$userFr.user_timeline.xml" );

  # get each friend's image (requires imagemagick convert)
  my $imgUrl = $xmlFriend->{user}->{$name}->{profile_image_url};
  $cmd  = qq{ curl "$imgUrl" > img/$userFr.png ; };
  $cmd .= qq{ convert -format png img/$userFr.png img/$userFr.png };
  system( $cmd ) unless( -e "img/$userFr.png" );

}#for each friend

As you review your social-networking connections, you may find that your friends share many friends. The unless( -e ... ) checks help reduce the burden on Twitter's servers by retrieving each XML file only once.
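The existence check described above can be factored into a small helper. The following is a minimal sketch, not part of buildViz.pl: the names fetch_unless_cached and fake_download are hypothetical, and the download is replaced by a stand-in that writes placeholder XML so the example runs without network access.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

# Hypothetical helper (not part of buildViz.pl): run $fetch only when
# $path does not already exist, mirroring the unless( -e ... ) checks.
# Returns 1 if the fetch ran, 0 if the existing file was reused.
sub fetch_unless_cached
{
  my( $path, $fetch ) = @_;
  return 0 if( -e $path );
  $fetch->( $path );
  return 1;
}

# Stand-in for the curl call: writes placeholder XML instead of
# contacting a server.
sub fake_download
{
  my( $path ) = @_;
  open( my $out, ">", $path ) or die "cannot write $path: $!\n";
  print $out "<users/>\n";
  close( $out );
}

my $dir  = tempdir( CLEANUP => 1 );        # scratch directory
my $file = "$dir/example.friends.xml";     # hypothetical file name

for my $attempt ( 1 .. 2 )
{
  my $ran = fetch_unless_cached( $file, \&fake_download );
  print "attempt $attempt: ", ( $ran ? "downloaded" : "used cached copy" ), "\n";
}
```

Deleting the cached file is what forces a fresh download on the next run, which is why stale XML files must be removed by hand.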

In addition to the "friends of friends" list, each friend's timeline is retrieved, along with that friend's profile image. Save the contents of Listings 1 and 2 as the file buildViz.pl and type the command perl buildViz.pl searchUser yourUserName yourPassword retrieve. In this case, searchUser is the username of the Twitter user whose social-networking data you want to retrieve, yourUserName and yourPassword are your authentication credentials, and retrieve is a placeholder mode that specifies XML downloads only. Note that Listings 1 and 2 as shown build their request URLs and file names from yourUserName; searchUser is read from the command line but not otherwise used in them.

The buildViz.pl program creates the img and xml subdirectories and fills them with files like those shown below.


Listing 3. Example img/ xml/ directories

 87953 2008-11-26 08:21 xml/agberg.friends.xml
187263 2008-11-26 08:21 xml/agberg.user_timeline.xml
 85451 2008-11-26 08:23 xml/alphaworks.friends.xml
 50967 2008-11-26 08:23 xml/alphaworks.user_timeline.xml
 85854 2008-11-26 08:21 xml/andysc.friends.xml
163570 2008-11-26 08:21 xml/andysc.user_timeline.xml
 83236 2008-11-26 08:23 xml/BillHiggins.friends.xml
177740 2008-11-26 08:23 xml/BillHiggins.user_timeline.xml
...
  5626 2008-11-26 08:21 img/agberg.png
  5753 2008-11-26 08:23 img/alphaworks.png
  2080 2008-11-26 08:21 img/andysc.png
  4527 2008-11-26 08:23 img/BillHiggins.png


Developing interconnections data and visualization using Graphviz

One way to estimate how much influence a user can have on a friend is to count how many friends that friend has. In theory, users with fewer friends have more time to follow social-networking updates and respond to questions, so they are easier to influence. Add the contents of Listing 4 at line 53 in buildViz.pl.


Listing 4. visualizeInfluence subroutine

visualizeInfluence() if( $mode eq "influence" );

### begin subroutines

sub visualizeInfluence
{
  my %frHash = ();
  my $xmlFriend = XMLin( "xml/$user.friends.xml" );
  for my $name ( keys %{ $xmlFriend->{user} } )
  {
    my $userFr = $xmlFriend->{user}->{$name}->{screen_name};
    my $xmlSec = XMLin( "xml/$userFr.friends.xml" );

    $frHash{ $userFr } = 0;
    for my $linkUser( keys %{ $xmlSec->{user} } ){  $frHash{$userFr}++  }

  }#for each friend

  my $infList = "1 $user\n";
  for my $name ( sort {$frHash{$a} <=> $frHash{$b}} keys %frHash )
  {
    $infList .= "$frHash{$name} $name\n";
    last if( ($infList =~ tr/\n//) == 15 );  # exit after fifteen lines

  }# for each key sorted

  chomp($infList); # remove last newline
  $cmd  = qq{ echo "$infList" | perl twitdot.pl $user img > influence.fdp ; };
  $cmd .= qq{ fdp influence.fdp -Tpng -o graphviz_influence.png };

  system($cmd);

}#visualizeInfluence

Each friend's list of friends is counted, and the friends with the fewest friends (the most "influence-able") are added to the $infList variable, which is capped at 15 lines including the user's own entry. These count and friend-name pairs are passed as input to the twitdot.pl program. Based on code from the "Explore relationships among Web pages visually" article, the twitdot.pl program generates fdp graph-generation syntax for Graphviz. Consult that article and the code Download section for more information about the modifications necessary for this particular visualization.

Next, fdp is called with the fdp graph syntax file to generate the visualization. Run the program with the command perl buildViz.pl searchUser yourUserName yourPassword influence and view the output file (graphviz_influence.png) in your favorite image viewer. Figure 1 shows an example of what this can look like.


Figure 1. Example graphviz_influence.png

The width and color of the arrows indicate the "influence-ability" of each of the friends, based on the number of friends they have.
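The counting and ranking step in visualizeInfluence can be tried in isolation. This is a minimal sketch with hypothetical friend counts; the influence_list name and the tie-break on screen name are assumptions added for the example and are not part of buildViz.pl.

```perl
use strict;
use warnings;

# Hypothetical friend counts (screen name => number of friends). In
# buildViz.pl these come from counting entries in each friends.xml file.
my %frHash = ( alice => 42, bob => 7, carol => 100, dave => 19 );

# Build "count name" lines, fewest friends first, stopping once the
# list holds $maxLines lines in total (the user's own line included).
sub influence_list
{
  my( $user, $maxLines, %counts ) = @_;
  my @lines = ( "1 $user" );
  for my $name ( sort { $counts{$a} <=> $counts{$b} or $a cmp $b } keys %counts )
  {
    last if( @lines >= $maxLines );
    push @lines, "$counts{$name} $name";
  }
  return join( "\n", @lines );
}

# Prints the user's line followed by the two friends with the fewest friends.
print influence_list( "me", 3, %frHash ), "\n";
```

Sorting in ascending order is what puts the most "influence-able" friends first; reversing the comparison would instead list the most connected friends.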


Developing keyword data and visualization using the Google Chart API

Influence has been measured, but what about content? Add the code shown in Listing 5 at line 87 in buildViz.pl to create a chart showing the most commonly used words in your friends' message histories.


Listing 5. visualizeKeywords subroutine

sub visualizeKeywords
{
  my %wordHash = ();
  my $xmlFriend = XMLin( "xml/$user.friends.xml" );
  for my $name ( keys %{ $xmlFriend->{user} } )
  {
    my $userFr = $xmlFriend->{user}->{$name}->{screen_name};
    my $xmlSec = XMLin( "xml/$userFr.user_timeline.xml" );

    for my $linkUser(  keys %{ $xmlSec->{status} }  )
    {
      my $msgText = $xmlSec->{status}->{$linkUser}->{text};
      for my $key( split " ", lc($msgText) ){  $wordHash{$key}++  }

    }#for each text update 

  }#for each friend

  my $tStr = "";
  my $chlStr = "";
  for my $word ( sort {$wordHash{$b} <=> $wordHash{$a}} keys %wordHash )
  {
    next unless( length($word) > 10 );    # only print 'long' entries
    $tStr .= "$wordHash{$word},";         # append url data
    $chlStr .= "$word|";                  # append url labels
    last if( ($tStr =~ s/,/,/g) == 10 );  # exit loop after first ten words

  }#for the top words

  chop($tStr); chop($chlStr);  # remove trailing delimiters

  $cmd  = qq{ curl "http://chart.apis.google.com/chart?cht=p&chd=t:$tStr};
  $cmd .= qq{&chs=1000x300&chl=$chlStr" > chart_keywords.png };
  system($cmd);

}#visualizeKeywords

Each word from each of your friends' timelines is recorded in the %wordHash variable. To focus on some of the more significant verbiage, only words longer than 10 characters are graphed. The 10 most frequent words meeting this requirement, together with their frequency counts, are then packed into a URL, and the chart is generated with the Google Chart API. Check the Resources section for more information about the URL formats and the options available with Google Charts.
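The word counting and URL packing can be sketched separately from the XML handling. The example below is a hedged illustration using hypothetical sample messages. The helper names count_words, url_escape, and chart_url are assumptions, and the percent-encoding of labels is an addition that Listing 5 does not perform. It guards against characters such as & or quotation marks ending up in the URL.

```perl
use strict;
use warnings;

# Count lowercase, whitespace-separated words across a list of messages.
sub count_words
{
  my %wordHash = ();
  for my $msg ( @_ )
  {
    $wordHash{$_}++ for split( " ", lc( $msg ) );
  }
  return %wordHash;
}

# Percent-encode everything except unreserved URL characters.
# Character strings are converted to UTF-8 bytes first.
sub url_escape
{
  my( $text ) = @_;
  utf8::encode( $text ) if( utf8::is_utf8( $text ) );
  $text =~ s/([^A-Za-z0-9\-._~])/sprintf( "%%%02X", ord( $1 ) )/ge;
  return $text;
}

# Build a pie chart URL from the $limit most frequent words that are
# longer than $minLen characters.
sub chart_url
{
  my( $minLen, $limit, %wordHash ) = @_;
  my @top = grep { length( $_ ) > $minLen }
            sort { $wordHash{$b} <=> $wordHash{$a} or $a cmp $b } keys %wordHash;
  splice( @top, $limit ) if( @top > $limit );
  my $data   = join( ",", map { $wordHash{$_} } @top );
  my $labels = join( "|", map { url_escape( $_ ) } @top );
  return "http://chart.apis.google.com/chart?cht=p&chd=t:$data&chs=1000x300&chl=$labels";
}

# Hypothetical sample messages standing in for timeline text.
my %counts = count_words( "Visualization tools", "more visualization & social-networking" );
print chart_url( 10, 10, %counts ), "\n";
```

Joining the values and labels with join avoids the trailing-delimiter cleanup that the string-appending approach needs.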

Add the subroutine call shown below to buildViz.pl at line 54.


Listing 6. visualizeKeywords logic call

visualizeKeywords()  if( $mode eq "keywords"  );

Run the keyword visualization with the command perl buildViz.pl searchUser yourUserName yourPassword keywords. View the output chart_keywords.png file with your image viewer. Figure 2 demonstrates what this can look like.


Figure 2. Example chart_keywords.png


Developing geolocated data and visualization using plot-latlong

After charting who can be influenced and what is being said, we can move on to visualizing where in the world these people are. Add the code shown in Listing 7 at line 125 in buildViz.pl.


Listing 7. visualizeLocations subroutine

sub visualizeLocations
{
  use Geo::Coder::Yahoo;
  my $geocoder = Geo::Coder::Yahoo->new(appid => 'my_app' );

  open( LOCOUT, ">locationNames" ) or die "no locationNames out\n";
  open( COORDS, ">cityCoords" )    or die "no cityCoords out \n";

  # record all friends' geographical locations
  my $xmlFriend = XMLin( "xml/$user.friends.xml" );
  for my $name ( keys %{ $xmlFriend->{user} } )
  {
    my $userLoc = $xmlFriend->{user}->{$name}->{location};
    my $imgName = $xmlFriend->{user}->{$name}->{screen_name};
    my $location = $geocoder->geocode( location => "$userLoc" );

    for my $coords( @{$location} )
    {
      my %hashRef = %{ $coords };
      print "$hashRef{latitude} $hashRef{longitude} # $userLoc\n";
      print COORDS "$hashRef{latitude} $hashRef{longitude} # $userLoc\n";
      print LOCOUT "$userLoc ##$imgName.png\n";

    }#for coordinates returned  

  }#for each friend

  close( COORDS ); close( LOCOUT );

  # draw the map
  $cmd  = qq{ cat cityCoords | perl plot-latlong -s 5 -c };
  $cmd .= qq{ > cityMap.png 2>cityPixels };
  system( $cmd );

  # Annotate the map with the first 7 friends' information
  $cmd  = qq{ head -n7 locationNames > 7.locationNames ; };
  $cmd .= qq{ head -n7 cityPixels > 7.cityPixels ; };
  $cmd .= qq{ perl worldCompositeMap.pl 7.cityPixels 7.locationNames };
  $cmd .= qq{ cityMap.png worldCityMap_annotated.png };
  system($cmd);

}#visualizeLocations

This section again makes use of previously published developerWorks code: the worldCompositeMap.pl program is detailed in "Create geographical plots of your data using Perl, GD, and plot-latlong." Using the Geo::Coder::Yahoo module, it's relatively easy to record the coordinates of your friends' locations in the cityCoords file, and the associated name and image data in the locationNames file.

The first seven friends' locations and identifiers are then passed to worldCompositeMap.pl for rendering. Consult that article or the Download section for more information about the worldCompositeMap.pl program.
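The two output files are written in step, one line per geocoding result, which is what lets the first seven lines of each be paired later. The sketch below shows that pairing with hypothetical, illustrative data shaped like the values Listing 7 reads (a location string and hash references with latitude and longitude keys). The location_lines name is an assumption and no geocoding service is contacted.

```perl
use strict;
use warnings;

# Hypothetical geocoding results keyed by screen name. The coordinates
# are illustrative values only, not output from a real geocoder.
my %friends = (
  alice => { location => "Raleigh, NC",
             results  => [ { latitude => 35.78, longitude => -78.64 } ] },
  bob   => { location => "London",
             results  => [ { latitude => 51.51, longitude => -0.13 } ] },
  carol => { location => "somewhere",
             results  => [] },   # geocoder found nothing
);

# Return two array references: lines in the cityCoords format and lines
# in the locationNames format, with matching line positions.
sub location_lines
{
  my( %data ) = @_;
  my( @coords, @names );
  for my $screenName ( sort keys %data )
  {
    my $loc = $data{$screenName}{location};
    for my $hit ( @{ $data{$screenName}{results} } )
    {
      push @coords, "$hit->{latitude} $hit->{longitude} # $loc";
      push @names,  "$loc ##$screenName.png";
    }
  }
  return( \@coords, \@names );
}

my( $coords, $names ) = location_lines( %friends );
print "$_\n" for @{ $coords };
print "$_\n" for @{ $names };
```

A friend whose location cannot be geocoded contributes no line to either list, so the two lists stay aligned.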

Add the subroutine call shown in Listing 8 at line 55 in buildViz.pl.


Listing 8. visualizeLocations logic call

visualizeLocations() if( $mode eq "locations" );

Run the command perl buildViz.pl searchUser yourUserName yourPassword locations to build the worldCityMap_annotated.png file, and open that file in your image viewer. Figure 3 is an example of what this can look like.
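With three modes in place, the chain of if( $mode eq ... ) calls could also be written as a dispatch table. This is only a sketch of one possible alternative, not the structure used in buildViz.pl: the subroutines below are stubs that return their mode name, and run_mode is a hypothetical helper. One difference is that an unrecognized mode produces an error instead of silently doing nothing.

```perl
use strict;
use warnings;

# Stubs standing in for the real subroutines in buildViz.pl.
sub visualizeInfluence { return "influence" }
sub visualizeKeywords  { return "keywords"  }
sub visualizeLocations { return "locations" }

# Map each mode name to the code that handles it. The retrieve mode
# does nothing extra because downloads happen before dispatch.
my %modes = (
  retrieve  => sub { return "retrieve" },
  influence => \&visualizeInfluence,
  keywords  => \&visualizeKeywords,
  locations => \&visualizeLocations,
);

# Look up and run the handler for $mode, or die listing the valid modes.
sub run_mode
{
  my( $mode ) = @_;
  my $handler = $modes{$mode}
    or die "unknown mode '$mode'; expected one of: " . join( ", ", sort keys %modes ) . "\n";
  return $handler->();
}

print run_mode( "keywords" ), "\n";
```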


Figure 3. Example worldCityMap_annotated.png


Conclusion, further examples

With the code and tools presented here, you can create a variety of visualizations to help analyze attributes of your social network. Use these tools to track keywords as they spread through your network of friends. Visualize the paths of particular links as they travel to different areas of activity around the world. Create charts and analysis that help your employers see the deep value of social networking.



Download

Description    Name                                           Size    Download method
Sample code    os-socialtoolstwitterVisualizations.0.1.zip            HTTP

