Social-networking open source visualization aids

Use Graphviz, the Google Chart API, and CAIDA's plot-latlong tool to analyze your social networks' attributes

Social-networking data analysis can help you understand content, connections, and opportunities for your personal and business associations. This article presents tools and code to extract key components of your social network using the Twitter API to chart, geolocate, and visualize your social-networking data.

This article is a proof-of-concept that shows how to build applications to visualize your interconnections and influence. Graph common subject-matter keywords in your discussions and create geographical maps of your friends' locations. The code presented here relies on Perl, Graphviz, the Cooperative Association for Internet Data Analysis (CAIDA) plot-latlong, and the Google Chart API to create helpful visualizations to analyze your social networks.

Hardware and software requirements

Any PC manufactured after 2000 should provide plenty of horsepower for compiling and running the code here. As of this writing, CAIDA's plot-latlong tool requires a UNIX®-like operating system for geographical map creation. The other visualizations are made using curl and Graphviz, which are available for a wider variety of platforms.

You need Perl and the XML::Simple, Geo::Coder::Yahoo, and GD Perl modules, which process the social-networking data. A good image viewer, such as feh, is also recommended. To manipulate user images into a standard PNG format, the "convert" component of ImageMagick is required. See Resources for information on where to find these programs.

To install these applications on a Debian-based distribution of Linux®, such as Ubuntu, enter the following command in a terminal window: sudo apt-get install perl feh imagemagick curl graphviz. The Perl modules listed above are available from CPAN. You need to download plot-latlong manually. After unpacking the plot-latlong archive, copy the .mapimages directory and the .mapinfo file to your ${HOME} directory.

Although this article demonstrates the code on Linux, the data gathering and processing code can be adapted easily to work on any platform that supports Perl, such as Microsoft® Windows®.


Extracting social-network data using the Twitter API

Twitter's RESTful interface and clear API documentation provide excellent methods for you to access social-networking attributes. See Resources for more information about the Twitter API. Listing 1 shows the initial buildViz.pl program setup.

Listing 1. buildViz.pl, Part 1
#!/usr/bin/perl -w
# buildViz.pl create social networking visualizations
use strict;
use XML::Simple;

die "specify searchUser, username, password, mode " unless @ARGV == 4;
my( $search, $user, $pass, $mode ) = @ARGV;

my $cmd = "mkdir -p xml img";  # -p: no error if one directory already exists
system( $cmd ) unless( -d "xml" && -d "img" );

# get user's profile data
$cmd  = qq{ curl -u $user:$pass "http://twitter.com/users/show/$user.xml" };
$cmd .= qq{ > xml/$user.xml };

system( $cmd ) unless( -e "xml/$user.xml" );

  # get profile image
  my $xmlImg = XMLin( "xml/$user.xml" );
  my $imgUrl = $xmlImg->{profile_image_url};
  $cmd  = qq{ curl "$imgUrl" > img/$user.png ; };
  $cmd .= qq{ convert -format png img/$user.png img/$user.png };
  system( $cmd ) unless( -e "img/$user.png" );

# get user's friends (people that user is following)
$cmd  = qq{ curl -u $user:$pass "http://twitter.com/statuses/friends.xml" };
$cmd .= qq{ > xml/$user.friends.xml };
system( $cmd ) unless( -e "xml/$user.friends.xml" );

After the required modules are loaded and the Twitter API credentials are read from the command line, the program creates the xml and img directories and retrieves the XML profile for the specified user. Note that you can create visualizations for any Twitter user who does not protect their updates. Good form requires that each XML file be retrieved only once, so a file is downloaded only if it does not already exist on the local filesystem. Delete these files manually if you need the most recent data.
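The retrieve-once pattern described above can be sketched in isolation. This is a minimal illustration, not part of buildViz.pl: the fetch_unless_cached helper and the stand-in fetcher are hypothetical names, and the sketch tests for a non-empty file (-s) rather than mere existence (-e), on the assumption that an empty file left behind by a failed download should be retried.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

# Hypothetical helper (not part of buildViz.pl): run $fetch only when
# $path is missing or empty, mirroring the "unless( -e ... )" checks.
sub fetch_unless_cached
{
  my( $path, $fetch ) = @_;
  return 0 if( -s $path );   # a non-empty file already exists: reuse it
  $fetch->( $path );         # otherwise perform the retrieval
  return 1;
}

# Stand-in fetcher that writes a placeholder document instead of calling curl
my $calls = 0;
my $fakeFetch = sub {
  my( $path ) = @_;
  open( my $out, '>', $path ) or die "cannot write $path: $!";
  print $out "<user><screen_name>example</screen_name></user>\n";
  close( $out );
  $calls++;
};

my $dir  = tempdir( CLEANUP => 1 );   # scratch directory, removed at exit
my $file = "$dir/example.xml";

fetch_unless_cached( $file, $fakeFetch );   # first call writes the file
fetch_unless_cached( $file, $fakeFetch );   # second call is skipped
print "fetches performed: $calls\n";
```

In the real program, the code reference would wrap the curl command, and deleting the cached file forces a fresh download on the next run.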

Next, the image for the specified user is downloaded, along with a list of that user's friends. In keeping with the Twitter API documentation, this article uses the terms "friends" and "people you are following" interchangeably. Listing 2 continues by retrieving data for each of the specified user's friends.

Listing 2. buildViz.pl, Part 2
my $xmlFriend = XMLin( "xml/$user.friends.xml" );

for my $name ( keys %{ $xmlFriend->{user} } )
{
  my $userFr = $xmlFriend->{user}->{$name}->{screen_name};

  # get friends' friends
  $cmd  = qq{ curl -u $user:$pass "http://twitter.com/statuses/friends/};
  $cmd .= qq{$userFr.xml?page=1" > xml/$userFr.friends.xml};
  system( $cmd ) unless( -e "xml/$userFr.friends.xml" );

  # get friend's most recent 200 tweets
  $cmd  = qq{ curl -u $user:$pass "http://twitter.com/statuses/user_timeline/};
  $cmd .= qq{$userFr.xml?count=200" > xml/$userFr.user_timeline.xml};
  system( $cmd ) unless( -e "xml/$userFr.user_timeline.xml" );

  # get friend's image (requires imagemagick convert)
  my $imgUrl = $xmlFriend->{user}->{$name}->{profile_image_url};
  $cmd  = qq{ curl "$imgUrl" > img/$userFr.png ; };
  $cmd .= qq{ convert -format png img/$userFr.png img/$userFr.png };
  system( $cmd ) unless( -e "img/$userFr.png" );

}#for each friend

As you review your social-networking connections, you may find that your friends share many friends. The unless( -e ... ) checks help reduce the burden on Twitter's servers by retrieving each XML file only once. The loop can iterate over the keys of $xmlFriend->{user} because XML::Simple, by default, folds repeated elements into a hash keyed on each element's name field.

In addition to the "friends of friends" list, each friend's timeline is retrieved, along with that friend's profile image. Save the contents of Listings 1 and 2 as the file buildViz.pl and type the command perl buildViz.pl searchUser yourUserName yourPassword retrieve. In this case, searchUser is the username of the Twitter user whose social-networking data you want to retrieve. yourUserName and yourPassword are your authentication credentials, and retrieve is a placeholder mode that specifies XML downloads only. Note that the listings as shown read searchUser into $search but build every request and file name from $user, so the data retrieved is that of the authenticating account.

The buildViz.pl program creates the img and xml subdirectories and fills them with files like those shown in Listing 3.

Listing 3. Example img/ xml/ directories
 87953 2008-11-26 08:21 xml/agberg.friends.xml
187263 2008-11-26 08:21 xml/agberg.user_timeline.xml
 85451 2008-11-26 08:23 xml/alphaworks.friends.xml
 50967 2008-11-26 08:23 xml/alphaworks.user_timeline.xml
 85854 2008-11-26 08:21 xml/andysc.friends.xml
163570 2008-11-26 08:21 xml/andysc.user_timeline.xml
 83236 2008-11-26 08:23 xml/BillHiggins.friends.xml
177740 2008-11-26 08:23 xml/BillHiggins.user_timeline.xml
...
  5626 2008-11-26 08:21 img/agberg.png
  5753 2008-11-26 08:23 img/alphaworks.png
  2080 2008-11-26 08:21 img/andysc.png
  4527 2008-11-26 08:23 img/BillHiggins.png

Developing interconnections data and visualization using Graphviz

One way to estimate how much influence a user can have on a particular friend is to count the number of friends that friend has. In theory, users with fewer friends have more time to follow social-networking updates and respond to questions. Add the contents of Listing 4 at line 53 in buildViz.pl.

Listing 4. visualizeInfluence subroutine
visualizeInfluence() if( $mode eq "influence" );

### begin subroutines

sub visualizeInfluence
{
  my %frHash = ();
  my $xmlFriend = XMLin( "xml/$user.friends.xml" );
  for my $name ( keys %{ $xmlFriend->{user} } )
  {
    my $userFr = $xmlFriend->{user}->{$name}->{screen_name};
    my $xmlSec = XMLin( "xml/$userFr.friends.xml" );

    $frHash{ $userFr } = 0;
    for my $linkUser( keys %{ $xmlSec->{user} } ){  $frHash{$userFr}++  }

  }#for each friend

  my $infList = "1 $user\n";
  for my $name ( sort {$frHash{$a} <=> $frHash{$b}} keys %frHash )
  {
    $infList .= "$frHash{$name} $name\n";
    last if( ($infList =~ s/\n/\n/g) == 15 );  # exit after fifteen lines

  }# for each key sorted

  chop($infList); # remove last newline
  $cmd  = qq{ echo "$infList" | perl twitdot.pl $user img > influence.fdp ; };
  $cmd .= qq{ fdp influence.fdp -Tpng -o graphviz_influence.png };

  system($cmd);

}#visualizeInfluence

Each friend's list of friends is counted, and the friends with the fewest friends (the most "influence-able") are added to the $infList variable, which is capped at 15 lines including the initial entry for the user. These count and friend-name pairs are passed as input to the twitdot.pl program. Based on code from the "Explore relationships among Web pages visually" article, the twitdot.pl program generates fdp graph-generation syntax for Graphviz. Consult that article and the Download section for more information about the modifications necessary for this particular visualization.
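The selection step can also be written with an explicit limit instead of counting newlines in a string. The following is a minimal sketch under stated assumptions: the least_connected subroutine and the sample counts are hypothetical, and ties are broken alphabetically, which Listing 4 does not do.

```perl
use strict;
use warnings;

# Hypothetical friend counts; in buildViz.pl these come from %frHash
my %frHash = ( alice => 40, bob => 12, carol => 95, dave => 12, erin => 3 );

# Return up to $limit "count name" lines, fewest friends first.
sub least_connected
{
  my( $counts, $limit ) = @_;
  my @names = sort { $counts->{$a} <=> $counts->{$b} or $a cmp $b }
              keys %$counts;
  splice( @names, $limit ) if( @names > $limit );   # drop everything past the limit
  return map { "$counts->{$_} $_" } @names;
}

my @lines = least_connected( \%frHash, 3 );
print join( "\n", @lines ), "\n";
```

With the sample data, the three entries selected are erin, bob, and dave, in that order.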

Next, fdp is called with the fdp graph syntax file to generate the visualization. Run the program with the command perl buildViz.pl searchUser yourUserName yourPassword influence and view the output file (graphviz_influence.png) in your favorite image viewer. Figure 1 shows an example of what this can look like.

Figure 1. Example graphviz_influence.png

The width and color of the arrows indicate the "influence-ability" of each of the friends, based on the number of friends they have.
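The actual twitdot.pl program is in the download and is not reproduced here. As a rough illustration of how a friend count could be mapped to arrow width, the hypothetical sketch below emits Graphviz syntax using the standard penwidth edge attribute. The influence_dot name, the 1-to-5 width scale, and the sample counts are all assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical sketch, not the twitdot.pl from the download: emit a
# Graphviz digraph whose edge width shrinks as the friend count grows.
sub influence_dot
{
  my( $center, $counts ) = @_;
  my( $max ) = sort { $b <=> $a } values %$counts;   # largest friend count
  $max ||= 1;                                        # avoid division by zero
  my $dot = "digraph influence {\n";
  for my $name ( sort keys %$counts )
  {
    # scale to the range 1..5: the fewer friends, the thicker the arrow
    my $width = 1 + 4 * ( 1 - $counts->{$name} / $max );
    $dot .= sprintf( qq{  "%s" -> "%s" [penwidth=%.1f];\n},
                     $center, $name, $width );
  }
  return $dot . "}\n";
}

print influence_dot( "example_user", { alice => 100, bob => 50, carol => 10 } );
```

Output like this can be saved to a file and rendered with the same fdp command used in Listing 4.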


Developing keyword data and visualization using the Google Chart API

Influence has been measured, but what about content? Add the code shown in Listing 5 at line 87 in buildViz.pl to create a chart showing the most commonly used words in your friends' recent updates.

Listing 5. visualizeKeywords subroutine
sub visualizeKeywords
{
  my %wordHash = ();
  my $xmlFriend = XMLin( "xml/$user.friends.xml" );
  for my $name ( keys %{ $xmlFriend->{user} } )
  {
    my $userFr = $xmlFriend->{user}->{$name}->{screen_name};
    my $xmlSec = XMLin( "xml/$userFr.user_timeline.xml" );

    for my $linkUser(  keys %{ $xmlSec->{status} }  )
    {
      my $msgText = $xmlSec->{status}->{$linkUser}->{text};
      for my $key( split " ", lc($msgText) ){  $wordHash{$key}++  }

    }#for each text update 

  }#for each friend

  my $tStr = "";
  my $chlStr = "";
  for my $word ( sort {$wordHash{$b} <=> $wordHash{$a}} keys %wordHash )
  {
    next unless( length($word) > 10 );    # only print 'long' entries
    $tStr .= "$wordHash{$word},";         # append url data
    $chlStr .= "$word|";                  # append url labels
    last if( ($tStr =~ s/,/,/g) == 10 );  # exit loop after first ten words

  }#for the top words

  chop($tStr); chop($chlStr);  # remove trailing delimiters

  $cmd  = qq{ curl "http://chart.apis.google.com/chart?cht=p&chd=t:$tStr};
  $cmd .= qq{&chs=1000x300&chl=$chlStr" > chart_keywords.png };
  system($cmd);

}#visualizeKeywords

Each word from each of your friends' timelines is counted in the %wordHash variable. To focus on the more significant verbiage, only words longer than 10 characters are graphed. The top 10 words meeting this requirement and their frequency counts are then packed into a URL for chart generation using the Google Chart API. Check the Resources section for more information about the URL formats and the options available with Google Charts.

Add the subroutine call shown in Listing 6 to buildViz.pl at line 54.
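To see the counting and filtering in isolation, here is a minimal sketch that runs on a few made-up updates rather than on timeline XML. The sample text and the $minLen and $limit variables are assumptions for illustration. The sketch builds only the chd and chl portions of the chart URL and makes no network request.

```perl
use strict;
use warnings;

# Hypothetical sample updates; buildViz.pl reads these from the timeline XML
my @updates = (
  "Visualization of social networking data",
  "More visualization experiments with Graphviz",
  "Geolocation and visualization",
);

# count every whitespace-separated word, ignoring case
my %wordHash;
for my $text ( @updates )
{
  $wordHash{$_}++ for split( " ", lc( $text ) );
}

# keep words longer than $minLen, most frequent first, at most $limit of them
my( $minLen, $limit ) = ( 10, 10 );
my @top = grep { length( $_ ) > $minLen }
          sort { $wordHash{$b} <=> $wordHash{$a} or $a cmp $b } keys %wordHash;
splice( @top, $limit ) if( @top > $limit );

my $chd = join( ",", map { $wordHash{$_} } @top );   # chart data values
my $chl = join( "|", @top );                         # pie slice labels
print "chd=t:$chd&chl=$chl\n";
# prints: chd=t:3,1,1&chl=visualization|experiments|geolocation
```

Note that "networking", at exactly 10 characters, is excluded by the length test, just as it would be in Listing 5.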

Listing 6. visualizeKeywords logic call
visualizeKeywords()  if( $mode eq "keywords"  );

Run the keyword visualization with the command perl buildViz.pl searchUser yourUserName yourPassword keywords. View the output chart_keywords.png file with your image viewer. Figure 2 demonstrates what this can look like.

Figure 2. Example chart_keywords.png

Developing geolocated data and visualization using plot-latlong

After charting who can be influenced and what is being said, we can move on to visualizing where in the world these people are. Add the code shown in Listing 7 at line 125 in buildViz.pl.

Listing 7. visualizeLocations subroutine
sub visualizeLocations
{
  use Geo::Coder::Yahoo;
  my $geocoder = Geo::Coder::Yahoo->new(appid => 'my_app' );

  open( LOCOUT, ">locationNames" ) or die "no locationNames out\n";
  open( COORDS, ">cityCoords" )    or die "no cityCoords out \n";

  # record all friends' geographical locations
  my $xmlFriend = XMLin( "xml/$user.friends.xml" );
  for my $name ( keys %{ $xmlFriend->{user} } )
  {
    my $userLoc = $xmlFriend->{user}->{$name}->{location};
    my $imgName = $xmlFriend->{user}->{$name}->{screen_name};
    my $location = ref( $userLoc ) ? [] : $geocoder->geocode( location => $userLoc );  # XML::Simple returns {} for an empty <location>

    for my $coords( @{$location} )
    {
      my %hashRef = %{ $coords };
      print "$hashRef{latitude} $hashRef{longitude} # $userLoc\n";
      print COORDS "$hashRef{latitude} $hashRef{longitude} # $userLoc\n";
      print LOCOUT "$userLoc ##$imgName.png\n";

    }#for coordinates returned  

  }#for each friend

  close( COORDS ); close( LOCOUT );

  # draw the map
  $cmd  = qq{ cat cityCoords | perl plot-latlong -s 5 -c };
  $cmd .= qq{ > cityMap.png 2>cityPixels };
  system( $cmd );

  # Annotate the map with the first 7 friends' information
  $cmd  = qq{ head -n7 locationNames > 7.locationNames ; };
  $cmd .= qq{ head -n7 cityPixels > 7.cityPixels ; };
  $cmd .= qq{ perl worldCompositeMap.pl 7.cityPixels 7.locationNames };
  $cmd .= qq{ cityMap.png worldCityMap_annotated.png };
  system($cmd);

}#visualizeLocations

This step again reuses previously published developerWorks code: the worldCompositeMap.pl program is detailed in "Create geographical plots of your data using Perl, GD, and plot-latlong." With the Geo::Coder::Yahoo module, it's relatively easy to record the city coordinates for your friends' locations in the cityCoords file, and the associated name and image data in the locationNames file.

The first seven friends' locations and identifiers are then passed to worldCompositeMap.pl for rendering. Consult that article or the Download section for more information about the worldCompositeMap.pl program.
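Free-form location fields do not always geocode cleanly, so it can help to check each result before writing it out. The sketch below is a hypothetical addition, not part of Listing 7: the coord_line helper and the sample results are made up, and the only format assumed is the "latitude longitude # label" line that Listing 7 writes to cityCoords.

```perl
use strict;
use warnings;
use Scalar::Util qw( looks_like_number );

# Hypothetical helper: build one "latitude longitude # label" line in the
# format Listing 7 writes to cityCoords, or return undef for unusable input.
sub coord_line
{
  my( $lat, $lon, $label ) = @_;
  return undef unless( defined( $lat ) && defined( $lon ) );
  return undef unless( looks_like_number( $lat ) && looks_like_number( $lon ) );
  return undef if( abs( $lat ) > 90 || abs( $lon ) > 180 );   # out of range
  return "$lat $lon # $label\n";
}

# Made-up geocoder results, including one with missing coordinates
my @results = (
  { latitude => 35.78, longitude => -78.64, label => "Raleigh, NC" },
  { latitude => undef, longitude => undef,  label => "somewhere" },
  { latitude => 51.51, longitude => -0.13,  label => "London" },
);

# keep only the entries that produced a usable line
my @lines = grep { defined }
            map  { coord_line( @{$_}{qw( latitude longitude label )} ) } @results;
print @lines;
```

Only the two entries with valid coordinates are printed; the entry without coordinates is dropped silently.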

Add the subroutine call shown in Listing 8 at line 55 in buildViz.pl.

Listing 8. visualizeLocations logic call
visualizeLocations() if( $mode eq "locations" );

Run the command perl buildViz.pl searchUser yourUserName yourPassword locations to build the worldCityMap_annotated.png file, and open that file in your image viewer. Figure 3 is an example of what this can look like.

Figure 3. Example worldCityMap_annotated.png

Conclusion, further examples

With the code and tools presented here, you can create a variety of visualizations to help analyze attributes of your social network. Use these tools to track keywords as they spread through your network of friends. Visualize the paths of particular links as they travel to different areas of activity around the world. Create charts and analysis that help your employer see the value of social networking.


Download

Description   Name                                          Size
Sample code   os-socialtoolstwitterVisualizations.0.1.zip   ---

Resources

Learn

Get products and technologies

Discuss

publish-date=01062009