Create geographical plots of your data using Perl, GD, and plot-latlong

Develop custom data plots on world and U.S. maps, including region-shaded data representations

Using world and custom U.S. maps, Perl, GD, and the Cooperative Association for Internet Data Analysis (CAIDA) plot-latlong tool, this article demonstrates how to create your own effective data visualizations in the spirit of Google Maps and the U.S. National Atlas.


Nathan Harrington, Programmer, IBM

Nathan Harrington is a programmer at IBM currently working with Linux and resource-locating technologies.



10 April 2007


People and organizations often need to visualize data on maps. Existing mapping tools, such as Google Maps, provide excellent, if limited, options. This article uses simple Perl code to perform geocoding lookups with Yahoo's free geocoding service. With the acquired latitudes and longitudes, it then shows how to use the plot-latlong application from CAIDA, along with some annotation using GD, to produce maps that can be customized in nearly limitless ways.

After building the hybrid plot-latlong and Perl-GD maps, we'll modify the background map outlines. Using GD's fill and other tools, we'll create a "shadeable" version of a U.S. map in order to represent geographic points as well as state-related quantity data.

The tools and techniques described will allow you to use open source software and freely available geocoding services to create your own customized plots of geographical data. The APIs of existing mapping tools may allow you to plot points worldwide, but you might want an easy way to shade in a state, country, or region based on certain data. You may also want to produce maps with customizable indicators for thousands of points at once. GD, Perl, and plot-latlong are the tools you need to create highly effective maps quickly.

Requirements

Hardware

Any PC manufactured after year 2000 should provide plenty of horsepower for compiling and running the code in this article. CAIDA's plot-latlong and GD will be doing most of the processing, and these are very well-designed and fast programs.

Software

Assuming your operating environment of choice contains a recent version of Perl, you'll need to install the geocoding and GD Perl modules, as well as CAIDA's plot-latlong. Download and install the GD module from your favorite CPAN mirror. You also need to install the Geo::Coder::Yahoo Perl module and its dependency, Yahoo::Search.

After unpacking the plot-latlong archive, copy the .mapimages directory and the .mapinfo file to your ${HOME} directory. We'll make use of these later in the multipass map generation steps.


Geocoding physical addresses

For our purposes here, geocoding is defined as finding the latitude and longitude for a given address. Most of the useful data maps that can be created with plot-latlong resolve details at the city or suburban level of detail. Given a set of addresses, we will look up their latitude and longitude, then use plot-latlong to display them on a world map. Consider the following list of addresses:

Listing 1. Example locationNames data file
cary, nc
oswego, ny
potsdam, germany
beijing, china
london, england
bangalore, india
brasilia, brazil

The geocoder module and lookup service we will use are smart, so they know we mean the city of Cary, N.C., in the United States, and not a person named Cary in New Caledonia. Consider the following script to process your locationNames data file and create a geocoded data set:

Listing 2. yahooGeoCode.pl
#!/usr/bin/perl -w
# yahooGeoCode.pl - lookup long/lat of an address using yahoo
use strict;

use Geo::Coder::Yahoo;
my $geocoder = Geo::Coder::Yahoo->new(appid => 'my_app' );

while( my $line = <STDIN> )
{
  chomp($line);
  my $location = "";
  $location = $geocoder->geocode( location => "$line" );

  for ( @{$location} )
  {
    my %hash = %{$_};
    print "$hash{latitude} $hash{longitude} # $line \n";
  } 
}#while stdin

Run the geocoder with the command cat locationNames | perl yahooGeoCode.pl >cityCoords. This will give you a file of latitudes, longitudes, and city names, in that order. You can use any geocoder you like, as long as it produces a file in the cityCoords format shown in Listing 3. The Yahoo/Perl geocoding module from Ask Bjoern Hansen was chosen for this article for its elegance and simplicity.

Listing 3. Example cityCoords data file
35.791458 -78.781174 # cary, nc 
43.455471 -76.510048 # oswego, ny 
52.400002 13.07 # potsdam, germany 
40.25 116.5 # beijing, china 
51.52 -0.1 # london, england 
12.97 77.559998 # bangalore, india 
-15.78 -47.91 # brasilia, brazil
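
If you generate cityCoords with a different geocoder, it can help to check each line before plotting. The following is a minimal sketch, not part of the original download: the parse_coord_line helper is a hypothetical name, and the third sample line is made up to show a rejected record. It assumes only the "latitude longitude # label" layout shown in Listing 3.

```perl
#!/usr/bin/perl
# parseCoords.pl - sketch: parse and range-check "lat long # label" lines
use strict;
use warnings;

# Returns (lat, long, label) for a well-formed line, or an empty list.
# parse_coord_line is a hypothetical helper, not part of plot-latlong.
sub parse_coord_line {
    my ($line) = @_;
    my ( $lat, $long, $label ) =
        $line =~ /^\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*(?:#\s*(.*?))?\s*$/
        or return;

    # Latitude must lie in -90..90 and longitude in -180..180.
    return if abs($lat) > 90 or abs($long) > 180;
    return ( $lat, $long, defined $label ? $label : '' );
}

# Two lines from Listing 3 plus one deliberately invalid (made-up) line.
my @sample = (
    '35.791458 -78.781174 # cary, nc ',
    '51.52 -0.1 # london, england ',
    '95.0 10.0 # latitude out of range',
);

for my $line (@sample) {
    my @rec = parse_coord_line($line);
    if (@rec) {
        printf "%-18s lat=%.4f long=%.4f\n", $rec[2], $rec[0], $rec[1];
    }
    else {
        print "skipped: $line\n";
    }
}
```

A check like this catches swapped latitude/longitude columns only when the swap pushes a value out of range, so it is a sanity filter rather than a full validation.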

Annotated mapping with plot-latlong

Standard plot-latlong usage

With your geocoded records in the cityCoords file, you can now plot them on a world map with the plot-latlong program. Make sure the plot-latlong.pl program is in your current directory and run the program with the command cat cityCoords | perl plot-latlong.pl -s 5 >cityMap1.png. This will produce a standard plot-latlong-type graph with large rectangular indicators where the cities are.

Annotation of the map

GD can be used to annotate the map image files with title and authorship text, legend descriptions, and fancy borders. However, the real power of plot-latlong is its ability to return the precise pixel position of each geocoded location. With these pixel coordinates, we can draw lines emanating from the points, recolor them, connect them, fill the surrounding area with color, or do anything else GD can do. For example, suppose we want to build a key at the top of the map image with image identifiers and text. From each box containing the image identifiers and text, a line will be drawn to the associated coordinate on the map. The custom Perl script described in Listings 4 and 5 automates this process. To produce the required pixel coordinates, simply pass the -c option to plot-latlong, and the pixel coordinates will be printed to stderr. Re-run the plotting command: cat cityCoords | perl plot-latlong.pl -s 5 -c >cityMap1.png 2>cityPixels.

Listing 4. worldCompositeMap.pl, Section 1
#!/usr/bin/perl -w
# worldCompositeMap.pl - overlay text, images and shapes on map image
use strict;
use GD;

die "specify: pixelFile dataFile sourceImage destImage\n" unless @ARGV == 4;

# load the plot-latlong output map
my $newMap = GD::Image->newFromPng( $ARGV[2] )
  or die "can't read PNG $ARGV[2]";

my $imageWidth = $newMap->width();
my $imageHeight = $newMap->height();
my $topMargin = 110;  # extra space above the map for the annotation boxes
my $blankMap = GD::Image->new($imageWidth, $imageHeight+$topMargin);

# in a palette image, the first color allocated becomes the background
my $white = $blankMap->colorAllocate(255,255,255);
my $black = $blankMap->colorAllocate(0,0,0);

# copy the original map into the new image, below the top margin
$blankMap->copy( $newMap, 0,$topMargin, 0,0, $imageWidth,$imageHeight );

my @pixels = ();
my @data = ();
open( my $pixelFh, '<', $ARGV[0] ) or die "can't open pixels file: $!";
  while( my $line = <$pixelFh> ){ push @pixels, $line }
close($pixelFh);

open( my $dataFh, '<', $ARGV[1] ) or die "can't open data file: $!";
  while( my $line = <$dataFh> ){ push @data, $line }
close($dataFh);

my $posCount = 0;
my $textX = 25;
my $textY = 20;
my $annotateRowIncrement = 150;

The first section in the worldCompositeMap.pl program checks the command line and sets up the background image. The $blankMap variable is created as a new GD Image the same width as the input image, with extra space on top to hold the annotated information. The black and white colors are then specified to ensure that the image is colored as expected. The original plot-latlong output map is then copied into our blankMap, shifted down 110 pixels.

The next section loads the pixel coordinates into the @pixels array. These pixel values will be processed in Section 2 of the worldCompositeMap.pl code. Loading the @data array with information (place name and annotation image name, in this case) is performed the same way. Note that these files have corresponding sequential entries. Line 1 in the pixels data file specifies the coordinates of the place specified on line 1 in the city data file. Section 2 of the worldCompositeMap.pl program iterates over the data file and annotates the image.
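
Because the two files are matched purely by line position, a mismatch in line counts silently attaches labels to the wrong points. The sketch below shows one way to pair and check the entries before drawing. The pair_points helper and the sample values are hypothetical; the only assumption taken from the article is the one Listing 5 makes, that x and y are the third and fourth whitespace-separated fields of each pixel line.

```perl
#!/usr/bin/perl
# pairPoints.pl - sketch: pair pixel lines with data lines by position
use strict;
use warnings;

# Pair each pixel line with the data line at the same index.
# Dies if the two lists differ in length or a pixel line lacks x/y.
sub pair_points {
    my ( $pixels, $data ) = @_;
    die sprintf( "line count mismatch: %d pixel lines, %d data lines\n",
        scalar @$pixels, scalar @$data )
        unless @$pixels == @$data;

    my @paired;
    for my $i ( 0 .. $#$data ) {
        # x and y are taken from fields three and four, as in Listing 5
        my ( undef, undef, $x, $y ) = split " ", $pixels->[$i];
        die "pixel line " . ( $i + 1 ) . " has no x/y fields\n"
            unless defined $x && defined $y;

        ( my $label = $data->[$i] ) =~ s/\s+$//;    # trim trailing newline
        push @paired, { x => $x, y => $y, label => $label };
    }
    return @paired;
}

# Hypothetical sample input; real values come from plot-latlong -c.
my @pixels = ( "35.79 -78.78 212 148\n", "51.52 -0.10 498 96\n" );
my @data   = ( "cary, nc\n", "london, england\n" );

print "$_->{label}: ($_->{x}, $_->{y})\n" for pair_points( \@pixels, \@data );
```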

Listing 5. worldCompositeMap.pl, Section 2
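for my $dataLine ( @data )
{
  chomp $dataLine;  # strip the newline so the image filename is clean

  # the pixel x and y are the third and fourth fields of the pixel line
  my( undef, undef, $pointX, $pointY) =  split " ", $pixels[$posCount];
  my( $text, $faceImgName ) = split '##', $dataLine;
  die "no ##image in data line: $dataLine\n" unless defined $faceImgName;

  # label text at the top of the annotation box
  $blankMap->stringFT( $black, '/usr/share/fonts/bitstream-vera/Vera.ttf',
    8,0,$textX,$textY, "$text", 
    {   linespacing=>0.6,
        charmap  => 'Unicode',
    });

  # box outline, then a leader line from the box bottom to the map point
  $blankMap->rectangle( $textX-10, 5, $textX+120, 100, $black);
  $blankMap->line($textX+50, 100, $pointX, $pointY+$topMargin, $black );

  # shrink the 115x115 annotation image to 60x60 below the label
  my $faceImg = GD::Image->newFromPng( $faceImgName )
    or die "can't read PNG $faceImgName";
  $blankMap->copyResized( $faceImg, $textX,$textY+10, 0,0, 60,60, 115,115 );

  $textX += $annotateRowIncrement;
  $posCount++;

}#for each data line

open( my $out, '>', $ARGV[3] ) or die "can't write $ARGV[3]: $!";
  binmode $out;
  print $out $blankMap->png;
close($out);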

For each line in the data array, extract the precise pixel coordinates in the image. Acquire the specified text and annotation image filename from the data array, and write the specified text to the image. After the text has been drawn on the image, put a thin black box around the text and draw a line from the bottom of the box to the associated data point.

Next up is loading and copying a resized version of the annotation image into the box below the specified text. The "current" location is then incremented by the specified length towards the right of the image, and the process is repeated. After writing out the image, the program finishes, producing this output:
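
Since each annotation box is placed a fixed distance to the right of the previous one, the number of boxes that fit is limited by the width of the base map. The following sketch works through that arithmetic using the constants from Listings 4 and 5 (start at x=25, step 150, box right edge at $textX+120). The helper names are hypothetical, and the sketch makes no claim about the actual width of any plot-latlong map.

```perl
#!/usr/bin/perl
# boxLayout.pl - sketch: how many annotation boxes fit across an image
use strict;
use warnings;

# Layout constants copied from Listings 4 and 5.
my $startX      = 25;     # initial $textX
my $increment   = 150;    # $annotateRowIncrement
my $boxRightPad = 120;    # box spans $textX-10 .. $textX+120

# Right edge (pixel x) of the n-th annotation box, counting from 0.
sub box_right_edge {
    my ($n) = @_;
    return $startX + $n * $increment + $boxRightPad;
}

# Number of boxes that fit completely inside an image of the given
# width (valid pixel columns are 0 .. width-1).
sub boxes_that_fit {
    my ($imageWidth) = @_;
    my $room = $imageWidth - 1 - $startX - $boxRightPad;
    return 0 if $room < 0;
    return 1 + int( $room / $increment );
}

my $cities = 7;    # number of entries in Listing 1
printf "%d boxes need an image at least %d pixels wide\n",
    $cities, box_right_edge( $cities - 1 ) + 1;
printf "a 1000-pixel-wide image holds %d boxes\n", boxes_that_fit(1000);
```

If your map is narrower than the required width, one option is to reduce the increment and box size together, or to wrap the boxes onto a second row with a larger top margin.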

Figure 1. worldCompositeMap.pl output example

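To build the image file shown in Figure 1, run the command perl worldCompositeMap.pl cityPixels locationNames cityMap1.png worldCityMap_done.png. Note that Listing 5 splits each line of the data file (the second argument) on ##, so each line must hold the label text, followed by ## and the filename of a PNG annotation image.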


Delineated body shading and plotting with plot-latlong

General strategy

Shading in the various states on a map of the United States is an effective means of providing another channel of information beyond that available by simply plotting points. The following code and process descriptions show a relatively easy way of modifying the map images and the plot-latlong program to allow you to create useful shaded U.S. maps.

Map image modification

For the shading of a state (or any other delineated body) to work with a single call to GD's fill method, the internal area of the region must be contiguous. If you have ever used the "Paint Bucket" tool in Redmond Paint or the GIMP, you'll be familiar with how this works.

For the state maps in particular, the state borders on the map need to be modified to work well with the fill command. The modifications can be done with a combination of the Edge Enhance effect and pixel-level editing of the state borders. Make sure the upper peninsula of Michigan and the halves of Maryland are connected. You can do these modifications yourself or use the provided image and coordinate file for filling the states. The coordinate file "state.coords" is simply a list of latitude/longitude pairs, one per state (usually the state capital), from which the GD fill will originate. To create shadeable maps of specific geographical regions or of the entire world, you'll need to ensure contiguous regions through image editing and develop a coordinates file with a point inside each shadeable region.
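
To see why closed borders matter, here is a small pure-Perl sketch of a flood fill on a grid of characters, standing in for pixels. It is an illustration of the general technique, not GD's implementation; flood_fill and to_grid are hypothetical helpers. With a closed border the fill stays inside one region, and with a one-cell gap it leaks into the neighbor.

```perl
#!/usr/bin/perl
# floodDemo.pl - sketch: why fill regions need closed borders
use strict;
use warnings;

# Flood-fill a grid of characters starting at (x, y), replacing every
# connected cell that matches the starting cell. Returns the number of
# cells changed. Cells holding any other character act as borders.
sub flood_fill {
    my ( $grid, $x, $y, $new ) = @_;
    my $old = $grid->[$y][$x];
    return 0 if $old eq $new;

    my @stack = ( [ $x, $y ] );
    my $count = 0;
    while ( my $cell = pop @stack ) {
        my ( $cx, $cy ) = @$cell;
        next if $cy < 0 || $cy > $#$grid;
        next if $cx < 0 || $cx > $#{ $grid->[$cy] };
        next if $grid->[$cy][$cx] ne $old;

        $grid->[$cy][$cx] = $new;
        $count++;
        push @stack, [ $cx + 1, $cy ], [ $cx - 1, $cy ],
                     [ $cx, $cy + 1 ], [ $cx, $cy - 1 ];
    }
    return $count;
}

# Turn a list of strings into a two-dimensional array of characters.
sub to_grid { return [ map { [ split //, $_ ] } @_ ]; }

my $closed = to_grid( '#######', '#..#..#', '#..#..#', '#######' );
my $leaky  = to_grid( '#######', '#..#..#', '#.....#', '#######' );

for my $case ( [ closed => $closed ], [ leaky => $leaky ] ) {
    my ( $name, $grid ) = @$case;
    my $filled = flood_fill( $grid, 1, 1, 'o' );
    print "$name border: $filled cells filled\n";
    print join( '', @$_ ), "\n" for @$grid;
}
```

The same reasoning explains the advice about Michigan and Maryland: a fill started in one piece of a state cannot reach a piece that is not connected to it, so the pieces have to be joined in the image for a single starting coordinate to shade the whole state.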

Configuration file modification

Copy the modified U.S. state image file (statemap.png) from the Downloads section to the ~/.mapimages directory. Next, you'll need to modify the ~/.mapinfo file to let the plot-latlong know about your new map. Copy the following lines into the ~/.mapinfo file:

Listing 6. ~/.mapinfo configuration
MAP statemap statemap.png 50 -125 24 -66
MAP temporary temporary.png 50 -125 24 -66
PROJECTION statemap ALBER 2812.7  30.8 45.5 21.7 -99.9 929 1561
PROJECTION temporary ALBER 2812.7  30.8 45.5 21.7 -99.9 929 1561

Plotting plot-latlong modification

The temporary.png file will be created as the state compositing process progresses. Note that the statemap.png file is based on the USA200.png map. You may find it useful to use smaller maps for shorter processing times if generating on-demand visualizations -- in a Web page, for example. For added differentiation, the plot-latlong program needs to be modified to draw blue circles instead of red boxes. This will provide another level of information and is easy to accomplish using GD. In plot-latlong, comment out the lines in Listing 7 and insert the lines in Listing 8.

Listing 7. plot-latlong squares
    $map->filledRectangle($left_x, $top_y,
        $left_x + $point_size - 1, $top_y + $point_size - 1,
        $red);

Listing 8. plot-latlong circles
    my $blue = $map->colorAllocate(64,64,255);
    $map->filledEllipse( $left_x + ($point_size/2), $top_y + ($point_size/2),
         $point_size, $point_size,
         $blue);

Shading plot-latlong modification

Copy the plot-latlong file to a new file called fillState_plot-latlong. Open the copy and remove the lines from Listing 8. In the same spot, insert the following lines, which fill the contiguous area instead of drawing ellipses or rectangles:

    my $color = $map->colorAllocate(255, 255 - $value, 8);
    $map->fill( $left_x, $top_y, $color);

You'll also need to modify the line my ($lat, $long) = split /\s+/; to be my ($lat, $long, $value) = split /\s+/;. This one-line change will allow for reading in of the percent value that determines the shading percentage shown above.
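
The color arithmetic in the fill snippet can be tried on its own before editing plot-latlong. The sketch below reproduces the (255, 255 - $value, 8) formula in a hypothetical shade_for helper and prints the resulting RGB triples. The clamping to 0-100 is an addition of this sketch; the snippet above does not do it.

```perl
#!/usr/bin/perl
# shadeDemo.pl - sketch: the RGB triple produced for a given data value
use strict;
use warnings;

# Map a 0-100 value to the (R, G, B) triple used by the fill snippet:
# colorAllocate(255, 255 - $value, 8). Out-of-range input is clamped
# here, which the original snippet does not do.
sub shade_for {
    my ($value) = @_;
    $value = 0   if $value < 0;
    $value = 100 if $value > 100;
    return ( 255, 255 - int($value), 8 );
}

printf "%3d -> rgb(%d, %d, %d)\n", $_, shade_for($_) for ( 0, 50, 100 );
```

Only the green channel varies, and only over 100 of its 256 levels, so neighboring values produce similar shades. If you need more contrast, one option is to scale the value before subtracting it, as long as the result stays within 0-255.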

Example percentage data file

To use the fillState_plot-latlong code effectively, a suitable data file must be constructed. One simple way of accomplishing this is to rescale the data to be represented so that it all fits in the range 0-100. With the fill color computed as (255, 255 - $value, 8), a value of 0 produces bright yellow (255, 255, 8), while a value of 100 produces a more saturated orange (255, 155, 8). This approach provides relatively intuitive informational content about the quantities in each state, as well as providing reasonable bounds to the informational shading. Listing 9 shows data associated with each state in 0-100 percentages.

Listing 9. Example usStatesPercs file
32.22434 -86.20379 75 #../../data/coords/AL.MONTGOMERY_
44.33064 -69.72971 98 #../../data/coords/ME.AUGUSTA_
33.51623 -112.02711 7 #../../data/coords/AZ.PHOENIX_
34.72240 -92.35407 11 #../../data/coords/AR.LITTLE_ROCK_
...

Use the provided state.coords file from the Download section or create your own list of coordinates, one inside every state boundary. You can build a sample percentages file like the one in Listing 9 with the following command:

cat state.coords | \
perl -lane '$r=int(rand(99));print "$F[0] $F[1] $r $F[2]"' >usStatesPercs
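
The command above fills the file with random values. For real data, the quantities first have to be rescaled onto 0-100. The following is a minimal sketch of a linear (min-max) rescale; the rescale_to_percent helper and the raw counts are hypothetical, and other scalings, such as logarithmic, may suit skewed data better.

```perl
#!/usr/bin/perl
# rescale.pl - sketch: linearly rescale raw quantities onto 0-100
use strict;
use warnings;

# Linearly rescale a list of numbers so the minimum maps to 0 and the
# maximum maps to 100, rounding to the nearest integer.
sub rescale_to_percent {
    my @values = @_;
    return unless @values;

    my ( $min, $max ) = ( $values[0], $values[0] );
    for my $v (@values) {
        $min = $v if $v < $min;
        $max = $v if $v > $max;
    }
    my $span = $max - $min;

    # If every value is identical there is no spread; map them all to 0.
    return map { $span ? int( 0.5 + 100 * ( $_ - $min ) / $span ) : 0 } @values;
}

# Hypothetical raw counts for three states (not from the article).
my %raw     = ( AL => 1200, ME => 300, AZ => 4800 );
my @states  = sort keys %raw;
my @percent = rescale_to_percent( @raw{@states} );

print "$states[$_] $percent[$_]\n" for 0 .. $#states;
```

The rescaled value would then take the place of $r in the third column of each usStatesPercs line.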

The usStatesPercs file now contains the coordinates of a location to start each fill, along with the shading percentage for the associated state. The next task is to build a state-shaded, annotated U.S. map with cities plotted, using the commands shown in Listing 10. The first command builds the shaded U.S. states map. The second command copies this map to the ~/.mapimages directory to serve as the base map for the next step. As with the world map above, the third command plots large blue ellipses on the state-shaded map, reading the points from a coordinates file (usStatesCoords in Listing 10) and recording their pixel coordinates in the image. Using these pixel coordinates, the final command annotates the image with the usual boxes, images, and lines.

Listing 10. Commands to build state shaded, annotated composite map
cat usStatesPercs | perl fillState_plot-latlong -m statemap -s 7 >usStates1.png

cp usStates1.png ~/.mapimages/temporary.png

cat usStatesCoords | \
  perl plot-latlong -m temporary -s 20 -c >usStates2.png 2>usStatesPixels

perl stateCompositeMap.pl usStatesPixels stateData \
  usStates2.png usStatesFinal.png

Output image created from Listing 10 commands:

Figure 2. stateCompositeMap.pl output example

The stateCompositeMap.pl program can be downloaded from the Downloads section. stateCompositeMap.pl is very similar to the worldCompositeMap.pl program -- its modifications are focused on aligning the various annotation components to the differently sized base image.


Ideas for further modifications

With these techniques and the geocoding modules and mapping tools, you have myriad options for creating region-shaded and city-plotted maps. Consider modifying the plot-latlong data files to read in a variety of factors in addition to data value and location, and you can create many more varieties of maps. Instead of rectangles or ellipses, draw pie charts using GD at specific points on the map. Create connections between the pixel coordinates with GD and fill in the region bounded by your traveling salesperson. Consider alpha-blending radius sweeps around specified cities based on the number of employees in your organization at that point. Use the provided pixel data to create clickable image maps of various regions automatically. Plot your data over time and create a movie out of the images with mencoder. plot-latlong and GD offer almost limitless possibilities for data visualization.
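
As one example of these ideas, the pixel coordinates can be turned into an HTML client-side image map. The sketch below is an illustration under stated assumptions: the image_map and escape_html helpers, the sample coordinates, and the href values are all hypothetical, and the points are assumed to be already parsed into x, y, and label fields.

```perl
#!/usr/bin/perl
# imageMap.pl - sketch: build an HTML image map from pixel coordinates
use strict;
use warnings;

# Escape the characters that are special inside HTML attribute values.
sub escape_html {
    my ($text) = @_;
    my %entity = ( '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' );
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Build a <map> element with one circular hot spot per point.
# Each point is a hash reference with x, y, label, and href keys.
sub image_map {
    my ( $name, $radius, @points ) = @_;
    my @lines = ( sprintf '<map name="%s">', escape_html($name) );
    for my $p (@points) {
        push @lines,
            sprintf '  <area shape="circle" coords="%d,%d,%d" href="%s" alt="%s">',
            $p->{x}, $p->{y}, $radius,
            escape_html( $p->{href} ), escape_html( $p->{label} );
    }
    push @lines, '</map>';
    return join "\n", @lines;
}

# Hypothetical points; real x/y values come from plot-latlong -c.
my @points = (
    { x => 212, y => 148, label => 'cary, nc',        href => 'cary.html' },
    { x => 498, y => 96,  label => 'london, england', href => 'london.html' },
);

print image_map( 'cities', 10, @points ), "\n";
```

If the map has been composited with a top margin, as in worldCompositeMap.pl, the margin has to be added to each y value so the hot spots line up with the final image.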


Download

Description                  Name                           Size
GD plots for this article    os-perlgdPlots_20070223.zip    19KB

Resources

Learn

  • CPAN hosts the GD Image Processing Perl module.
  • CAIDA built and hosts the plot-latlong program amongst other great tools.
