Cultured Perl: Practical Twitter with Perl

Learn to use the CPAN Net::Twitter module to access the Twitter API from Perl

Learn how to access the features of the Twitter API using the CPAN Net::Twitter module. You'll also see some solid business uses for Twitter, including automated posting and analyzing Twitter search results.

Teodor Zlatanov, Programmer, Gold Software Systems

Teodor Zlatanov emerged with an M.S. in computer engineering from Boston University in 1999. He has worked as a programmer since 1992, using Perl, Java, C, and C++. His interests are in open source work on text parsing, database architectures, user interfaces, and UNIX system administration.



08 December 2009


You probably know that Twitter is an incredibly-cool-but-not-obvious-how-to-profit-from messaging service, that it limits messages to 140 characters, and that it is to technology as diamonds are to coal. Twitter's appeal is that the messages are short and people may actually pay attention to them. Two general suggestions at the outset:

  • Don't post 139-character messages every time.
  • Don't send 20 messages a day.

That's annoying, and annoyance is a sure way to get dropped by your followers.

Twitter can be useful in a business setting. Like any messaging service (e-mail, anyone?) it can be used for many purposes. Business use of Twitter, however, often concentrates on the "hello it's me" kind of messaging rather than the "useful information" kind. See, for example, Chris Brogan's list of 50 ideas for Twitter use in business (see Resources); you might also have heard about examples set by JetBlue and Dell, among others.

This article explores automated Twitter posts and searches that make sense for a business, rather than a live human posting on a business's behalf. The latter is not all that interesting technically (although it would be fascinating to hook up an AI that pretends to be human and responds to Twitter posts).

This article is intended for beginning Perl programmers, but you should already know how to install a CPAN module (Net::Twitter, specifically). See the Resources section if you need help with CPAN.

Finally, I do not use "tweets" to refer to Twitter posts except where forced to do so by the API language. Sorry if you like the word "tweets"; I just think it will make this article clearer to avoid it.

Enough preamble; let's start building an application.

What's the goal?

What kind of information will we broadcast? Let's imagine a company that uses Perl and doesn't just want to use Twitter because they heard about it on the news; they weighed other options, got serious about Twitter, and have carefully considered what to post.

The company name is The Cultured Perl, Ltd. and they've created a new Twitter account under http://twitter.com/cultured_perl.

Imagine the excitement. Legal has signed off on it and Marketing is drooling. You, as lead developer, want to get things done, so you decide to try just two things on Twitter to see how well it works without yet worrying about all the Great Questions.

First, you'll search Twitter for references to "perl" and see if people are talking about anything interesting regarding Perl that might give the company ideas for products.

Second, you'll set up automatic posting for hiring announcements (you briefly consider posting firing announcements too, but decide against it because that's tacky).

Searching Twitter for keywords

Twitter as a service is easy to use. The company that runs it has to worry about all the details—server utilization, network bills, dead hard drives, and so on. You just open a TCP/IP connection, ask a question, and get an answer back. This is a hallmark of Web-based services and nice for users, but it's good to keep in mind that a keyword search may be expensive on the server side if you do it repeatedly (especially if the keywords vary). Try not to hammer Twitter's servers with automated queries more often than absolutely necessary.

The twitter_do.pl script will do a Twitter search and a post (twitter_do.pl is discussed in the next section and is included in the Download section, below). Note that you don't need a password to search, but you do need one to post.

The twitter_do.pl script uses Getopt::Long to get command-line arguments, as summarized in the usage() function. The script includes a help screen to save users the trouble of figuring out the options on their own.

Note also how it uses a --verbose option, how every option has a shortcut, and how the detailed printout of a search result shows undefined strings as (UNDEFINED) instead of just omitting them. These are little things that make users happy in the long run.

Listing 1. twitter_do.pl: Nice usage help
sub usage
{
 return <<EOHIPPUS

 $0 [OPTIONS] SEARCH1 [SEARCH2 ...]

Note that you can search without a valid login.

Options:
 --help or -h                    : this help
 --verbose or -v                 : print more verbose output
 --rpp=100 or -r 100             : 100 results per page
 --maxresults=100 or -n 100      : return at most 100 search matches
 --popularity or -y              : analyze user popularity for search results
 --post=NEWS or -o NEWS          : post on Twitter
 --username=NAME or -u NAME      : specify the Twitter username
 --passwordfile=FILE or -pf FILE : specify a file that contains
                                   the Twitter password
 --password=PASS or -p PASS      : specify the Twitter password
EOHIPPUS
}
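
The help text above implies an option table along the following lines. This is only a sketch of how such options could be declared with Getopt::Long, not the code from twitter_do.pl: the parse_args name and the defaults of 100 for --rpp and --maxresults are assumptions made for illustration.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse an argument list into an options hash plus
# the leftover search terms. The defaults are assumptions, not values
# taken from twitter_do.pl.
sub parse_args
{
 my @args = @_;
 my %opts = (rpp => 100, maxresults => 100);

 GetOptionsFromArray(
  \@args, \%opts,
  'help|h',            # flag
  'verbose|v',         # flag
  'rpp|r=i',           # integer: results per page
  'maxresults|n=i',    # integer: cap on search matches
  'popularity|y',      # flag
  'post|o=s',          # string: text to post
  'username|u=s',
  'passwordfile|pf=s',
  'password|p=s',
 ) or die "Bad options\n";

 # whatever is left over is the list of search terms
 return (\%opts, \@args);
}

my ($opts, $terms) = parse_args(qw(-n 10 -y perl));
printf "maxresults=%d popularity=%d terms=%s\n",
 $opts->{maxresults}, $opts->{popularity}, join(',', @$terms);
```

Each specification lists the long name first and the shortcut after the bar, so every option gets both forms from a single line.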

Twitter search results are returned in a hash reference with a page count and other fields besides the search results themselves. We grab just the results, add them to the full list of results, and then do another search (displaying the next page of results) until either we get enough results (according to the --maxresults option) or get no results. The results are already sorted by date so we keep them that way.

Unfortunately, we can't just ask for the number of results; we have to step through each page, which hits Twitter repeatedly for the same search term. You can tune this with the --rpp option. Similarly, the last page can push the total past --maxresults, and the loop below does not trim the extras:

Listing 2. Searching Twitter by keyword
sub do_search
{
 my $term = shift @_;

 my $page = 1;
 my @results;

 while (scalar @results < $opts{maxresults})
 {
  print "Searching for $term (page $page)\n" if $opts{verbose};
  my $rset = $handle->search({query => $term, page => $page, rpp => $opts{rpp} });

  # break out if the response is malformed, so a failed search
  # can't keep the loop running forever
  last unless ref $rset eq 'HASH' && ref $rset->{results} eq 'ARRAY';

  # break out if no results came back
  last unless @{$rset->{results}};

  push @results, @{$rset->{results}};
  printf "Now we have %d entries\n", scalar @results if $opts{verbose};

  # go to the next page
  $page++;
 }

 print_post($_) foreach @results;
}
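
The paging logic can be tried without touching Twitter at all. The sketch below is an assumption-laden stand-in rather than part of twitter_do.pl: collect_results is a hypothetical name, the code reference replaces the real search call, and the final splice shows one way to cut the list back to the requested maximum.

```perl
use strict;
use warnings;

# Collect up to $max results by calling $fetch_page->($page) for
# page 1, 2, 3, ... The code reference stands in for the real search
# call; it should return an array reference of results.
sub collect_results
{
 my ($fetch_page, $max) = @_;

 my @results;
 my $page = 1;

 while (@results < $max)
 {
  my $batch = $fetch_page->($page);

  # stop on a malformed or empty page
  last unless ref $batch eq 'ARRAY' && @$batch;

  push @results, @$batch;
  $page++;
 }

 # the last page may overshoot the limit, so trim the extras
 splice @results, $max if @results > $max;

 return @results;
}

# A stand-in for Twitter: three pages of four fake results each.
my @fake_pages = ([1 .. 4], [5 .. 8], [9 .. 12]);
my $fetch = sub { $fake_pages[$_[0] - 1] };

my @got = collect_results($fetch, 10);
print scalar(@got), " results: @got\n";
```

Passing the fetcher in as a code reference keeps the loop testable and makes it easy to count how many requests a given --rpp setting would cost.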

Listing 3. Printing search matches
sub print_post
{
 my $t = shift @_;
 
 printf("%s (on %s)\n\t%s\n", $t->{from_user}, $t->{created_at}, $t->{text});

 if ($opts{verbose})
 {
  foreach my $key (sort keys %$t)
  {
   my $v = $t->{$key};
   $v = '(UNDEFINED)' unless defined $v;
   print "...$key=$v\n";
  }
 }
}

There's nothing interesting about the print_post function, so instead take a moment to breathe deeply and relax. Become one with the Universe. Ahhh. Now go drink something strong; it's time to talk about posting.


Posting on Twitter

We'll use a password file command-line option in the script. You can either put the password in a file and specify it with the -pf option or specify the password directly with -p. For security it's better to do the former, but the latter is more convenient.

Listing 4. Posting on Twitter: The password options
if (exists $opts{passwordfile} && !exists $opts{password} )
{
 # use a lexical filehandle and read only the first line of the file
 open my $pf, "<", $opts{passwordfile}
  or die "Couldn't open $opts{passwordfile}: $!";
 $opts{password} = <$pf>;
 close $pf;
 die "No password found in $opts{passwordfile}"
  unless defined $opts{password};
 chomp $opts{password};
}

# require a password together with a post, OR at least one search term
die usage() 
 unless (defined $opts{password} && exists $opts{post}) || scalar @ARGV;

Annoyingly, passwords with spaces don't work. Stick to characters that are safe in URLs (letters, numbers, dash, underscore: [A-Za-z0-9_-]).
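
A quick check against that character class can catch a problem password before any network call is made. This is a sketch only; password_is_url_safe is a hypothetical helper, not something twitter_do.pl provides.

```perl
use strict;
use warnings;

# True if the password uses only letters, digits, dash, and underscore.
sub password_is_url_safe
{
 my $password = shift;
 return defined $password && $password =~ /\A[A-Za-z0-9_-]+\z/ ? 1 : 0;
}

foreach my $candidate ('s3cret_pass-word', 'two words', 'caf&bar')
{
 printf "%-18s %s\n", "[$candidate]",
  password_is_url_safe($candidate) ? 'ok' : 'will likely fail';
}
```

The \A and \z anchors matter here: without them, a password that merely contains one safe character would pass.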

After posting, we check whether the status that comes back has the same text as what we intended to post. Easy, right?

Listing 5. Posting on Twitter: The text check
sub do_post
{
 my $post = shift @_;

 $post = '(UNDEFINED)' unless defined $post;
 my $ret = $handle->update({status => $post});

 # there is nothing to compare against if the update failed
 unless (defined $ret)
 {
  warn "Could not post the update";
  return;
 }

 if ($ret->{text} eq $post)
 {
  print "Successfully posted [$post].\n";
 }
 else
 {
  warn "Posted string [$ret->{text}] is different from given [$post]";
 }
}

Linking the posting command to the company jobs database is easy: do a quick SQL query, something like SELECT desc,salary FROM jobs WHERE created > yesterday AND salary NOT NULL AND desc LIKE '%perl%' (excuse the pseudo-SQL), and the results can be fed right into twitter_do.pl.
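
Turning query rows into post text might look like the following sketch. It assumes the rows have already been fetched as hash references with desc and salary keys (matching the pseudo-SQL); the job_to_post name and the wording of the message are made up for illustration. The one firm constraint is Twitter's 140-character limit.

```perl
use strict;
use warnings;

my $MAX_LENGTH = 140;   # Twitter's message limit

# Turn one job row into post text, shortening the description if the
# whole message would not fit. The row layout (desc, salary) follows
# the pseudo-SQL; the wording of the post is an assumption.
sub job_to_post
{
 my $job = shift;

 my $prefix = 'Hiring: ';
 my $suffix = sprintf ' (salary: %s)', $job->{salary};
 my $room   = $MAX_LENGTH - length($prefix) - length($suffix);
 my $desc   = $job->{desc};

 # leave room for the "..." marker when truncating
 $desc = substr($desc, 0, $room - 3) . '...' if length($desc) > $room;

 return $prefix . $desc . $suffix;
}

my @jobs = (
 { desc => 'Senior Perl developer', salary => '90000' },
 { desc => 'Perl ' x 40,            salary => '75000' },
);

foreach my $job (@jobs)
{
 my $post = job_to_post($job);
 printf "%3d chars: %s\n", length($post), $post;
}
```

Each resulting string could then be handed to the script's --post option.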


Analyzing Twitter search results

The Cultured Perl, Ltd. has great success with its new Twitter-centric strategy. They hire three people and have 1,800 followers within days—job well done!

And yet ... you feel this nagging desire to do more. You decide to analyze the search results to gauge the popularity of Perl. The criteria will be to find the first 1,000 search matches, find the users that posted them, and measure these users' popularity. Popularity will be half the number of friends plus twice the number of followers (though the formula can certainly be adjusted depending on the goal). The disproportionate weight is because it's much easier to follow others than to be followed on Twitter.
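
As a quick worked example of that formula (popularity_score is a hypothetical name used only here):

```perl
use strict;
use warnings;

# Popularity as described above: friends count for half,
# followers count double.
sub popularity_score
{
 my ($friends, $followers) = @_;
 return $friends / 2 + 2 * $followers;
}

# 100 friends and 50 followers: 100/2 + 2*50 = 150
print popularity_score(100, 50), "\n";

# 1,000 friends but only 10 followers: 1000/2 + 2*10 = 520
print popularity_score(1000, 10), "\n";
```

Changing the two weights in one place is all it takes to adjust the formula for a different goal.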

To do this, use Net::Twitter::friends_ids() and Net::Twitter::followers_ids(). The hook for measuring popularity will be in the rather boring function print_post(), to make it a bit more interesting. The new version will have a scoped hash (meaning only the function can access the hash directly and it persists across calls, so it makes a good cache). This cache stores each user's popularity so those expensive methods don't have to be called repeatedly.
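
The scoped-hash idea is easier to see in isolation. In this minimal sketch, expensive_lookup is a hypothetical stand-in for the API calls and simply counts how often it runs:

```perl
use strict;
use warnings;

my $lookups = 0;   # counts how often the "expensive" call runs

# Stand-in for the expensive API calls; hypothetical, for illustration.
sub expensive_lookup
{
 my $user = shift;
 $lookups++;
 return length $user;   # any deterministic value will do here
}

{
 # visible only inside this block, but it lives as long as the program
 my %cache;

 sub cached_lookup
 {
  my $user = shift;
  $cache{$user} = expensive_lookup($user) unless exists $cache{$user};
  return $cache{$user};
 }
}

cached_lookup($_) foreach qw(alice bob alice alice bob);
print "5 calls, $lookups real lookups\n";
```

Because %cache is declared inside the bare block, no other code can modify it by accident, yet it keeps its contents between calls.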

Listing 6 shows the new version of print_post(). The global options also had to be adjusted to allow a --popularity switch and the usage text was adjusted (Listing 1 has that updated text).

Listing 6. Easy analyzing of Twitter popularity by search term
{
 # this hash is scoped to the print_post function only
 my %popularity;			

 sub print_post
 {
  my $t = shift @_;
  
  printf("%s (on %s)\n\t%s\n", $t->{from_user}, $t->{created_at}, $t->{text});
  
  if ($opts{verbose})
  {
   foreach my $key (sort keys %$t)
   {
    my $v = $t->{$key};
    $v = '(UNDEFINED)' unless defined $v;
    print "...$key=$v\n";
   }
  }

  if (exists $opts{popularity})
  {
   my $user = $t->{from_user};
   unless (exists $popularity{$user})
   {
    $popularity{$user} = scalar @{$handle->friends_ids($user)} / 2 +
                         2 * scalar @{$handle->followers_ids($user)};
   }

   print "\n\tPOPULARITY for $user = $popularity{$user}\n";

   my $sum = 0;
   $sum += $_ foreach values %popularity;
   
   printf "\tAVERAGE POPULARITY = %.2f\n", $sum / scalar keys %popularity;
  }
 }
}

The average is calculated and printed each time for simplicity, but it's really not hard to move that out of the loop. Run this with ./twitter_do.pl -n 10 -y perl and step up to 1,000 when you're sure you need to.
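
Moving the average out of the loop could look like this sketch, which computes it once from a hash of per-user scores. The average_popularity name and the sample scores are made up; in the script, the function would need access to the cached popularity values after all posts have been printed.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Average of the values of a hash of per-user scores;
# returns 0 for an empty hash to avoid dividing by zero.
sub average_popularity
{
 my %scores = @_;
 return 0 unless %scores;
 return sum(values %scores) / keys %scores;
}

# Made-up scores standing in for the cached popularity values.
my %scores_by_user = (alice => 150, bob => 520, carol => 20);

printf "AVERAGE POPULARITY = %.2f\n", average_popularity(%scores_by_user);
```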


Conclusion

You've examined posting and searching with Twitter, with a full working example of each, collected in a single script to simplify option handling and other wrapper code.

Finally, you saw how the Twitter connectivity of a single user can be quantified by using their number of friends and followers.

There are many possible ways to use Twitter. I hope this article gave you some practical ideas and suggestions, illustrated by working Perl code. Do check the Net::Twitter documentation and the Twitter API wiki to find out what else you can do with this service.


Download

Description      Name              Size
Sample script    twitter_do.zip    2KB

Resources

Learn

Get products and technologies

Discuss

  • Get involved in the My developerWorks community; with your personal profile and custom home page, you can tailor developerWorks to your interests and interact with other developerWorks users.
