Update Twitter and FriendFeed from the Linux command line

Keep your friends up to date through the magic known as GNU Wget and cURL

Learn how to use GNU Wget and cURL to send status updates to Twitter and FriendFeed without the use of a Twitter desktop application, and follow feeds from both Twitter and FriendFeed right from the Linux® command line. This article was updated on 31 Oct 2008 to correct a coding error in the wget command under "Adding a tweet using GNU Wget and cURL." --Ed.


Marco Kotrotsos, Founder and developer, Incredicorp

Marco Kotrotsos is a developer with 10 years of experience building software systems ranging from enterprise-class applications for top insurance companies to administrative tools for SMBs and Web applications for startups. Marco is the founder of Incredicorp, which focuses on helping startups and small businesses get their product to market. He collaborates with technical experts on leading-edge topics such as semantic Web, AI, CSS3, and semantic search.



31 October 2008 (First published 28 October 2008)


The reason people go for an operating system like Linux is the sum of its parts—its total usefulness. It is stable, affordable, fast, and runs on all kinds of hardware. It is also extremely flexible right out of the box, largely because of its powerful command-line interface (CLI), or shell.

This article puts two such tools—GNU Wget and cURL—in the spotlight. You learn how to use these two tools to send status updates to the social networking site Twitter without the use of a Twitter desktop application, and how to follow feeds from both Twitter and FriendFeed right from the command line.

Need API details? This article does not delve into the specifics of API use. Both Twitter and FriendFeed expose APIs that are easily accessible through a Representational State Transfer (REST)-ful interface.

The history of GNU Wget

GNU Wget is a flexible piece of software that retrieves data (such as HTML pages, MP3 files, and images) from servers. Its non-interactive, robust, and recursive nature makes it extremely versatile, and it's mostly used for crawling Web sites for content or for offline reading of HTML files. (Links in an HTML page are adjusted automatically to support this functionality.)

For example, to retrieve the page found at a particular URL, use this command:

wget http://wikipedia.org/

This command downloads the Wikipedia home page found at that URL to your computer with the file name index.html, because that's the page GNU Wget found. The tool doesn't follow any links found on that page, but it's easy enough to make it do so:

wget -r http://wikipedia.org/

Name that tool

GNU Wget was developed by Hrvoje Nikšić from the program Geturl, which he also developed. Nikšić changed the name of his tool to Wget to differentiate it from an Amiga tool by the name of GetURL, which did the same thing and was written in Amiga REXX.

In this command, the -r switch tells GNU Wget to recursively follow all links found on that page, so the tool will crawl the entire site. You wouldn't want to use this switch for a site like Wikipedia, however, because you could end up downloading their entire database for easy local access, and that could take a very long time depending on available bandwidth. But you get the point.
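If you want to experiment with recursion without mirroring an entire site, GNU Wget's -l switch caps the recursion depth. A minimal sketch, using a placeholder URL:

wget -r -l 1 http://example.com/

Here, -l 1 tells GNU Wget to follow links only one level deep from the starting page, which keeps the download manageable.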


The history of cURL

Client URL (cURL) fills a different niche from GNU Wget: It was designed primarily to feed currency exchange rates into Internet Relay Chat (IRC) environments. cURL is a power tool for performing URL manipulations and for transferring files with URL syntax, which means that you can transfer most types of files over HTTP, HTTPS, FTP, FTPS, and most other protocols.

The cURL application is most used for Web scraping and for automating Web site interactions such as form submissions (using either GET or POST requests). For example, the command:

curl http://wikipedia.org

outputs the result of the request to your terminal window. In essence, cURL does the same in this case as your browser, only your browser renders the result, and cURL just spits out whatever it has found, which in many cases is HTML but can be anything.

Note: To see what cURL is doing on the wire, add the -v switch (for verbose output), which performs the request as before but also prints the HTTP headers that cURL sends and receives along the way.
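For example, this variant prints the request and response headers alongside the body:

curl -v http://wikipedia.org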

With that background out of the way, let's move on to more ambitious tasks.


Adding a tweet using GNU Wget and cURL

Twitter is a social networking and micro-blogging service that allows you to answer the question, "What are you doing?" by sending short text messages (at most 140 characters in length), called tweets, to your friends, or followers. To help you better understand the power of GNU Wget and cURL, let's start by using them to add tweets to the Twitter timeline. There are a couple of ways to add tweets: You could use either the Web site or a client application such as GtkTwitter, Spaz, or twhirl, which is actually an Adobe® AIR application.

You can script your own full-fledged Twitter client, which in turn would make it possible to automate tasks like twittering your current system usage or availability (for example, with a message such as "server@servername is currently experiencing heavy load"). You could also script an automated notification system. The possibilities are endless.

To see how this technology works, from the command line, type:

wget --keep-session-cookies --http-user=youremail --http-password=yourpassw \
    --post-data="status=hello from the linux commandline" \
    http://twitter.com:80/statuses/update.xml

This code might look a bit daunting if you haven't used the command-line interface much. But don't worry: It's actually logical in format. Let's look at the elements of the command:

  • wget runs the GNU Wget application.
  • --keep-session-cookies saves session cookies along with ordinary ones instead of discarding them, which is useful on sites where subsequent pages require the cookies set during login.
  • --http-user represents your user name.
  • --http-password is your password.
  • --post-data is the data sent to Twitter in the body of an HTTP POST request.
  • status= is the form field that carries the status update.

You can perform the same task using cURL. To do so, type:

curl -u youremail:yourpassw -d status="text" http://twitter.com/statuses/update.xml

This command does basically the same thing as the previous wget command but with a slightly different and friendlier syntax. The difference between the two applications in this case is the way they behave by default.

Doing things the way I describe here using GNU Wget forces the download of a file called update.xml to your local machine. This download can be useful, but it's hardly necessary. In contrast, cURL sends the resulting output to standard output (stdout). (If you prefer, GNU Wget's -O - switch writes the response to standard output as well.)
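To make updating less of a mouthful, you could wrap the cURL call in a small shell function, which also opens the door to the automated notifications mentioned earlier. A minimal sketch; the function name tweet and the credential placeholders are just examples:

#!/bin/bash
# Hypothetical wrapper around the cURL call shown above.
tweet() {
    curl -s -u youremail:yourpassw \
        -d "status=$1" \
        http://twitter.com/statuses/update.xml
}

tweet "server@servername is currently experiencing heavy load"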

Finding the Twitter public timeline

Before you can access the Twitter public timeline, you must find it. In other words, you must find the endpoint you'll be using to access the public feed on Twitter. (See Resources later in this article for links to information about the Twitter API.) The most common and easiest-to-use endpoint is the public timeline, which you can access from http://twitter.com/statuses/public_timeline.rss. The documentation for the FriendFeed API resides in a Google Code repository (again, see Resources below for a link).

The FriendFeed API takes simple GET and POST requests. For simplicity, you'll work with the public endpoint, as well, which is available at http://friendfeed.com/api/feed/public?format=xml. You'll work with the XML later.

Accessing the Twitter public timeline

So, now that you have the Twitter public timeline endpoint, how do you access it?

Type the following address in your browser, or better yet, use curl from the command line:

curl http://twitter.com/statuses/public_timeline.rss

Now, you might have noticed from the result and from the way the endpoint is built up that you are looking at RSS-formatted output. By peering into the API documentation, you can see that other formats are available as well. By changing the file name extension to either .xml or .json, you can change the output format.
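For example, requesting the same timeline with a different extension switches the representation:

curl http://twitter.com/statuses/public_timeline.json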

Using the grep command, you can filter the result and retrieve just the parameters you want:

curl http://twitter.com/statuses/public_timeline.xml | grep 'text'

Examine the output: You need whatever is between the <text> tags. However, if you want to get rid of the tags surrounding the tweets, you can use the sed command. (Details on the sed command are beyond the scope of this article, but for more information about this amazing tool, see Resources.)

curl http://twitter.com/statuses/public_timeline.xml | sed -ne '/<text/s/<\/*text>//gp'

Now, to get rid of the progress meter, which adds unnecessary information to the timeline, add the -s switch:

curl -s http://twitter.com/statuses/public_timeline.xml | sed -ne '/<text/s/<\/*text>//gp'
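If you're curious how the sed expression works, here is the same script broken down with explanatory comments (standard sed behavior; nothing here is specific to Twitter):

# -n               suppress sed's automatic printing of every input line
# /<text/          operate only on lines containing the opening <text> tag
# s/<\/*text>//g   delete both <text> and </text> (\/* matches zero or more slashes)
# p                print whatever remains, which is the tweet text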

Finding the FriendFeed public timeline

You've used cURL to get the public timeline for Twitter. Now, you want to do the same thing for FriendFeed. In this case, the FriendFeed API endpoint for the public feed is http://friendfeed.com/api/feed/public?format=xml. However, following the public feed for FriendFeed is like following water drops in a river, so narrow the scope a bit to just your friends' feeds.

Look again at the API documentation. It takes a bit of searching, but you're looking for the home feed, which is http://friendfeed.com/api/feed/home. Of course, this feed requires authentication: you must sign on before feed/home knows who you are. Luckily, cURL makes this process easy with its -u authentication option, which takes credentials in the form:

username:password

But you don't use your user name and password with FriendFeed. Instead, the site requires a nickname and a remote authentication key. To get them, go to the FriendFeed site at http://friendfeed.com/account/api and log in: the page shows your nickname and remote key.

With your nickname and remote key pair, issue the command:

curl -u "nickname:key" http://friendfeed.com/api/feed/home

where nickname:key is your nickname and key.

This command returns your current FriendFeed in JavaScript Object Notation (JSON). To get XML, you must add the format parameter. Because this is a GET request, you can just add it to the end of the URL:

curl -u "nickname:key" http://friendfeed.com/api/feed/home?format=xml

Nice, right?
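One practical refinement: if you'd rather not type the key on the command line every time (where it also lands in your shell history), cURL can read credentials from a ~/.netrc file via its -n switch. A minimal sketch; the entry uses the standard .netrc format with your FriendFeed nickname and remote key:

machine friendfeed.com login nickname password key

curl -n http://friendfeed.com/api/feed/home?format=xml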


Parsing the output

So, from parsing the Twitter feed, you know that you need to pipe the output through sed first to get a legible result. XML is read easily enough, but after examining the result, you conclude that you need to parse everything between the <ff:body> tags. However, there's a snag: The XML doesn't contain any newline or carriage-return characters, so it's just one long string of XML.

How would you parse it, then? You must choose a different output format. The available formats are JSON, XML, RSS, and Atom. For this example, go for RSS, because it's the cleanest and contains the line feeds you need.

Examine the result from the RSS feed. You know that you need whatever is between the <ff:body> tags, so pipe the output through a modified sed command:

curl -s -u "nickname:key" http://friendfeed.com/api/feed/home?format=rss | 
    sed -ne '/<ff:body/s/<\/*ff:body>//gp'

There you have it! All the entries from your FriendFeed.


Putting it together

Running the commands by hand is not really a practical way to follow the feeds. After all, you can do that by pressing the F5 key on the sites themselves. So, to keep things as close to the command line as possible, automate them with a shell script. You could of course use Python, Perl, or any other scripting language available on the platform, but running things from the command line provides a fitting end to the example.

You script the Twitter stream by creating a script aptly named lintweet. Of course, you're free to use whatever name you choose. Listing 1 shows this script.

Listing 1. Lintweet.sh

#!/bin/bash
while :
do
    curl -s http://twitter.com/statuses/public_timeline.xml | sed -ne '/<text/s/<\/*text>//gp'
    sleep 10
done

Next, make this script executable (for example, with chmod +x lintweet). Then, run it using the command:

./lintweet

Every 10 seconds, the window is updated with the latest tweets. Because Twitter's terms of service (TOS) don't limit the rate at which the public feed is hit, you could update every second by setting sleep to 1. But you should always be nice to servers, so leave it set at 10. (There really isn't much you could follow with sleep set to 1 anyway, because the result would be a fast-flowing river of updates.)


Where to go from here

Now you know how to use two tools available on most Linux distributions—cURL and GNU Wget—to retrieve tweets from the Linux command line. You can also follow feeds from Twitter and FriendFeed manually and by using a simple shell script.

You can extend the shell script by filtering for certain keywords, so that it shows only the status updates that contain certain words or phrases (a sketch of this follows below). Or, you can redirect the script's output to a file for easy retrieval of archived Twitter and FriendFeed updates. You can even automate Twitter updates by hooking your script up to a notification system like Growl, if you're running Mac OS X (see Resources). The possibilities are endless.
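As a starting point for the keyword idea, here is a minimal sketch of Listing 1 extended to show only tweets matching a word passed on the command line. The script name and the default keyword are just examples:

#!/bin/bash
# Hypothetical extension of Listing 1: show only tweets that
# contain the keyword given as the first argument (default: linux).
keyword=${1:-linux}
while :
do
    curl -s http://twitter.com/statuses/public_timeline.xml \
        | sed -ne '/<text/s/<\/*text>//gp' \
        | grep -i "$keyword"
    sleep 10
done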

Resources
