Using the Twitter Search API
Create automated tweet searches
Twitter is a Web-based social media site that lets you communicate with followers through short messages known as tweets. Tweets are limited to a maximum of 140 characters, a limitation based on the state of mobile devices at the time Twitter was developed. But it is a welcome constraint, as it prevents unnecessary spam and verbal clutter within a single tweet.
Now that you are familiar with Twitter, it's time to move to the next level by familiarizing yourself with Twitter Search.
As I said, Twitter is an online community filled with tweets, or brief statements that users make to their followers. With that in mind, wouldn't it be great if you could find a bunch of tweets related to a specific subject?
Good news. With Twitter Search, you can. You can search by keywords, topic, author, language, and a variety of other criteria. Head on over to http://search.twitter.com to see this in action. Type a keyword that you want to search for (for example, Java™), and then click Search. Voilà! A series of tweets, newest to oldest, appears on the screen.
But how can you search by topic instead of by keyword? Keep in mind that tweets specific to a particular topic contain the topic name preceded by the hash symbol/pound sign (#). For example, a Star Trek enthusiast might tweet something about the new movie and within that tweet include #startrek to let people know that this particular tweet is about Star Trek.
To search for tweets by topic, simply include the topic name (including the
hash symbol/pound sign) in your keyword search. Following the previous
example, simply go to the Twitter Search page, type
#startrek, then click Search. You will see a
list of tweets specific to Star Trek.
The Twitter Search API
Twitter Search is great for manual searches as a user. But wouldn't it be great if you, as an outstanding software developer, could programmatically search for tweets based on topic or keyword?
More fabulous news: You can.
Like many other great Web applications, Twitter Search provides a REST API so that you can search for tweets in an automated fashion. Before delving too fully into the API, however, it is probably best to cover the REST concept first for those unfamiliar with it.
What is REST?
REST, for purposes of this article, enables developers to access information and resources using a simple HTTP invocation. Think of REST this way: You can obtain domain-specific data simply by pointing a URL to a specific location. You can also think of it as a simplified Web service, but if you say that too loudly around the wrong people, you might find yourself in the middle of a debate.
So, the Twitter Search API is a REST service that enables users to point to a specific URL and retrieve a variety of tweets that meet the criteria specified in the URL. This enables you, as a developer, to accept input within a Web application and dynamically query Twitter based on that input, using a simple URL that encodes the input into a format that the API understands.
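To make the idea of encoding input into a URL concrete, here is a minimal Python sketch. It assumes the search.atom address that appears in the sample output later in this article; the function name build_search_url is hypothetical and is not part of any Twitter library.

```python
from urllib.parse import urlencode

# Assumption: the endpoint address is copied from the "self" link in the
# sample Atom output shown later in this article.
SEARCH_ENDPOINT = "http://search.twitter.com/search.atom"


def build_search_url(user_input: str) -> str:
    """Encode free-form user input as the q parameter of a search URL.

    build_search_url is a hypothetical helper written for illustration.
    """
    # urlencode percent-encodes reserved characters such as "#" and
    # turns spaces into "+", so the input is safe to place in a URL.
    return SEARCH_ENDPOINT + "?" + urlencode({"q": user_input})


if __name__ == "__main__":
    print(build_search_url("java"))
    print(build_search_url("#startrek"))
```

A Web application could call a helper like this with whatever the user typed into a form field, then fetch the resulting URL.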
Getting started: A simple example
Consider the example in Listing 1.
Listing 1. A simple example of a Twitter search
http://search.twitter.com/search.atom?q=java
This query is very easy to parse. The domain is intuitive: search.twitter.com. This is where the Search API resides. After the first slash is the service that you are executing (in this case, the word search). It might seem peculiar to even require the word search here, as the word is already in the domain name, but this is an attempt by the good folks at Twitter to keep things consistent with the basic Twitter API, which uses numerous functions.
Next comes the only request parameter, q, which is short for query. Following that is the value of that parameter: in this case, java.
To summarize, the URL in the code example above tells the Twitter Search API to search for all recent tweets containing the word java (case is insensitive) and return the results in Atom format.
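If you would rather let code do the parsing, the following Python sketch splits a search URL into the same pieces just described. The URL is the one that appears as the "self" link in the sample output in this article; describe_search_url is a hypothetical helper name, and the sketch assumes the path always has the form service.format.

```python
from urllib.parse import parse_qs, urlsplit


def describe_search_url(url: str) -> dict:
    """Break a search URL into domain, service, format, and query value.

    describe_search_url is a hypothetical helper written for illustration.
    """
    parts = urlsplit(url)
    # The path looks like "/search.atom": the service name, then the format.
    service, _, fmt = parts.path.lstrip("/").partition(".")
    # parse_qs returns a list of values per parameter; take the first q.
    params = parse_qs(parts.query)
    return {
        "domain": parts.netloc,
        "service": service,
        "format": fmt,
        "query": params.get("q", [""])[0],
    }


if __name__ == "__main__":
    print(describe_search_url("http://search.twitter.com/search.atom?q=java"))
```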
Parsing the output
Now, point your browser to the address in Listing 1. The actual output returned to your screen will vary depending on which browser you use and which version it is. To keep things consistent when you view the source, right-click the screen, and then click View Source. You should see something similar to Listing 2.
Listing 2. Output from a simple search (partial output)
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns:google="http://base.google.com/ns/1.0" xml:lang="en-US"
      xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1/"
      xmlns="http://www.w3.org/2005/Atom"
      xmlns:twitter="http://api.twitter.com/">
  <id>tag:search.twitter.com,2005:search/java</id>
  <link type="text/html" rel="alternate" href="http://search.twitter.com/search?q=java"/>
  <link type="application/atom+xml" rel="self" href="http://search.twitter.com/search.atom?q=java"/>
  <title>java - Twitter Search</title>
  <link type="application/opensearchdescription+xml" rel="search" href="http://search.twitter.com/opensearch.xml"/>
  <link type="application/atom+xml" rel="refresh" href="http://search.twitter.com/search.atom?q=java&amp;since_id=1990561514"/>
  <twitter:warning>since_id removed for pagination.</twitter:warning>
  <updated>2009-06-01T12:11:26Z</updated>
  <openSearch:itemsPerPage>15</openSearch:itemsPerPage>
  <link type="application/atom+xml" rel="next" href="http://search.twitter.com/search.atom?max_id=1990561514&amp;page=2&amp;q=java"/>
  <entry>
    <id>tag:search.twitter.com,2005:1990561514</id>
    <published>2009-06-01T12:11:26Z</published>
    <link type="text/html" rel="alternate" href="http://twitter.com/GailR/statuses/1990561514"/>
    <title>D/L latest upgrade for Google's Chrome Browser &amp; like it. Faster, esp w Java</title>
    <content type="html">D/L latest upgrade for Google's Chrome Browser &amp; like it. Faster, esp w <b>Java</b></content>
    <updated>2009-06-01T12:11:26Z</updated>
    <twitter:source><a href="http://twitter.com/">web</a></twitter:source>
    <twitter:lang>en</twitter:lang>
    <author>
      <name>GailR (Gail R)</name>
      <uri>http://twitter.com/GailR</uri>
    </author>
  </entry>
  ...
</feed>
Note: Your output will look totally different in content but identical in structure. This is because I ran my search at a completely different time than you are running yours, so the recent tweets for me will be different from the recent tweets for you. Recall that the default search sorts tweets from newest to oldest.
Here's a breakdown of the code:
- Note that the root element is feed. This is standard according to the Atom specification (see Related topics for links to more information about Atom). The namespace that Twitter uses is http://www.w3.org/2005/Atom, as specified as an attribute in the root element.
- The title element provides a synopsis of the query, which is useful if you are simply parsing the output but might not have been the one who created the query.
- The link elements provide the URL for the query itself. You can plug those into your browser and get the same results.
- Each entry stanza represents a tweet. Although for the sake of brevity only one is listed, in reality there will be many of these in your output. Notice that title and content carry the same text. Atom is designed for article-type documents, which usually have a headline and then a main body. Tweets have no titles, so it makes sense that the title is the actual tweet itself.
- The id element is required by Atom and is a globally unique identifier (GUID) for this particular tweet. All tweets across the universe of Twitter have unique IDs so they can be referenced individually.
- The published and updated dates and times are also identical. This makes sense, because the tweet was never updated.
- The first link element in the entry provides a link to this single tweet. Paste http://twitter.com/GailR/statuses/1990561514 into a browser, and you'll see that same tweet.
- The source element (specified as twitter:source because it is in the twitter namespace) identifies where the tweet was posted from (web, in this example).
- The lang element (specified as twitter:lang because it is in the twitter namespace) gives the language of the tweet (en, in this example).
- The author stanza provides information about the Twitter user.
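The breakdown above maps naturally onto a small parser. The following Python sketch uses the standard library's xml.etree.ElementTree to pull the fields just discussed out of an Atom feed. SAMPLE_FEED is a trimmed copy of the output in Listing 2, and parse_tweets is a hypothetical helper name; treat this as one possible approach rather than the definitive way to consume the feed.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used in the lookups; the URIs come from Listing 2.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "twitter": "http://api.twitter.com/",
}

# A trimmed copy of the sample output, kept small for illustration.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:twitter="http://api.twitter.com/">
  <title>java - Twitter Search</title>
  <entry>
    <id>tag:search.twitter.com,2005:1990561514</id>
    <published>2009-06-01T12:11:26Z</published>
    <link type="text/html" rel="alternate"
          href="http://twitter.com/GailR/statuses/1990561514"/>
    <title>D/L latest upgrade for Google's Chrome Browser &amp; like it. Faster, esp w Java</title>
    <updated>2009-06-01T12:11:26Z</updated>
    <twitter:lang>en</twitter:lang>
    <author>
      <name>GailR (Gail R)</name>
      <uri>http://twitter.com/GailR</uri>
    </author>
  </entry>
</feed>"""


def parse_tweets(feed_xml: str) -> list:
    """Return one dictionary per entry element in an Atom search feed."""
    root = ET.fromstring(feed_xml)
    tweets = []
    for entry in root.findall("atom:entry", NS):
        # The rel="alternate" link points at the individual tweet.
        link = entry.find("atom:link[@rel='alternate']", NS)
        tweets.append({
            "id": entry.findtext("atom:id", namespaces=NS),
            "text": entry.findtext("atom:title", namespaces=NS),
            "published": entry.findtext("atom:published", namespaces=NS),
            "lang": entry.findtext("twitter:lang", namespaces=NS),
            "author": entry.findtext("atom:author/atom:name", namespaces=NS),
            "url": link.get("href") if link is not None else None,
        })
    return tweets


if __name__ == "__main__":
    for tweet in parse_tweets(SAMPLE_FEED):
        print(tweet["author"], "-", tweet["text"])
```

Because every element lives in a namespace, each lookup needs a prefix; a bare entry or title would match nothing.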
Crafting more complex searches
The example provided so far is fairly rudimentary: It is a search for one word only. However, the Twitter Search API provides a powerful set of criteria parameters and supports complex queries.
Suppose, for example, you want to search for tweets directed to a specific user. In tweet language, users direct their tweets to a specific user by prepending an at sign (@) to the user's screen name (for example, @johnqpublic). For searching purposes, you can drop the @ sign and simply search for tweets directed at a specific user. See Listing 3.
Listing 3. Searching for tweets directed to a specific user
Note the %3A in the middle of the URL, just in front of the user's name: that is the URL encoding for a colon (:). It follows the to prefix, so you can read the query as to: followed by the user's screen name.
If you want to search for tweets from a particular user instead of to a particular user, simply substitute the word from for the word to in Listing 3.
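The to and from prefixes, and the %3A encoding of the colon, can be sketched in a few lines of Python. The helper name user_query is hypothetical, and johnqpublic is the placeholder screen name used earlier in this article.

```python
from urllib.parse import quote


def user_query(prefix: str, screen_name: str) -> str:
    """Build the encoded q value for a to: or from: search.

    user_query is a hypothetical helper written for illustration.
    """
    if prefix not in ("to", "from"):
        raise ValueError("prefix must be 'to' or 'from'")
    # Drop a leading @ sign, then percent-encode everything,
    # including the colon, which becomes %3A.
    return quote(prefix + ":" + screen_name.lstrip("@"), safe="")


if __name__ == "__main__":
    print(user_query("to", "@johnqpublic"))
    print(user_query("from", "johnqpublic"))
```

Passing safe="" matters here: without it, quote leaves the slash character unencoded by default, which is rarely what you want inside a query value.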
If you want to search for a specific topic, you simply need to encode the hash symbol/pound sign for the URL. That code is %23, so an API query for #startrek would look like Listing 4.
Listing 4. Searching for tweets by topic
Note that, as with so many other search engines, you can use OR within your Twitter searches. This is accomplished by placing +OR+ between the query values. For an example, see Listing 5, which returns all recent tweets containing either #startrek or #americanidol.
Listing 5. Searching for tweets containing either "#startrek" or "#americanidol"
You also have the option of specifying the lang parameter so that your query only returns results in a specific language. The value of the lang parameter must match one of the language codes included in the ISO 639-1 specification. See an example in Listing 6.
Listing 6. Searching for Star Trek tweets in English
You can also restrict the search results based on date. Use the parameters since and until to return tweets no older than a certain date or no later than a certain date, respectively. An example is found in Listing 7.
Listing 7. Searching for Star Trek tweets since 1 May 2009
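As a rough sketch of how the query forms in this section could be assembled programmatically, the following Python helper combines OR terms with the lang, since, and until parameters. The endpoint address is taken from the sample output earlier in this article, build_filtered_search_url is a hypothetical name, and the YYYY-MM-DD date format is an assumption that you should check against the Search API documentation.

```python
from datetime import date
from urllib.parse import urlencode

# Assumption: endpoint copied from the "self" link in the sample Atom output.
SEARCH_ENDPOINT = "http://search.twitter.com/search.atom"


def build_filtered_search_url(terms, lang=None, since=None, until=None):
    """Combine OR-ed terms with optional language and date filters.

    build_filtered_search_url is a hypothetical helper for illustration.
    """
    # Joining with " OR " and letting urlencode turn spaces into "+"
    # yields the +OR+ separator, and "#" becomes %23.
    params = [("q", " OR ".join(terms))]
    if lang is not None:
        params.append(("lang", lang))  # an ISO 639-1 code such as "en"
    if since is not None:
        # Assumption: dates are sent as YYYY-MM-DD.
        params.append(("since", since.isoformat()))
    if until is not None:
        params.append(("until", until.isoformat()))
    return SEARCH_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_filtered_search_url(["#startrek", "#americanidol"]))
    print(build_filtered_search_url(["#startrek"], lang="en"))
    print(build_filtered_search_url(["#startrek"], since=date(2009, 5, 1)))
```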
Twitter is a social networking phenomenon that facilitates microblogging among interested parties. It has skyrocketed in popularity over the past year as everyone from postal workers to celebrities finds themselves tweeting on a regular basis.
In compliance with unwritten rules of the information superhighway, Twitter also provides a Search utility so that people can search for tweets based on a specific set of criteria. The search utility enables searches to be performed either manually using a Web page (the same way many people use Google, for example) or through a REST invocation.
Using the Twitter Search API enables Web application developers to automate Twitter searches. Developers can use it to display up-to-the-minute, content-specific tweets within their own (or their clients') Web applications. It is an outstanding utility for those interested in gleaning even more domain-specific information from the Internet.
Related topics
- Search API documentation: Check out the Twitter Search API documentation.
- RESTful Web services: The basics (Alex Rodriguez, developerWorks, November 2008): Read an excellent overview of REST.
- The Atom specification: Check out Wikipedia's overview of the Atom specification.
- Request for Comments (RFC) 4287: Read the complete Atom specification.
- The Twitter site: Explore the Twitter service. Try it and stay connected with friends, family, and co-workers as you exchange short messages about what you do.
- XML technical library: See the developerWorks XML Zone for a wide range of technical articles and tips, tutorials, standards, and IBM Redbooks.
- developerWorks podcasts: Listen to interesting interviews and discussions for software developers.