Using the Technorati API

Create automated blog searches

Brian M. Carey, Information Systems Consultant, Triangle Information Solutions
Brian Carey is an information systems consultant specializing in Java, Java Enterprise, PHP, Ajax, and related technologies. You can follow Brian Carey on Twitter at http://twitter.com/brianmcarey.

Summary:  Technorati is a blog cataloging service that enables users to search virtually the entire blogosphere for articles of interest. Like most entries in the Web 2.0 domain, Technorati provides an API to automate much of its functionality. Also like most entries in the Web 2.0 domain, that API is provided as a REST service. In this article, work with examples and learn to get the most out of the Technorati API.

Date:  08 Sep 2009
Level:  Intermediate

What is Technorati?

Before explaining the buzzword in the title of this article, it's worth explaining another buzzword: blogosphere.

The term blogosphere is used by journalists and computer geeks alike. It refers to a specific subset of Web pages whose owners (hereafter, bloggers) share their ideas, thoughts, passions, and random musings, along with links to other Web pages. The root word, blog, is a contraction of the expression "Web log."

Frequently used acronyms

  • API: Application program interface
  • HTTP: Hypertext Transfer Protocol
  • REST: Representational State Transfer
  • URL: Uniform Resource Locator
  • XML: Extensible Markup Language

Some Web sites enable people who are not tech savvy to host their own, albeit modest, blogs. WordPress, for example, enables bloggers who are not software developers by trade to create reasonably sophisticated blogs through the use of widgets, themes, and templates. The result has been an explosion of bloggers and blogging in general. As of last year, blogherald.com reported close to 200 million blogs worldwide. As of this writing, the blogosphere is a primary source of information about news and events in many countries.

With so much unique information contained in the blogosphere, one is tempted to ask: Where is this information cataloged, tracked, tagged, and made available for search?

Enter Technorati. In its own words: "Technorati collects, organizes, and distributes the global online conversation." You can think of it as a Google specifically for the blogosphere. Or, as Time magazine puts it: "If Google is the Web's reference library, Technorati is becoming its coffee house." (See Resources for links to Technorati Media and the Time article.)

You can visit Technorati at http://technorati.com. You'll see a pretty search bar in shamrock-green at the top that reads Search the blogosphere... inside. Click in that box and type Obama. Then click the magnifying glass next to it. You'll quickly see the featured blog articles that discuss the President of the United States.


The Technorati API

Feel free to search the blogosphere all you want using the Technorati Web page. However, as a Web application developer, you might want to automate that search or enable your Web page visitors to view information retrieved from the blogosphere based on their own search criteria.

To make that happen, use the Technorati API. Like many APIs on the Internet, the Technorati API uses REST.

What is REST?

REST is an acronym for Representational State Transfer. The full explanation of everything entailed in a proper REST definition is outside of the scope of this article; however, it is available elsewhere on IBM developerWorks (see the links provided in Resources). For the subject covered here, it is sufficient to state that REST enables developers to access information and resources using a simple HTTP invocation.

Think of REST this way: To obtain domain-specific data, you simply point a URL to a specific location. For the purposes of this article, that's really all it is. You can also think of it as a simplified Web service, but if you say that too loudly around the wrong people, you might find yourself in the middle of a debate.

In reference to the subject at hand, the Technorati API is a REST service that enables users to point to a specific URL and retrieve a variety of articles from the blogosphere that meet the criteria specified in the URL. This enables you, as a developer, to accept input within a Web application and dynamically query the blogosphere based on that input using a simple URL that encodes the input into a format the API understands.
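In code, a "simple HTTP invocation" amounts to an HTTP GET request against a URL. The following is a minimal Python sketch of that idea, not part of any Technorati library: the function name rest_get, the opener parameter, the example.com URL, and the fake_opener stand-in are all hypothetical names introduced here for illustration, and decoding the body as UTF-8 is an assumption. The stand-in lets the sketch run without network access; passing no opener would use the standard library's urlopen instead.

```python
import io
from urllib.request import urlopen


def rest_get(url, opener=urlopen, timeout=10):
    """Issue an HTTP GET for `url` and return the response body as text.

    `opener` defaults to urllib's urlopen; it is a parameter only so that
    a stand-in can be supplied when no network is available.
    """
    with opener(url, timeout=timeout) as response:
        # Assumption: the service returns UTF-8 encoded text.
        return response.read().decode("utf-8")


def fake_opener(url, timeout=None):
    """Hypothetical stand-in for the network: returns a canned XML body."""
    return io.BytesIO(b"<tapi version='1.0'><document/></tapi>")


# Placeholder URL for illustration only.
body = rest_get("http://example.com/resource?name=value", opener=fake_opener)
print(body)
```

Keeping the opener injectable is a design choice for this sketch: the same function can be exercised offline and then pointed at a real REST endpoint without modification.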

Getting started: A simple example

Consider the example in Listing 1:


Listing 1. A simple search

http://api.technorati.com/search?key=xxxx&query=Obama

This is a fairly simple URL with only two request parameters.

Note that the actual Technorati API function is the word that follows the final slash (search). This indicates, unsurprisingly, that this REST invocation will perform a search against the blogosphere.

The first parameter is the key. The xxxx string shown here is only a placeholder; the actual key varies from user to user. To obtain your own key, you need to register with Technorati and request one. Fortunately, this is easy and free. Unfortunately, it also means that you cannot simply copy and paste the URLs from this article into a browser and see the results. You have to substitute your own key for the xxxx string.

The second request parameter is the actual query. Just like in the manual example, the search uses the keyword Obama.
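The same URL can be assembled programmatically rather than typed by hand. The following is a minimal Python sketch, not an official client: the helper name build_url and the constant API_BASE are hypothetical names chosen for illustration, and only the endpoint and the key and query parameters come from Listing 1. It relies on the standard library's urlencode so that search terms containing spaces or other special characters are escaped.

```python
from urllib.parse import urlencode

# Base address taken from Listing 1; the constant name is illustrative.
API_BASE = "http://api.technorati.com"


def build_url(function, **params):
    """Return a REST URL for the given API function.

    `function` is the word after the final slash (for example, "search");
    `params` become the request parameters, percent-encoded as needed.
    """
    return f"{API_BASE}/{function}?{urlencode(params)}"


# "xxxx" is a placeholder; substitute your own API key.
url = build_url("search", key="xxxx", query="Obama")
print(url)
```

Because the function name is an argument, the same helper could build URLs for any other function that follows the same pattern of a function name followed by request parameters.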

After you substitute your own key for the xxxx string, you can plug that URL into a Web browser and see what results are returned. How the results are displayed will vary depending on your Web browser brand and version. Whatever appears on the screen, it's best to right-click on the page and select View Source to view the actual XML that is returned.

While the actual contents will also vary based on when your query is executed, the results should resemble Listing 2.


Listing 2. Output from a simple search (partial output)

<?xml version="1.0" encoding="utf-8"?>
<!-- generator="Technorati API version 1.0 /search" -->
<!DOCTYPE tapi PUBLIC "-//Technorati, Inc.//DTD TAPI 0.02//EN" 
	"http://api.technorati.com/dtd/tapi-002.xml">
<tapi version="1.0">
<document>
<result>
    <query>Obama</query>
    <querycount>2270581</querycount>
    <rankingstart></rankingstart>
</result>
<item>
   <weblog>
      <name>Critica Pura</name>
      <url>http://criticapura.com</url>
      <rssurl>http://criticapura.com/feed/</rssurl>
      <atomurl></atomurl>
      <inboundblogs>7</inboundblogs>
      <inboundlinks>10</inboundlinks>
      <lastupdate>2009-06-21 17:13:23 GMT</lastupdate>
   </weblog>
   <title>Jib Jab Obama</title>
   <excerpt>Try JibJab Sendables</excerpt>
   <created>2009-06-21 17:13:23 GMT</created>
   <permalink>http://criticapura.com/2009/06/jib-jab-obama/</permalink>
</item>
...

Interestingly enough, the first query result as of this writing is a foreign language blog entry (at least, foreign to those who speak English).

The result element provides metadata about the query results. The query child provides the actual query keyword. The querycount child provides the number of articles from the blogosphere that matched the query.

Many item elements follow the result element. Each item element corresponds to a blog article that matched the search criteria.

The weblog element provides information about the blog itself. This is information about the entire blog as opposed to just the article that matched the criteria. Table 1 describes the weblog child elements.


Table 1. weblog child elements
Element       Description
name          Actual name of the blog itself
url           URL of the blog
rssurl        URL of the Really Simple Syndication (RSS) feed for that blog
atomurl       URL of the Atom feed for that blog
inboundblogs  Number of blogs that link to that blog
inboundlinks  Number of external sites that link back to that blog
lastupdate    Date and time the blog was last updated

The elements described in Table 2 are children of item as opposed to weblog. These children refer to the article itself.


Table 2. item child elements
Element    Description
title      Actual title of the blog article
excerpt    Synopsis of the blog article
created    Date and time the article was written
permalink  URL for the blog article
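To show how these elements might be read programmatically, here is a small Python sketch that uses the standard library's ElementTree parser. It is an illustration under stated assumptions rather than a definitive implementation: the function name parse_search and the dictionary keys are hypothetical, and the sample XML is a trimmed copy of Listing 2 with the XML declaration and DOCTYPE left out for brevity.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of Listing 2 (XML declaration and DOCTYPE omitted).
SAMPLE = """
<tapi version="1.0">
<document>
<result>
    <query>Obama</query>
    <querycount>2270581</querycount>
    <rankingstart></rankingstart>
</result>
<item>
   <weblog>
      <name>Critica Pura</name>
      <url>http://criticapura.com</url>
      <rssurl>http://criticapura.com/feed/</rssurl>
      <atomurl></atomurl>
      <inboundblogs>7</inboundblogs>
      <inboundlinks>10</inboundlinks>
      <lastupdate>2009-06-21 17:13:23 GMT</lastupdate>
   </weblog>
   <title>Jib Jab Obama</title>
   <excerpt>Try JibJab Sendables</excerpt>
   <created>2009-06-21 17:13:23 GMT</created>
   <permalink>http://criticapura.com/2009/06/jib-jab-obama/</permalink>
</item>
</document>
</tapi>
"""


def parse_search(xml_text):
    """Return (match_count, articles) from a search response."""
    document = ET.fromstring(xml_text.strip()).find("document")
    # querycount is a child of result (the query metadata).
    count = int(document.findtext("result/querycount"))
    articles = []
    for item in document.findall("item"):
        articles.append({
            "blog": item.findtext("weblog/name"),   # from the weblog element
            "title": item.findtext("title"),        # children of item itself
            "created": item.findtext("created"),
            "permalink": item.findtext("permalink"),
        })
    return count, articles


count, articles = parse_search(SAMPLE)
print(count, articles[0]["title"], articles[0]["permalink"])
```

The paths passed to findtext mirror the split described in Tables 1 and 2: values under weblog describe the blog as a whole, while direct children of item describe the individual article.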

Basic Technorati API functions

Although the Technorati API provides a powerful search function, it also provides other functions that you might find useful.

The cosmos function is not at all intuitively named. It allows you to search for blogs linking to a base URL. Suppose, for example, you want to find all blogs that link back to the blog at the following URL: http://nicole-rensmann.bookola.de/blog. For that, you would use the following REST invocation: http://api.technorati.com/cosmos?key=xxxx&url=http://nicole-rensmann.bookola.de/blog. If you plug that into your browser (substituting your own key as usual), you should get something similar to Listing 3.


Listing 3. Output from a cosmos function (abbreviated)

<?xml version="1.0" encoding="utf-8"?>
<!-- generator="Technorati API version 1.0" -->
<!DOCTYPE tapi PUBLIC "-//Technorati, Inc.//DTD TAPI 0.02//EN" 
  "http://api.technorati.com/dtd/tapi-002.xml">
<tapi version="1.0">
<document>
  <result>
    <url>http://nicole-rensmann.bookola.de/blog</url>
    <weblog>
        <name>Nicole Rensmanns kleine Welt</name>
        <url>http://nicole-rensmann.bookola.de/blog</url>
        <rssurl>http://nicole-rensmann.bookola.de/blog/?feed=rss2</rssurl> 
        <atomurl>http://nicole-rensmann.bookola.de/blog/?feed=atom</atomurl> 
        <inboundblogs>6</inboundblogs>
        <inboundlinks>6</inboundlinks>
        <lastupdate>2009-06-21 17:10:52 GMT</lastupdate>
        <rank>575630</rank>
    </weblog>
    <inboundlinks>7</inboundlinks>
    <rankingstart>1</rankingstart>
  </result>
  <item>
        <weblog>
            <name>Das Datenschutz-Blog</name>
            <url>http://www.datenschutzbeauftragter-online.de</url>
            <rssurl>http://www.datenschutzbeauftragter-online.de/feed/</rssurl>
            <atomurl>http://www.datenschutzbeauftragter-online.de/feed/atom/</atomurl>
            <inboundblogs>83</inboundblogs>
            <inboundlinks>343</inboundlinks>
            <lastupdate>2009-06-20 07:22:20 GMT</lastupdate>
        </weblog>
        <nearestpermalink>http://www.datenschutzbeauftragter-online.de</nearestpermalink>
        <title>Uberblick zum Thema Netzsperren</title>
        <excerpt>der Ursula von der Leyen Sachliche Debatte uber das Thema</excerpt>
        <linkcreated>2009-05-11 04:20:01 GMT</linkcreated>
        <linkurl>http://nicole-rensmann.bookola.de/blog/?p=3293</linkurl>
  </item>
...

The output in XML format looks strikingly similar to what you saw in Listing 2, with some notable exceptions. The weblog element inside result describes the blog you queried for, that is, the target of the inbound links. Note that the url child element directly corresponds to the url request parameter.

Once again, you see several item elements. Each of these item elements contains information about a blog that links back to the blog you queried for.
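The cosmos request and its response can be handled in code along the following lines. This is a hedged Python sketch: the function name parse_cosmos and the dictionary keys are hypothetical, and the sample XML is a trimmed copy of Listing 3. It percent-encodes the nested URL with urlencode, which is the conventional way to pass a URL as a request parameter; that the service accepts the encoded form is an assumption, since the article shows the parameter unencoded.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

blog = "http://nicole-rensmann.bookola.de/blog"
# urlencode escapes the colon and slashes inside the url parameter.
# "xxxx" is a placeholder for your own API key.
cosmos_url = "http://api.technorati.com/cosmos?" + urlencode(
    {"key": "xxxx", "url": blog}
)

# Trimmed copy of Listing 3 (declaration, DOCTYPE, and some children omitted).
SAMPLE = """
<tapi version="1.0">
<document>
  <result>
    <url>http://nicole-rensmann.bookola.de/blog</url>
    <inboundlinks>7</inboundlinks>
    <rankingstart>1</rankingstart>
  </result>
  <item>
    <weblog>
      <name>Das Datenschutz-Blog</name>
      <url>http://www.datenschutzbeauftragter-online.de</url>
      <inboundblogs>83</inboundblogs>
      <inboundlinks>343</inboundlinks>
    </weblog>
    <title>Uberblick zum Thema Netzsperren</title>
    <linkcreated>2009-05-11 04:20:01 GMT</linkcreated>
    <linkurl>http://nicole-rensmann.bookola.de/blog/?p=3293</linkurl>
  </item>
</document>
</tapi>
"""


def parse_cosmos(xml_text):
    """Return (queried_url, links), one dict per linking blog article."""
    document = ET.fromstring(xml_text.strip()).find("document")
    target = document.findtext("result/url")
    links = [
        {
            "from_blog": item.findtext("weblog/name"),
            "from_url": item.findtext("weblog/url"),
            "to_url": item.findtext("linkurl"),
            "created": item.findtext("linkcreated"),
        }
        for item in document.findall("item")
    ]
    return target, links


target, links = parse_cosmos(SAMPLE)
print(cosmos_url)
print(target, links[0]["from_blog"], links[0]["to_url"])
```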

The tag function allows you to search for blog articles with a particular tag. Technorati uses tags to categorize blog articles. Blog article authors are allowed to place one or more tags associating their articles with particular subject matter.

To search the blogosphere for articles about fishing, you use the following URL: http://api.technorati.com/tag?key=xxxx&tag=fishing. Again, you need to use your own API key in lieu of the xxxx in the URL. If you plug that into your browser, you should see something similar to Listing 4.


Listing 4. Output from a tag function (abbreviated)

<?xml version="1.0" encoding="utf-8"?>
<!-- generator="Technorati API version 1.0" -->
<!DOCTYPE tapi PUBLIC "-//Technorati, Inc.//DTD TAPI 0.02//EN" 
 "http://api.technorati.com/dtd/tapi-002.xml">
<tapi version="1.0">
<document>
 <result>
  <query>fishing</query>
  <postsmatched>43655</postsmatched>
  <blogsmatched></blogsmatched>
  <start>1</start>
  <limit>20</limit>
  <querytime>3.126</querytime>
 </result>
 <item>
  <weblog>
   <name>Travel and Leisure Articles</name>
   <url>http://www.toptravelarticles.com</url>
   <rssurl>http://www.toptravelarticles.com/feed</rssurl>
   <atomurl>http://www.toptravelarticles.com/feed/atom</atomurl>
   <inboundlinks>40</inboundlinks>
   <inboundblogs>19</inboundblogs>
   <lastupdate>2009-06-21 17:06:01</lastupdate>
   <hasphoto></hasphoto>
  </weblog>
  <title>Visiting Ghana?</title>
  <excerpt>If you want to experience the culture up close</excerpt>
  <created>2009-06-21 17:06:01</created>
  <postupdate>2009-06-21 17:06:01</postupdate>
  <permalink>http://www.toptravelarticles.com/visiting-ghana.html</permalink>
 </item>
...

Again, the output resembles what you saw before with other Technorati API functions. The basic difference is that, in this case, you see blog articles tagged with fishing.
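Two details in Listing 4 are worth handling defensively when parsing: some elements (such as blogsmatched) can be empty, and the timestamps lack the GMT suffix seen in Listing 2. The Python sketch below illustrates one way to cope with both. The helper names int_or_none and parse_timestamp are hypothetical, the sample is a trimmed copy of Listing 4, and the sketch returns naive datetime values without making any claim about the time zone of timestamps that omit the suffix.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Trimmed copy of Listing 4 (declaration, DOCTYPE, and weblog omitted).
SAMPLE = """
<tapi version="1.0">
<document>
 <result>
  <query>fishing</query>
  <postsmatched>43655</postsmatched>
  <blogsmatched></blogsmatched>
  <start>1</start>
  <limit>20</limit>
  <querytime>3.126</querytime>
 </result>
 <item>
  <title>Visiting Ghana?</title>
  <created>2009-06-21 17:06:01</created>
  <permalink>http://www.toptravelarticles.com/visiting-ghana.html</permalink>
 </item>
</document>
</tapi>
"""


def int_or_none(text):
    """Convert element text to int; empty or missing elements give None."""
    text = (text or "").strip()
    return int(text) if text else None


def parse_timestamp(text):
    """Parse 'YYYY-MM-DD HH:MM:SS', with or without a trailing ' GMT'."""
    text = text.strip()
    if text.endswith(" GMT"):
        text = text[:-4]
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


root = ET.fromstring(SAMPLE.strip())
result = root.find("document/result")
summary = {
    "tag": result.findtext("query"),
    "posts": int_or_none(result.findtext("postsmatched")),
    "blogs": int_or_none(result.findtext("blogsmatched")),  # empty in sample
    "start": int_or_none(result.findtext("start")),
    "limit": int_or_none(result.findtext("limit")),
}
created = parse_timestamp(root.findtext("document/item/created"))
print(summary, created.isoformat())
```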

One particularly interesting Technorati API function is toptags. The toptags function displays the most popular tags in use when the function is executed. Plug the following URL (making the usual key substitution) into your browser: http://api.technorati.com/toptags?key=xxxx. You will see something similar to Listing 5.


Listing 5. Output from a toptags function (abbreviated)

<?xml version="1.0" encoding="utf-8"?>
<!-- generator="Technorati API version 1.0 /toptags" -->
<!DOCTYPE tapi PUBLIC "-//Technorati, Inc.//DTD TAPI 0.02//EN" 
 "http://api.technorati.com/dtd/tapi-002.xml">
<tapi version="1.0">
<document>
<result>
<limit>20</limit>
</result>
<item>
<tag>Weblog</tag>
<posts>9578863</posts>
</item>
<item>
<tag>Life</tag>
<posts>7355121</posts>
</item>
<item>
<tag>News</tag>
<posts>4638644</posts>
</item>
...

The output here is easy to parse. Each tag is listed, and the number of blog articles containing that tag is listed in the next element.
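As an illustration of how little code that parsing takes, the following Python sketch turns the toptags output into a list of tag and post-count pairs. The function name parse_toptags is hypothetical, and the sample XML is a trimmed copy of Listing 5. Sorting by post count is a choice made for this sketch; the source does not state whether the service guarantees any particular order.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of Listing 5 (XML declaration and DOCTYPE omitted).
SAMPLE = """
<tapi version="1.0">
<document>
<result>
<limit>20</limit>
</result>
<item>
<tag>Weblog</tag>
<posts>9578863</posts>
</item>
<item>
<tag>Life</tag>
<posts>7355121</posts>
</item>
<item>
<tag>News</tag>
<posts>4638644</posts>
</item>
</document>
</tapi>
"""


def parse_toptags(xml_text):
    """Return a list of (tag, post_count) pairs, most-used tag first."""
    root = ET.fromstring(xml_text.strip())
    pairs = [
        (item.findtext("tag"), int(item.findtext("posts")))
        for item in root.iterfind("document/item")
    ]
    # Sort explicitly rather than relying on the order of the response.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


for tag, posts in parse_toptags(SAMPLE):
    print(f"{tag:<10}{posts:>12,}")
```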

Conclusion

Technorati is a Web site that maintains information about articles published in the blogosphere. Using Technorati, you can query for blog articles based on a specific set of criteria.

In compliance with unwritten rules of the information superhighway, Technorati also provides an API so people can programmatically search for blog articles based on a specific set of criteria. The API operates using a REST invocation.

Using the Technorati REST API enables Web application developers to automate blog searches. Developers can implement it so that their Web application users can search the blogosphere for articles that match their own specific interests.


Resources

Learn

Get products and technologies

Discuss
