Real Web 2.0: Open, geographic information systems at Geonames.org

Discover a free and open site with resources for working with locations and place names

One of the best sources for geographical information for users and developers is a shining example of the power of open data. GeoNames is a database, Web service, and destination site for all things geographical. It has a rich, RESTful API and offers Semantic Web features using Linking Open Data conventions. Learn how to use GeoNames, as a user and as a developer.

Uche Ogbuji, Partner, Zepheira, LLC

Uche Ogbuji is a partner at Zepheira, LLC, a solutions firm specializing in the next generation of Web technologies. Mr. Ogbuji is lead developer of 4Suite, an open source platform for XML, RDF, and knowledge-management applications, and lead developer of the Versa RDF query language. He is a computer engineer and writer born in Nigeria, living and working in Boulder, Colorado, USA. You can find more about Mr. Ogbuji at his Weblog, Copia.



30 September 2008


The most wonderful thing about the Web is how often you run into some resource, and a few weeks later wonder how you ever managed without it. The most wonderful thing about the open data aspect of Web 2.0, the focus of this column, is that sometimes such resources include all the data you need to create your own little magic corners of the Web. GeoNames is one of those sites and services that is not just indispensable in its own right, but is also an important ingredient in other indispensable services. It's a Web site built around a well-designed, freely accessible database of geographical information. The home page describes it best:

The GeoNames geographical database is available for download free of charge under a creative commons attribution license. It contains over eight million geographical names and consists of 6.5 million unique features whereof 2.2 million populated places and 1.8 million alternate names. All features are categorized into one out of nine feature classes and further subcategorized into one out of 645 feature codes.[...] The data is accessible free of charge through a number of webservices and a daily database export. GeoNames is already serving up to over 11 million web service requests per day.[...] GeoNames is integrating geographical data such as names of places in various languages, elevation, population and others from various sources.

In this article I'll show you the main features of GeoNames for users as well as developers.

The GeoNames façade

In this column I refer to the main user page of a site as its façade, and the interface for open data as the foundation. The GeoNames façade is a useful tool for pinpointing a location. Suppose you're reading a novel and come across a small town in Colorado named "Superior". You can go to the site and enter "Superior, Colorado" into the main search box. You should end up with the display in Figure 1.

Figure 1. GeoNames search results for "Superior, Colorado"
GeoNames search results

The top result is, as with search engines, often a very good guess at what you're looking for. You can click either the local name "Superior" or the map push-pin icon to its left, and you get a different result for each. The local name goes to a Google Maps-powered information page for the centroid latitude and longitude of that place, in this case http://www.geonames.org/maps/google_39.953_-105.169.html (a nice, clean, hackable URL, which is one of the strengths of GeoNames). The location is pretty much the southeastern corner of Boulder County, and the map includes multiple features: a settlement (the town of Superior itself), geographical features such as hills and lakes, some notable buildings, and other landmarks (even the local post office). Figure 2 is from a snapshot of the page.

Figure 2. Latitude/Longitude area page in GeoNames
Latitude/Longitude area page

Notice that the settlement icon is the same "P" push-pin (teardrop shape) that was associated with the top search result (Figure 1). If you go back to that search results page and click that icon, you get a more specific map zoomed into that particular feature, indicated by the different URL form, http://www.geonames.org/5440838/superior.html. Figure 3 is from a snapshot of this page.

Figure 3. Specific feature page in GeoNames
Specific feature page
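The two URL forms above are regular enough to build by hand. The following Python lines are a minimal sketch that assumes the patterns seen in the two example URLs hold in general; the function names are hypothetical, and rounding to three decimal places simply mirrors the example rather than any documented rule.

```python
def area_map_url(lat: float, lng: float) -> str:
    """Build the area map page URL for a latitude/longitude.

    Assumption: the pattern and the three-decimal precision are copied
    from the single example URL in the article.
    """
    return f"http://www.geonames.org/maps/google_{lat:.3f}_{lng:.3f}.html"


def feature_page_url(geoname_id: int, slug: str) -> str:
    """Build the feature page URL from a GeoNames ID and a lowercase name.

    Assumption: the slug is the place name in lowercase, as in the
    "superior" example.
    """
    return f"http://www.geonames.org/{geoname_id}/{slug}.html"


if __name__ == "__main__":
    print(area_map_url(39.9527634, -105.1685977))
    print(feature_page_url(5440838, "superior"))
```

Given the coordinates and ID of Superior, the two functions reproduce the URLs quoted in the text.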

The foundation

You can see how useful GeoNames is for quickly getting a sense of geographical context from any starting point, but what makes the service special is its open data foundation. You can download a full text dump of the site's database, and you can use it quite freely thanks to its Creative Commons attribution license, which requires you merely to acknowledge and link back to GeoNames if you use the service. You are free to use the service in commercial applications on the Web or elsewhere.

You can also use GeoNames through its comprehensive Web services. They are designed RESTfully, so you can use any code that can request data from a URL. There are numerous types of queries you can perform on GeoNames. The following is just a sample:

  • Find places near a postal code, by country (returning an XML file or JSON feed).
  • Find the postal codes near a given latitude/longitude (returning an XML file).
  • Find the "children" of a given geographical feature (for example, the provinces within a country, or the settlements within a province, returning an XML file or JSON feed).
  • Find geocoded Wikipedia articles near a given latitude/longitude, postal code, or place name (returning an XML file or JSON feed).
  • Find all neighbors of a country (returning an XML file or JSON feed).
  • Find the weather stations and their most recent weather observations within a bounding box given by its north, south, east, and west coordinates (returning an XML file).
  • Get the time zone at a given latitude/longitude (returning an XML file or JSON feed).
  • Get the elevation in meters for a latitude/longitude representing a land area (returning an XML file or JSON feed).
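Because each query in the list above is just a URL with parameters, a request takes only a few lines in any language. The Python sketch below rests on several assumptions that are not from this article: the host api.geonames.org, the service names timezoneJSON and findNearbyPostalCodesJSON, the lat and lng parameters, and a username parameter that the service expects with each request. Check the GeoNames Web service documentation before relying on any of them.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumption: host name taken from present-day GeoNames usage, not from the article.
BASE_URL = "http://api.geonames.org"


def service_url(service: str, username: str = "demo", **params) -> str:
    """Build the URL for a GeoNames Web service call.

    `service` is the service name (for example "timezoneJSON") and
    `params` are its query parameters. The `username` parameter and its
    "demo" default are assumptions; substitute your own account name.
    """
    query = urlencode({**params, "username": username})
    return f"{BASE_URL}/{service}?{query}"


def fetch_json(url: str) -> dict:
    """Request the URL and decode the JSON body (needs network access)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    # Time zone and nearby postal codes for the centroid of Superior, Colorado.
    print(service_url("timezoneJSON", lat=39.9527634, lng=-105.1685977))
    print(service_url("findNearbyPostalCodesJSON", lat=39.9527634, lng=-105.1685977))
```

Keeping URL construction separate from the network call makes it easy to inspect or log the exact request before sending it.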

Real Web 3.0

Not only is GeoNames one of the most useful support sites for Web 2.0, it is also an anchor of what some are calling "Web 3.0": the Semantic Web. In an earlier installment of this column (see Resources), I discussed Linking Open Data (LOD), a practical initiative for enabling the Semantic Web. GeoNames is a key part of LOD, thanks to its support of RDF metadata for places (in fact, it supports a very detailed ontology that tries to establish clear context for everything). That installment also covered using HTTP 303 codes to give URIs to non-computable resources such as people and abstract qualities. GeoNames uses this approach to give places URIs suitable for the Semantic Web. Take the above-mentioned place URI for Superior, Colorado: http://www.geonames.org/5440838/superior.html. Replace "www" with "sws" and remove the file part, and you have the GeoNames Semantic Web URI http://sws.geonames.org/5440838/. If you access it with an RDF-aware agent, you get a 303 redirect, as in Listing 1:

Listing 1. Session demonstrating a Semantic Web request to GeoNames
    $ curl -I -H "Accept: application/rdf+xml" http://sws.geonames.org/5440838/
    HTTP/1.1 303 See Other
    Date: Mon, 07 Jul 2008 05:23:47 GMT
    Server: Apache/2.2.4 (Linux/SUSE)
    Location: http://sws.geonames.org/5440838/about.rdf
    Vary: Accept-Encoding
    Content-Type: text/html; charset=iso-8859-1

cURL is a command-line tool for making HTTP requests and can be configured for almost every variety of request. I signal a Semantic Web agent to the server with the header setting -H "Accept: application/rdf+xml", which means the user agent expects an RDF response. I also use the -I option to ask it to use an HTTP HEAD request, getting just the response headers. You can see the 303 response code, and a Semantic Web agent can follow the Location: http://sws.geonames.org/5440838/about.rdf response header to get more information about the place, as in Listing 2.

Listing 2. cURL request for RDF data about a GeoNames place
    $ curl -H "Accept: application/rdf+xml" http://sws.geonames.org/5440838/about.rdf
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <rdf:RDF
        xmlns="http://www.geonames.org/ontology#"
        xmlns:foaf="http://xmlns.com/foaf/0.1/"
        xmlns:owl="http://www.w3.org/2002/07/owl#"
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:wgs84_pos="http://www.w3.org/2003/01/geo/wgs84_pos#">
      <Feature rdf:about="http://sws.geonames.org/5440838/">
        <name>Superior</name>
        <featureClass rdf:resource="http://www.geonames.org/ontology#P"/>
        <featureCode rdf:resource="http://www.geonames.org/ontology#P.PPL"/>
        <inCountry rdf:resource="http://www.geonames.org/countries/#US"/>
        <population>11186</population>
        <wgs84_pos:lat>39.9527634</wgs84_pos:lat>
        <wgs84_pos:long>-105.1685977</wgs84_pos:long>
        <parentFeature rdf:resource="http://sws.geonames.org/5574999/"/>
        <nearbyFeatures rdf:resource="http://sws.geonames.org/5440838/nearby.rdf"/>
        <locationMap>http://www.geonames.org/5440838/superior.html</locationMap>
        <wikipediaArticle rdf:resource="http://en.wikipedia.org/wiki/Superior%2C_Colorado"/>
        <owl:sameAs rdf:resource="http://dbpedia.org/resource/Superior%2C_Colorado"/>
        <wikipediaArticle rdf:resource="http://de.wikipedia.org/wiki/Superior_%28Colorado%29"/>
        <wikipediaArticle rdf:resource="http://nl.wikipedia.org/wiki/Superior_%28Colorado%29"/>
        <wikipediaArticle rdf:resource="http://pt.wikipedia.org/wiki/Superior_%28Colorado%29"/>
        <wikipediaArticle rdf:resource="http://vo.wikipedia.org/wiki/Superior_%28Colorado%29"/>
      </Feature>
    </rdf:RDF>

Notice the absence of the -I option, so I get the response body. You can use the -i option if you want both headers and body. The resulting RDF document includes a lot of useful, general information about the resource, all provided according to the GeoNames ontology. In addition to the name property, which provides the prevalent, local name, you will sometimes also get alternateName, with other versions of the name, including versions in other languages. The latitude and longitude are given according to the W3C "Basic Geo (WGS84 lat/long) Vocabulary". The parent feature http://sws.geonames.org/5574999/ refers to Boulder County. The locationMap property leads back to the user-friendly map display I covered in the first section. The owl:sameAs property leads to an abstract resource on DBpedia, another major LOD site, which provides Semantic Web versions of Wikipedia articles. You can also see direct links to Wikipedia, including links in languages other than English.
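To show how a program might pull these properties out of the RDF, here is a minimal Python sketch using only the standard library's XML parser. It treats the document as plain XML rather than as an RDF graph, which is a shortcut that works for this serialization but is not a general RDF technique. The sample is an abridged copy of Listing 2, and the function name and dictionary keys are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared in the RDF document; the "gn" prefix is our own label.
NS = {
    "gn": "http://www.geonames.org/ontology#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "wgs84_pos": "http://www.w3.org/2003/01/geo/wgs84_pos#",
}
RDF_RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"

# Abridged from Listing 2 (XML declaration and several properties omitted).
SAMPLE_RDF = """
<rdf:RDF
    xmlns="http://www.geonames.org/ontology#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:wgs84_pos="http://www.w3.org/2003/01/geo/wgs84_pos#">
  <Feature rdf:about="http://sws.geonames.org/5440838/">
    <name>Superior</name>
    <population>11186</population>
    <wgs84_pos:lat>39.9527634</wgs84_pos:lat>
    <wgs84_pos:long>-105.1685977</wgs84_pos:long>
    <parentFeature rdf:resource="http://sws.geonames.org/5574999/"/>
    <owl:sameAs rdf:resource="http://dbpedia.org/resource/Superior%2C_Colorado"/>
  </Feature>
</rdf:RDF>
"""


def summarize_feature(rdf_xml: str) -> dict:
    """Extract a few properties of the first Feature element."""
    root = ET.fromstring(rdf_xml.strip())
    feature = root.find("gn:Feature", NS)
    return {
        "uri": feature.get("{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"),
        "name": feature.findtext("gn:name", namespaces=NS),
        "population": int(feature.findtext("gn:population", namespaces=NS)),
        "lat": float(feature.findtext("wgs84_pos:lat", namespaces=NS)),
        "long": float(feature.findtext("wgs84_pos:long", namespaces=NS)),
        "parent": feature.find("gn:parentFeature", NS).get(RDF_RESOURCE),
        "same_as": [e.get(RDF_RESOURCE) for e in feature.findall("owl:sameAs", NS)],
    }


if __name__ == "__main__":
    print(summarize_feature(SAMPLE_RDF))
```

For anything beyond simple extraction, a real RDF library is the safer choice, since RDF/XML allows the same graph to be serialized in several different shapes.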

Notice that if you omit the Accept header when accessing sws.geonames.org, or if you use the typical browser setting of Accept: */*, you don't get 303 redirects to RDF, but rather a 301 (permanent) redirect to the user map view. The idea is to avoid confusing clients that might not be aware of Semantic Web conventions.
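The same content negotiation can be sketched in Python with the standard library. The function names here are hypothetical, the URI rewrite follows the "www" to "sws" recipe described earlier, and the sketch assumes the server still behaves as it did in the sessions shown in this article, which may no longer be the case.

```python
import urllib.request
from urllib.parse import urlsplit


def semantic_web_uri(map_url: str) -> str:
    """Turn a feature page URL into its Semantic Web URI.

    Keeps only the numeric GeoNames ID from the path and moves to the
    sws host, dropping the file part.
    """
    geoname_id = urlsplit(map_url).path.strip("/").split("/")[0]
    return f"http://sws.geonames.org/{geoname_id}/"


def rdf_request(uri: str) -> urllib.request.Request:
    """Build a request that announces an RDF-aware agent via the Accept header."""
    return urllib.request.Request(uri, headers={"Accept": "application/rdf+xml"})


def fetch_rdf(uri: str) -> str:
    """Fetch the RDF description (needs network access).

    urlopen follows the 303 redirect automatically, so the body returned
    is that of the document named in the Location header.
    """
    with urllib.request.urlopen(rdf_request(uri), timeout=10) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    uri = semantic_web_uri("http://www.geonames.org/5440838/superior.html")
    print(uri)
    print(rdf_request(uri).get_header("Accept"))
```

Leaving out the Accept header in rdf_request would correspond to the plain browser case, where the server redirects to the map view instead.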


Wrap up

The uses for GeoNames are almost endless, thanks to the combination of rich geographical information on a richly available data platform. You don't have to worry about every detail of that platform, since there are also so many client libraries for GeoNames in a variety of languages, including Java, Python, Ruby, and Perl. In general there is a rapidly growing field of tools, libraries, plug-ins, and more, as you would expect from such a useful service. If you do take advantage of the great stuff at GeoNames, consider making a donation, or even volunteering as an ambassador. Recently I noticed they were missing an ambassador for Nigeria, and I volunteered.

Resources

Learn

Get products and technologies

Discuss
