The most wonderful thing about the Web is how often you run into some resource, and a few weeks later wonder how you ever managed without it. The most wonderful thing about the open data aspect of Web 2.0, the focus of this column, is that sometimes such resources include all the data you need to create your own little magic corners of the Web. GeoNames is one of those sites and services that is not just indispensable in its own right, but is also an important ingredient in other indispensable services. It's a Web site built around a well-designed, freely accessible database of geographical information. The home page describes it best:
The GeoNames geographical database is available for download free of charge under a creative commons attribution license. It contains over eight million geographical names and consists of 6.5 million unique features whereof 2.2 million populated places and 1.8 million alternate names. All features are categorized into one out of nine feature classes and further subcategorized into one out of 645 feature codes.[...] The data is accessible free of charge through a number of webservices and a daily database export. GeoNames is already serving up to over 11 million web service requests per day.[...] GeoNames is integrating geographical data such as names of places in various languages, elevation, population and others from various sources.
In this article I'll show you the main features of GeoNames for users as well as developers.
The GeoNames facade
In this column I refer to the main user page of a site as its façade, and the interface for open data as the foundation. The GeoNames façade is a useful tool for pinpointing a location. Suppose you're reading a novel and read of a small town in Colorado named "Superior". You can just go to the site and enter "Superior, Colorado" into the main search box. You should end up with the display in Figure 1.
Figure 1. GeoNames search results for "Superior, Colorado"
Looking at the top result, which, as with search engines, is often a very good guess of what you're looking for, you might click on the local name "Superior", or you might click on the map push-pin icon to the left. You get different results for each one. The local name goes to a Google-maps-powered information page for the centroid latitude and longitude of that place, in this case http://www.geonames.org/maps/google_39.953_-105.169.html (a nice, clean, hackable URL, which is one of the strengths of GeoNames). It's pretty much the southeastern corner of Boulder County, and the map includes multiple features: a settlement (the town of Superior itself), geographical features such as hills and lakes, some notable buildings, and other landmarks (even the local post office). Figure 2 is from a snapshot of the page.
Figure 2. Latitude/Longitude area page in GeoNames
Notice that the settlement icon is the same "P" push-pin (teardrop shape) that was associated with the top search result (Figure 1). If you go back to that search results page and click that icon, you get a more specific map zoomed into that particular feature, indicated by the different URL form, http://www.geonames.org/5440838/superior.html. Figure 3 is from a snapshot of this page.
Figure 3. Specific feature page in GeoNames
You can see how useful GeoNames is for quickly getting a sense of geographical context from any starting point, but what makes the service special is its open data foundation. You can download the full database dump text for the site, and you can use it quite freely thanks to its Creative Commons attribution license, which requires you merely to acknowledge and link back to GeoNames if you use the service. You are free to use the service in commercial applications on the Web or elsewhere.
You can also use GeoNames through its comprehensive Web services. It is very RESTfully designed, so you can use any code that can request data from a URL. There are numerous types of queries you can perform on GeoNames. The following is just a sample:
- Find places near a postal code, by country (returning an XML file or JSON feed).
- Find the postal codes near a given latitude/longitude (returning an XML file).
- Find the "children" of a given geographical feature (for example, the provinces within a country, or the settlements within a province, returning an XML file or JSON feed).
- Find geocoded Wikipedia articles near a given latitude/longitude, postal code, or place name (returning an XML file or JSON feed).
- Find all neighbors of a country (returning an XML file or JSON feed).
- Find the weather stations and their most recent weather observations within a bounding box of four latitude/longitude pairs (returning an XML file).
- Get the time zone at a given latitude/longitude (returning an XML file or JSON feed).
- Get the elevation in meters for a latitude/longitude representing a land area (returning an XML file or JSON feed).
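Because each of these queries is just a URL with parameters, a few lines of code are enough to compose one. The following Python is a minimal sketch rather than an official client: the host api.geonames.org, the service name timezoneJSON, and the username parameter are assumptions that do not come from this article, so check the GeoNames Web service documentation before relying on them. The helper geonames_url is a hypothetical name, and it only builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Assumed base URL (not from the article); verify against the GeoNames docs.
BASE_URL = "http://api.geonames.org"


def geonames_url(service, username, **params):
    """Build a GeoNames web service URL without sending any request."""
    query = dict(params)
    # Assumed: the service expects an account name in a "username" parameter.
    query["username"] = username
    return f"{BASE_URL}/{service}?{urlencode(query)}"


# Time zone query for the latitude/longitude of Superior, Colorado.
# "timezoneJSON" and the "demo" account are illustrative assumptions.
print(geonames_url("timezoneJSON", "demo", lat=39.953, lng=-105.169))
```

You can paste the resulting URL into a browser or hand it to cURL, which keeps the query construction separate from whatever HTTP tool you prefer.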
Real Web 3.0
Not only is GeoNames one of the most useful support sites for Web 2.0, but it's also an anchor of what some are calling "Web 3.0": the Semantic Web. In an earlier installment of this column (see Resources), I discussed Linking Open Data (LOD), a practical initiative for enabling the Semantic Web. GeoNames is a key part of LOD, thanks to its support of RDF metadata for places (in fact, it supports a very detailed ontology to establish clear context for everything). In that installment I covered the use of HTTP 303 codes to give URIs to non-computable resources such as people and abstract qualities. GeoNames uses this approach to give places URIs suitable for the Semantic Web. Take the above-mentioned place URI for Superior, Colorado, http://www.geonames.org/5440838/superior.html. Replace "www" with "sws" and remove the file part and you have the GeoNames Semantic Web URI http://sws.geonames.org/5440838/. If you access it with an RDF-aware agent, you get a 303 redirect as in Listing 1:
Listing 1. Session demonstrating a Semantic Web request to GeoNames
$ curl -I -H "Accept: application/rdf+xml" http://sws.geonames.org/5440838/
HTTP/1.1 303 See Other
Date: Mon, 07 Jul 2008 05:23:47 GMT
Server: Apache/2.2.4 (Linux/SUSE)
Location: http://sws.geonames.org/5440838/about.rdf
Vary: Accept-Encoding
Content-Type: text/html; charset=iso-8859-1
cURL is a command-line tool for making HTTP requests and can be configured for almost every variety of request. I signal a Semantic Web agent to the server with the header setting "Accept: application/rdf+xml", which means the user agent expects an RDF/XML response. I also use the -I option to ask it to use an HTTP HEAD request, getting just the response headers. You can see the 303 response code, and a Semantic Web agent can follow the Location: http://sws.geonames.org/5440838/about.rdf response header to get more information about the place, as in Listing 2.
Listing 2. cURL request for RDF data about a GeoNames place
$ curl -H "Accept: application/rdf+xml" http://sws.geonames.org/5440838/about.rdf
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<rdf:RDF xmlns="http://www.geonames.org/ontology#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:wgs84_pos="http://www.w3.org/2003/01/geo/wgs84_pos#">
  <Feature rdf:about="http://sws.geonames.org/5440838/">
    <name>Superior</name>
    <featureClass rdf:resource="http://www.geonames.org/ontology#P"/>
    <featureCode rdf:resource="http://www.geonames.org/ontology#P.PPL"/>
    <inCountry rdf:resource="http://www.geonames.org/countries/#US"/>
    <population>11186</population>
    <wgs84_pos:lat>39.9527634</wgs84_pos:lat>
    <wgs84_pos:long>-105.1685977</wgs84_pos:long>
    <parentFeature rdf:resource="http://sws.geonames.org/5574999/"/>
    <nearbyFeatures rdf:resource="http://sws.geonames.org/5440838/nearby.rdf"/>
    <locationMap>http://www.geonames.org/5440838/superior.html</locationMap>
    <wikipediaArticle rdf:resource="http://en.wikipedia.org/wiki/Superior%2C_Colorado"/>
    <owl:sameAs rdf:resource="http://dbpedia.org/resource/Superior%2C_Colorado"/>
    <wikipediaArticle rdf:resource="http://de.wikipedia.org/wiki/Superior_%28Colorado%29"/>
    <wikipediaArticle rdf:resource="http://nl.wikipedia.org/wiki/Superior_%28Colorado%29"/>
    <wikipediaArticle rdf:resource="http://pt.wikipedia.org/wiki/Superior_%28Colorado%29"/>
    <wikipediaArticle rdf:resource="http://vo.wikipedia.org/wiki/Superior_%28Colorado%29"/>
  </Feature>
</rdf:RDF>
Notice the absence of the -I option, so I get the response body. You can use the -i option if you want both headers and body. The resulting RDF document includes a lot of useful, general information about the resource, all provided according to the GeoNames ontology. In addition to the name property, which provides the prevalent, local name, you will sometimes also get alternateName, with other versions of the name, including versions in other languages. The latitude and longitude are given according to the W3C "Basic Geo (WGS84 lat/long) Vocabulary". The parent feature http://sws.geonames.org/5574999/ refers to Boulder County. The locationMap property leads back to the user-friendly map display I covered in the first section. owl:sameAs leads back to an abstract resource on DBpedia, another major LOD site which provides Semantic Web versions of Wikipedia articles. You can also see direct links to Wikipedia, including links in languages other than English.
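To show how a program might pick these properties out of the RDF/XML, here is a minimal Python sketch that uses only the standard library. It treats the document as plain namespaced XML rather than as RDF triples, which is a simplification of my own and not how a full Semantic Web toolkit would work. The function name describe_feature and the abridged SAMPLE string (cut down from Listing 2 so the sketch runs offline) are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from Listing 2; the prefixes here are local choices.
NS = {
    "gn": "http://www.geonames.org/ontology#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
}
RDF_RESOURCE = "{" + NS["rdf"] + "}resource"

# Abridged copy of the Listing 2 response, embedded so no network is needed.
SAMPLE = """<rdf:RDF xmlns="http://www.geonames.org/ontology#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:wgs84_pos="http://www.w3.org/2003/01/geo/wgs84_pos#">
  <Feature rdf:about="http://sws.geonames.org/5440838/">
    <name>Superior</name>
    <population>11186</population>
    <wgs84_pos:lat>39.9527634</wgs84_pos:lat>
    <wgs84_pos:long>-105.1685977</wgs84_pos:long>
    <parentFeature rdf:resource="http://sws.geonames.org/5574999/"/>
    <wikipediaArticle rdf:resource="http://en.wikipedia.org/wiki/Superior%2C_Colorado"/>
    <wikipediaArticle rdf:resource="http://de.wikipedia.org/wiki/Superior_%28Colorado%29"/>
  </Feature>
</rdf:RDF>"""


def describe_feature(rdf_xml):
    """Extract a few GeoNames properties from an about.rdf document."""
    root = ET.fromstring(rdf_xml)
    feature = root.find("gn:Feature", NS)
    return {
        "name": feature.findtext("gn:name", namespaces=NS),
        "population": int(feature.findtext("gn:population", namespaces=NS)),
        "lat": float(feature.findtext("geo:lat", namespaces=NS)),
        "long": float(feature.findtext("geo:long", namespaces=NS)),
        # The parent feature is given as a link, not as element text.
        "parent": feature.find("gn:parentFeature", NS).get(RDF_RESOURCE),
        "wikipedia": [
            article.get(RDF_RESOURCE)
            for article in feature.findall("gn:wikipediaArticle", NS)
        ],
    }


info = describe_feature(SAMPLE)
print(info["name"], info["population"], info["parent"])
```

For anything beyond a quick look, a real RDF library is the safer choice, since the same triples can be serialized in ways this element-based lookup would miss.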
Notice that if you omit the Accept header when accessing sws.geonames.org, or if you use the typical browser Accept: */*, you don't get 303 redirects to RDF, but rather a 301 (permanent) redirect to the user map view. The idea is to avoid confusing clients that might not be aware of Semantic Web conventions.
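The URI rewriting and the Accept header from this section can be sketched in Python as well. This is only an illustration under stated assumptions: semantic_web_uri and rdf_head_request are hypothetical helper names, the pattern handles only the http://www.geonames.org/<id>/<name>.html URL form shown earlier, and the request is prepared but deliberately not sent so that the sketch runs without network access.

```python
import re
import urllib.request


def semantic_web_uri(page_url):
    """Turn a GeoNames feature page URL into its sws.geonames.org URI."""
    # Keep the numeric feature ID, drop the trailing "name.html" file part.
    match = re.match(r"^http://www\.geonames\.org/(\d+)/", page_url)
    if match is None:
        raise ValueError(f"not a GeoNames feature page URL: {page_url}")
    return f"http://sws.geonames.org/{match.group(1)}/"


def rdf_head_request(uri):
    """Prepare (but do not send) a HEAD request that asks for RDF/XML."""
    return urllib.request.Request(
        uri,
        headers={"Accept": "application/rdf+xml"},
        method="HEAD",
    )


uri = semantic_web_uri("http://www.geonames.org/5440838/superior.html")
request = rdf_head_request(uri)
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Passing the prepared request to urllib.request.urlopen would perform the equivalent of the cURL call in Listing 1; leaving the Accept header out of the dictionary would reproduce the browser-style request described above.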
The uses for GeoNames are almost endless, thanks to the combination of rich geographical information on a richly available data platform. You don't have to worry about every detail of that platform, since there are also so many client libraries for GeoNames in a variety of languages, including Java, Python, Ruby, and Perl. In general there is a rapidly growing field of tools, libraries, plug-ins, and more, as you would expect from such a useful service. If you do take advantage of the great stuff at GeoNames, consider making a donation, or even volunteering as an ambassador. Recently I noticed they were missing an ambassador for Nigeria, and I volunteered.
Resources
- Try out GeoNames through the main search page, and then learn more about the service, including opportunities to help. Check out the overall statistics maintained through the project.
- Learn more about the RESTful Web API and the Semantic Web facilities.
- The W3C's Basic Geo (WGS84 lat/long) Vocabulary is a useful convention for representing geographical information.
- Check out my earlier article on Linking Open Data (LOD) in which I introduced the techniques used by GeoNames for Semantic Web support.
- "Create and edit Web resources with the Atom Publishing Protocol", by James Snell, shows cURL examples, and serves as a brief guide to the tool for querying RESTful services.
- The article "Turn SQL into XML with PHP", by Vikram Vaswani, includes a query of GeoNames from PHP.
- Expand your site development skills with articles and tutorials that specialize in Web technologies in the developerWorks Web development zone, including previous installments of this column.
- Stay current with developerWorks technical events and webcasts.
Get products and technologies
- Get one of the many client libraries and plug-ins available for GeoNames.
- Use cURL to explore and test RESTful Web services.
- Participate in developerWorks blogs and get involved in the developerWorks community.