
Language support in Apache through negotiation

David Seager, CICS/390 Development, IBM Hursley
David Seager has been playing with Linux and Web-based applications for over 2 years.

Summary:  If you've ever flicked through an Apache httpd.conf file, you might have noticed a few lines near the top reading AddLanguage de .de and AddLanguage fr .fr. In this article, David Seager explains what they are, what they do, and how you can use them.

Date:  01 Feb 2001
Level:  Introductory

Apache language capabilities

The latest versions of Apache (from version 1.2 onwards) support "content negotiation" in the HTTP specification -- where Web browsers can send a Web server information about the type of content their users would like to see in a Web page with the server responding appropriately. This information ranges from the format of an image the browser supports to which language the user would prefer to see. This is where the AddLanguage directives become relevant.

Apache can use this language preference to serve up different versions of an HTML page in whichever language the user desires, with different page elements (such as a graphic of a national flag), links, or whatever a Web designer fancies. Of course, it takes a bit of effort on the server side to ensure all this happens.

Generally, a Web page on the server will be a file with an .html extension. Visitors to a site specify the file and it is served to them. Nothing new about that. However, the AddLanguage directive specifies that a language-specific version of such a file might be found with the extra extension given. For example, AddLanguage fr .fr says that a French version of a file might exist with the .fr extension. Consequently, we might have an index.html.fr for French readers, an index.html.de for German readers, and so on. The keyword "fr" specifies French, "de" specifies German, and so on. See Resources for a link to the list of language keywords.
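As a rough illustration of this naming convention (not Apache's own code), the PHP sketch below maps a file's last extension to a language keyword. The function name language_for_file and the $add_language table are hypothetical stand-ins for the AddLanguage lines, and the sketch assumes the language extension comes last, as in the examples in this article.

```php
<?php
// Hypothetical table mirroring "AddLanguage fr .fr" style directives:
// file extension => language keyword.
$add_language = array(
    "fr" => "fr",
    "de" => "de",
    "en" => "en",
);

// Return the language keyword for a file name, or null if its last
// extension is not a known language extension.
function language_for_file($filename, $add_language)
{
    $ext = pathinfo($filename, PATHINFO_EXTENSION);
    return isset($add_language[$ext]) ? $add_language[$ext] : null;
}

echo language_for_file("index.html.fr", $add_language), "\n"; // prints "fr"
```

A plain index.html has no language extension, so the function returns null for it.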

Making this work is fairly straightforward. The AddLanguage lines in httpd.conf are usually already there. If not, the format is as follows: the first parameter specifies the language keyword (as sent by a Web browser), and the second specifies what extension Apache should be looking for on a file for that particular language. Therefore, we choose files in French to have the .fr extension by writing:

AddLanguage fr .fr

and similarly for German, English, Italian, and so on:

AddLanguage it .it
AddLanguage de .de
AddLanguage en .en
AddLanguage da .da
AddLanguage el .el

In the <Directory> directive for the directory in which we want this to work, we must add the option MultiViews explicitly. MultiViews tells Apache to go looking for those extra, language-specific files when a file is requested, and it must be specified by name (as it's not included in the catchall Options All). For my root directory, I would code the following in httpd.conf:

<Directory /home/httpd/html>
Options MultiViews
</Directory>

The final step is to create the actual language-specific files. In order to conduct a quick test, I have created a few obvious files and put them into my root directory.

-----test.html.fr------
French
-----------------------
-----test.html.en------
English
-----------------------
-----test.html.de------
German
-----------------------

Using the MultiViews option, Apache will only try to match based on language if the requested file doesn't exist; if there was a test.html file, then that would always be served no matter what language your visitors specified!
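The rule can be pictured with a small PHP sketch. This is a simplified model for illustration, not how mod_negotiation is implemented: choose_variant is a hypothetical function, and it assumes each file extension is identical to its language keyword.

```php
<?php
// Pick the file to serve for $requested, given the files present in a
// directory and the visitor's languages in descending order of preference.
// Returns null when nothing matches. Illustrative only.
function choose_variant($requested, $files, $preferred)
{
    // An exact match wins: no language matching takes place.
    if (in_array($requested, $files)) {
        return $requested;
    }
    // Otherwise try "<requested>.<language>" for each preference in turn.
    foreach ($preferred as $lang) {
        $candidate = $requested . "." . $lang;
        if (in_array($candidate, $files)) {
            return $candidate;
        }
    }
    return null;
}

$files = array("test.html.fr", "test.html.en", "test.html.de");
echo choose_variant("test.html", $files, array("fr", "de", "en")), "\n"; // prints "test.html.fr"
```

Adding a plain test.html to $files makes the function return it regardless of the preference list, which mirrors the behavior described above.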

On the client side, all that is needed is to specify the visitor's language preferences. In Netscape, this can be done with Edit--> Preferences, selecting the Navigator--> Languages tab, clicking the Add button and adding a language. Add "French [fr]", "English [en]," and "German [de]" and order them French, then German, and then English.

To test this, use the Web browser to ask for the document "test.html" in whatever directory you put it. As mine is in the root, I use http://myserver/test.html. Apache should then serve the French document. By swapping the order of the languages in Netscape, you should be able to get it to serve the German or English versions instead.

A visitor may express no preference, or may specify a language you haven't seen before or haven't prepared for. Although it's not mentioned in the Apache documentation, creating a file named test.html. (that is, with a trailing period but no language identifier) seems to cause Apache to serve that file when it can't match the language. (I tested this with Apache 1.3.9, so your mileage may vary.)

More correctly, the content negotiation module does the work (mod_negotiation -- which is often compiled by default), so if you're using Dynamic Shared Object support, you may also spot a couple of lines in the httpd.conf to load the following module:

LoadModule negotiation_module modules/mod_negotiation.so
AddModule mod_negotiation.c

(These lines are from the Linux version of Apache 1.3.)

So how is this actually working? When you specify a language preference in your Web browser, it includes the designated languages in an HTTP header it sends to the Web server with every request. This header is Accept-Language and will typically be the list of language keywords the user specifies (in descending order of preference), separated by commas, so our Web browser will be sending

Accept-Language: fr, de, en

When the MultiViews option is enabled and the requested file doesn't exist, Apache will look for those language-specific files and serve up the appropriate one.
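Browsers may also attach a quality value to each language in the header (for example "de;q=0.8"), with a missing value meaning 1.0. The following PHP sketch shows one way to turn such a header into an ordered list of language keywords. The function name parse_accept_language is hypothetical, and the parsing is deliberately lenient rather than a complete implementation of the HTTP grammar.

```php
<?php
// Parse an Accept-Language value into language tags ordered by q-value,
// highest first. Tags without an explicit q default to 1.0.
function parse_accept_language($header)
{
    $weights = array();
    foreach (explode(",", $header) as $position => $item) {
        $parts = explode(";", trim($item));
        $tag = strtolower(trim($parts[0]));
        if ($tag === "") {
            continue; // skip empty items, e.g. an empty header
        }
        $q = 1.0;
        for ($i = 1; $i < count($parts); $i++) {
            if (preg_match('/^\s*q\s*=\s*([0-9.]+)\s*$/i', $parts[$i], $m)) {
                $q = (float) $m[1];
            }
        }
        $weights[] = array($tag, $q, $position);
    }
    // Sort by q descending, then by original position so ties keep their order.
    usort($weights, function ($a, $b) {
        if ($a[1] == $b[1]) {
            return $a[2] - $b[2];
        }
        return ($a[1] < $b[1]) ? 1 : -1;
    });
    $tags = array();
    foreach ($weights as $w) {
        $tags[] = $w[0];
    }
    return $tags;
}

print_r(parse_accept_language("en;q=0.5, fr, de;q=0.8")); // order: fr, de, en
```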


Type maps

If you want two different language files with totally different names, again selected on the basis of a user's preference, then this is where the other feature of mod_negotiation comes in: "type maps." These are text files that can explicitly specify a choice of different document names for a particular file name.

The type map file has a specific extension that allows Apache to identify it as such. In httpd.conf, the line

AddHandler type-map var

specifies that files with the extension .var will be treated as type maps. Obviously you're free to choose whatever extension you fancy. This line also enables type-map processing on the server.

A type map file describes a number of files. It is broken into lines, each with a keyword name, a colon, some whitespace, and a value, with blank lines separating the details of each file. The keywords we'll be using are "URI" specifying the file, "Content-Type" specifying the MIME type of the file, and "Content-Language" specifying its language. For example, if we have two files, english_menu.html (in English) and french_menu.html (in French), in an example type-map file (which I'll call menu.var), I would code as follows:


------menu.var------
URI: english_menu.html
Content-Type: text/html
Content-Language: en

URI: french_menu.html
Content-Type: text/html
Content-Language: fr
--------------------

and just as an example:

---english_menu.html---
English menu
-----------------------
---french_menu.html----
French menu
-----------------------

Putting all these files into the root directory and asking for http://myserver/menu.var (for my server setup) will result in either a French or English menu, depending on which language was the first preference in the browser.
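To make the file format concrete, here is a minimal PHP sketch that reads type-map text into one array per file. It is an illustration only: parse_type_map is a hypothetical function, and it handles just the simple "Keyword: value" lines and blank-line separators described above, ignoring any other syntax a real type map may contain (such as quality values).

```php
<?php
// Parse type-map text into a list of variants, one associative array per
// blank-line-separated block.
function parse_type_map($text)
{
    $variants = array();
    $current = array();
    foreach (preg_split('/\r?\n/', $text) as $line) {
        if (trim($line) === "") {
            // A blank line closes the current variant, if any.
            if ($current) {
                $variants[] = $current;
                $current = array();
            }
            continue;
        }
        // Split "Keyword: value" at the first colon.
        $parts = explode(":", $line, 2);
        if (count($parts) === 2) {
            $current[trim($parts[0])] = trim($parts[1]);
        }
    }
    // Keep the last variant when the text does not end with a blank line.
    if ($current) {
        $variants[] = $current;
    }
    return $variants;
}

$map = "URI: english_menu.html\nContent-Type: text/html\nContent-Language: en\n\n"
     . "URI: french_menu.html\nContent-Type: text/html\nContent-Language: fr\n";
$variants = parse_type_map($map);
echo count($variants), " variants\n"; // prints "2 variants"
```

Without the blank line between the two entries, the second URI would overwrite the first and only one variant would be found, which is why the separator matters.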


Comparison

Using the MultiViews option makes Apache create a type map for files in a directory matching whatever was requested, using their MIME types and language extensions to determine Content-Type and Content-Language. It then performs matching as above. The advantage of type maps over MultiViews is that this extra processing to create a type map is unnecessary. The disadvantage is having to code a type map for every multi-language resource!
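To make the comparison concrete, here is a hedged sketch of the kind of implicit type map MultiViews can be thought of as building on each request. The name implicit_type_map is hypothetical, and two simplifications are assumed: the MIME type is fixed at text/html, and the trailing extension is taken to be the language keyword.

```php
<?php
// Build type-map style entries for every file named "<requested>.<ext>".
// Illustrative only; Apache derives the MIME type from the file's other
// extensions instead of assuming text/html.
function implicit_type_map($requested, $files)
{
    $entries = array();
    $prefix = $requested . ".";
    foreach ($files as $file) {
        if (strpos($file, $prefix) === 0) {
            $entries[] = array(
                "URI" => $file,
                "Content-Type" => "text/html",
                "Content-Language" => substr($file, strlen($prefix)),
            );
        }
    }
    return $entries;
}

$listing = array("test.html.fr", "test.html.en", "other.html.de");
foreach (implicit_type_map("test.html", $listing) as $entry) {
    echo $entry["URI"], " => ", $entry["Content-Language"], "\n";
}
```

This work is repeated for every request under MultiViews, whereas a hand-written .var file states the same information once.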


Server-side Includes (SSI)

So far we've only talked about offering whole Web pages in different languages. However, what if you only want part of a page to be language sensitive (for example, a national flag logo or welcome message in the visitor's language)? The great news is that content negotiation works in a range of Apache processing situations. One of them is server-parsed HTML files.

SSI are one of the great features of Apache. One bit of HTML can be included in many different pages (for example, a navigator, title banner, or page footer) and it only exists in a single file, making site maintenance a snap. These "HTML fragments" can also exist in a number of different languages and Apache will use content negotiation to include the correct one.

Server parsed HTML files can be enabled with the following lines in httpd.conf:

AddType text/html .shtml
AddHandler server-parsed .shtml

with SSI enabled using the +Includes option in your directory directive:

<Directory /home/httpd/html>
Options MultiViews +Includes
</Directory>

The includes module (mod_include) does all the server-side parsing. Look for

LoadModule includes_module modules/mod_include.so
AddModule mod_include.c

in your httpd.conf and read up more on SSI in the Apache manual for the mod_include module (see Resources).

As an example, I created a home page which includes a welcome message:

----home.shtml-----
<h2>My Home Page</h2>
<!--#include virtual="welcome" -->
-------------------

and two different versions of the message:

----welcome.en-----
Welcome!
-------------------
----welcome.fr-----
Bonjour!
-------------------

Now, loading http://myserver/home.shtml into a browser will produce a page with the "My Home Page" title for all languages and a customized welcome message.


Personal Home Page (PHP)

PHP is my personal favorite server-side scripting language, so it would be neat if it had support for this kind of language sensitivity (see Resources). Fortunately, it does: PHP can perform an Apache subrequest to do a virtual include, just as in SSI. The virtual() function makes Apache perform the equivalent of <!--#include virtual="" -->, so it also chooses documents based on user-specified language preferences.

A PHP version of the home page above is:

-----home.php3------
<h2>My Home Page</h2>
<?php
  virtual("welcome");
?>
--------------------

This goes for any other server-side scripting language which gets access to a virtual include function.


The manual method

If you want more control over what pages to serve, or simply fancy using different server-side code depending on your visitor's language, you can always roll your own script to mimic what mod_negotiation does.

Here is an example using PHP. The get_language function looks at the $HTTP_ACCEPT_LANGUAGE variable, which holds the contents of the Accept-Language header, and splits it into an array of preferred languages, with the highest preference first in the array.

The $language_pages array is associative, relating language keywords with some string to return to the caller. For each language keyword preference, the function checks $language_pages to find a key match. When a match is found, the value is returned.

If the visitor didn't specify a preference, then the function simply returns a default language and if it doesn't find their preferred language, it returns another default value (both of which happen to be "english" in this example).

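<?php
function get_language()
{
  // $HTTP_ACCEPT_LANGUAGE holds the Accept-Language header; read it from
  // $_SERVER, since current PHP no longer registers it as a global variable
  $HTTP_ACCEPT_LANGUAGE = isset($_SERVER["HTTP_ACCEPT_LANGUAGE"])
                            ? $_SERVER["HTTP_ACCEPT_LANGUAGE"] : "";
  // language keyword => string to return to the caller
  $language_pages = array(
                      "en"=>"english",
                      "fr"=>"french"
                    );
  $language_default = "english";
  $language_nofound = "english";
  // get preferred languages in the "Accept-Language" header
  if($HTTP_ACCEPT_LANGUAGE == "")
  {
    // no preference set
    return $language_default;
  }
  // form an array of preferred languages
  $accept_language = str_replace(" ", "", $HTTP_ACCEPT_LANGUAGE);
  $languages = explode(",", $accept_language);
  // check for a recognised language
  for($i = 0; $i < sizeof($languages); $i++)
  {
    // drop any ";q=..." quality value and normalise case
    $parts = explode(";", $languages[$i]);
    $keyword = strtolower($parts[0]);
    if(isset($language_pages[$keyword]))
    {
      // found a preferred language
      return $language_pages[$keyword];
    }
  }
  return $language_nofound;
}
echo get_language();
?>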

In this example, I've coded the associative array with only two languages -- English and French (en and fr), returning the strings english and french when the visitor's language preference is matched. I could imagine a more complex use of this function to determine an HTML tag for the name of an image, or perhaps the name of a PHP function to include based upon language. For this example, I printed out the name of the matched language.
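As one possible take on the image idea, the sketch below turns a language name into an HTML image tag. Everything in it is hypothetical: the flag_tag function, the flag_en.gif and flag_fr.gif file names, and the choice to fall back to the English flag for unknown languages.

```php
<?php
// Map a language name (such as those returned by a language-detection
// function) to an <img> tag. File names here are made up for illustration.
function flag_tag($language)
{
    $flags = array(
        "english" => "flag_en.gif",
        "french"  => "flag_fr.gif",
    );
    // Unknown languages fall back to the English flag.
    $src = isset($flags[$language]) ? $flags[$language] : "flag_en.gif";
    return '<img src="' . htmlspecialchars($src) . '" alt="' . htmlspecialchars($language) . '">';
}

echo flag_tag("french"), "\n"; // prints <img src="flag_fr.gif" alt="french">
```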


Conclusion

As you can see, Apache provides a powerful mechanism to customize Web pages to a visitor's own language, offering a more personalized and nationalized viewing experience. With a few httpd.conf changes, your Web site can start to offer some of the same!


Resources
