UTF-8 support for uniform resource locators

Several encoding methods exist for transmitting characters outside the printable ASCII range. WebSEAL, acting as a web proxy, must be able to handle all of them. The UTF-8 locale support addresses this need.

Browsers are limited to a defined set of characters that can legally be used in a uniform resource locator (URL). This set consists of the printable characters in the ASCII character set (hex codes 0x20 through 0x7E). URLs often need characters outside this range, for example to represent languages other than English. Such characters are encoded as sequences of printable characters for transmission, and the receiver must then interpret those sequences correctly.
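As a brief illustration of why more than one encoding can reach a proxy, the following Python sketch percent-encodes the same character in two ways. The sample character and the choice of ISO-8859-1 as a stand-in for a local code page are assumptions made for this example. They are not taken from the WebSEAL documentation.

```python
from urllib.parse import quote

# Sample character outside the printable ASCII range (illustrative choice).
ch = "é"

# UTF-8 represents this character as two bytes, so two escapes are produced.
utf8_form = quote(ch, encoding="utf-8")

# ISO-8859-1 (an assumed example of a local code page) uses a single byte.
latin1_form = quote(ch, encoding="iso-8859-1")

print(utf8_form)    # %C3%A9
print(latin1_form)  # %E9
```

Both results contain only printable ASCII, but they represent different byte sequences, so the receiver needs to know which encoding the sender used.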

You specify how WebSEAL processes URLs from browsers with the utf8-url-support-enabled stanza entry in the [server] stanza of the WebSEAL configuration file:

[server]
utf8-url-support-enabled = {yes|no|auto}

The entry accepts one of three values: yes, no, or auto.

The following list is a sample deployment strategy.

  1. Unless unauthenticated access is required for content purposes, immediately check the default-webseal ACL on existing production deployments and set it so that it does NOT allow unauthenticated r (read) access. This setting limits security exposure to users who have a valid account in the Security Access Manager domain.
  2. Ensure that the utf8-url-support-enabled stanza entry is set to the default value of yes.
  3. Test your applications. If they function correctly, use this setting.
  4. If any applications fail with Bad Request errors, try the application with the utf8-url-support-enabled stanza entry set to no. If this step works, you can deploy with this setting. Ensure, however, that no junctioned web server is configured to accept UTF-8 encoded URLs.
  5. If the application continues to have problems, try setting utf8-url-support-enabled to auto.
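
When an application fails in step 4, it can help to know which encoding its URLs actually use. The following Python sketch is a hypothetical diagnostic aid, not a description of how WebSEAL itself evaluates URLs. The function name is_utf8_encoded and the sample paths are assumptions made for this example. The function percent-decodes a URL path and reports whether the resulting bytes form valid UTF-8.

```python
from urllib.parse import unquote_to_bytes


def is_utf8_encoded(url_path: str) -> bool:
    """Return True if the percent-decoded bytes of url_path are valid UTF-8."""
    raw = unquote_to_bytes(url_path)
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical sample paths: the same word encoded as UTF-8 and as ISO-8859-1.
print(is_utf8_encoded("/docs/caf%C3%A9"))  # True
print(is_utf8_encoded("/docs/caf%E9"))     # False
```

A path that contains only ASCII characters also returns True, because ASCII is a subset of UTF-8. A False result suggests that the application emits URLs in some other encoding.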