Reserved and excluded characters

To assist with the correct transmission and interpretation of an HTTP request, the use of certain characters in a URL is restricted. These characters must be converted to a safe format when the request is transmitted.

This topic summarizes reserved and excluded characters. For more information, see the Internet Society and IETF (Internet Engineering Task Force) Request for Comments document RFC 2396, Uniform Resource Identifiers (URI): Generic Syntax, which lists the characters that are reserved or excluded in URIs and URLs. RFC 2396 is available from http://www.ietf.org/rfc/rfc2396.txt.

In a URI or a URL, characters that have a special purpose in the context of one or more URI or URL components are known as reserved characters. For example, the characters /, ?, &, and : are used as delimiters for various components. Machine interpreters might misinterpret the URI or URL if the reserved characters are used for any purpose other than their special one.

Also, certain characters are disallowed, or excluded, from use anywhere in a URI or URL, either because they are a potential cause of confusion for machine or human users, or because they are known to cause problems for some machine interpreters. For example, the space character is not permitted in a URL.

If reserved characters are needed in a URL for any reason other than their special purpose, or if excluded characters are needed in a URL, they must be escaped when a request containing components of the URL is sent to a server. Such characters in data that is sent in a query string must also be escaped.

Characters are escaped by being replaced with a 3-character string of the form %xx, where xx is the hexadecimal representation of the ASCII code of the reserved or excluded character. For example, the space character (ASCII code 32, hexadecimal 20) is escaped as %20. Because of this format, escaping is also known as percent-encoding.
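The %xx replacement can be illustrated with a short Python sketch. The function name escape_ascii and the sample string are hypothetical and are not part of this documentation. The sketch assumes ASCII input and escapes every character other than letters, digits, and the marks - _ . ~, which is stricter than RFC 2396 requires. The standard library function urllib.parse.quote is shown alongside for comparison.

```python
from urllib.parse import quote

# Characters left unescaped in this sketch (an assumption chosen to match
# the defaults of urllib.parse.quote, not a list taken from this topic).
UNESCAPED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_.~"
)

def escape_ascii(text):
    """Replace each ASCII character outside UNESCAPED with %xx."""
    return "".join(
        ch if ch in UNESCAPED else "%{:02X}".format(ord(ch))
        for ch in text
    )

sample = "a b&c/d?e:f"
escaped = escape_ascii(sample)
print(escaped)                  # a%20b%26c%2Fd%3Fe%3Af
print(quote(sample, safe=""))   # same result from the standard library
```

Passing safe="" to quote matters here, because by default quote leaves the / character unescaped.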

When the request reaches the server, the server can unescape the escaped characters. Unescaping takes place only after the information in the URL and query string has been parsed, to avoid the risk of the parsing application misinterpreting the reserved or excluded characters.
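The following Python sketch shows why the order matters. The helper parse_query and the sample query string are hypothetical, and the code is only an illustration of the principle, not a description of how any particular server is implemented. The query string is first split on its delimiters, and only then is each name and value unescaped.

```python
from urllib.parse import unquote

def parse_query(raw_query):
    """Split on the delimiters first, then unescape each piece."""
    pairs = []
    for field in raw_query.split("&"):
        name, _, value = field.partition("=")
        pairs.append((unquote(name), unquote(value)))
    return pairs

raw = "name=Tom%26Jerry&city=New%20York"

# Correct order: parse, then unescape.
parsed = parse_query(raw)
print(parsed)   # [('name', 'Tom&Jerry'), ('city', 'New York')]

# Wrong order: unescaping first turns the escaped & into a delimiter,
# so the value "Tom&Jerry" is split in two.
wrong = unquote(raw).split("&")
print(wrong)    # ['name=Tom', 'Jerry', 'city=New York']
```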

Form data in a request is normally sent with special characters escaped, because the default encoding for forms (application/x-www-form-urlencoded) escapes reserved or excluded characters. Refer to HTML forms.
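As a rough illustration of this encoding, the Python standard library function urllib.parse.urlencode produces a body in application/x-www-form-urlencoded format. The field names and values below are made up for the example. Note that in this encoding a space is written as + rather than %20, while other special characters use the %xx form.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields containing a reserved character (&, /)
# and an excluded character (space).
form_fields = {"title": "Q&A session", "path": "/docs/a b"}

body = urlencode(form_fields)
print(body)     # title=Q%26A+session&path=%2Fdocs%2Fa+b

# A receiver can parse and unescape the body in one step.
decoded = parse_qs(body)
print(decoded)  # {'title': ['Q&A session'], 'path': ['/docs/a b']}
```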



dfhtl20.html | Last updated: Thursday, 27 June 2019