MIME standard header fields

Check this quick reference to the common MIME headers.

This information does not provide a definitive specification of MIME. In some cases, the MIME parser allows documents that are not strictly valid according to the standard. For example, it does not insist on the presence of a MIME-Version header. All the standard MIME header fields are simply written to the logical tree as they appear in the MIME document. The MIME parser takes special note only of the Content-Type header field.

All MIME headers can include comments enclosed by parentheses, as shown in the example for the MIME-Version header.

MIME header fields

MIME-Version

Example:

MIME-version: 1.0 (generated by my-application 1.2)

For a MIME document to conform with RFC 2045, this field is required in the top-level header with a value of 1.0. MIME-Version should not be specified on individual parts.
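As an illustration only, the following Python sketch reads the example header with the standard library email package, which is not the MIME parser described in this topic. It shows that the comment in parentheses is not part of the version value. The variable names are illustrative.

```python
from email import message_from_string
from email.policy import default

# The example header from above, followed by an empty line and a body.
raw = "MIME-version: 1.0 (generated by my-application 1.2)\n\nbody\n"

msg = message_from_string(raw, policy=default)
header = msg["MIME-Version"]   # header names are matched case-insensitively

version = header.version       # version string with the comment removed
print(version, header.major, header.minor)
```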

Content-Type

Content-Type is not required for a document to conform with RFC 2045, but a top-level Content-Type is required by the MIME parser. Content-Type defaults to text/plain. Content-Type defines the type of data in each part as a type/subtype. The MIME parser accepts most values for Content-Type and stores them in the logical tree. The only exceptions are:

  • The MIME parser rejects any Content-Type value with type = message.
  • The MIME parser assumes that a Content-Type value with type = multipart introduces a multipart MIME document, and rejects such a value if it does not contain a valid boundary parameter. The value of the boundary parameter defines the separator between message parts in a multipart message. In a nested multipart message, a unique boundary value is required for each nesting level.

Syntax:

Content-Type: type/subtype;parameter

where type and subtype define the Content-Type, and all optional parameters are delimited by semicolons.

Example 1:

Content-Type: multipart/related;type=text/xml

In example 1, the Content-Type is defined as multipart/related, and also has an optional parameter definition (type=text/xml). This structure is syntactically correct, but the message is rejected because it does not contain a valid boundary parameter.

Example 2:

Content-Type: multipart/related;boundary=Boundary;type=text/xml 

Example 2 shows a valid Content-Type definition, both in terms of syntax and semantics. The boundary value can optionally be enclosed in quotation marks. When it appears in the MIME body, the value is preceded by the sequence '--', and you must ensure that the resulting value (in this example, --Boundary) cannot appear in the message body. If the message data is encoded as quoted-printable, you must use a boundary that includes a sequence such as =_, which cannot appear in a quoted-printable body.
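The two rejection rules for Content-Type can be sketched in Python as follows. This is a minimal illustration that uses the standard library email package, not the MIME parser itself. The function name check_content_type and its return shape are assumptions made for this example.

```python
from email.message import Message

def check_content_type(header_value):
    """Hypothetical check that mirrors the two rejection rules above."""
    msg = Message()
    msg["Content-Type"] = header_value
    maintype = msg.get_content_maintype()
    if maintype == "message":
        return False, "type = message is rejected"
    if maintype == "multipart" and not msg.get_boundary():
        return False, "multipart requires a boundary parameter"
    return True, msg.get_content_type()

# Example 1 has no boundary parameter; example 2 has one.
print(check_content_type("multipart/related;type=text/xml"))
print(check_content_type("multipart/related;boundary=Boundary;type=text/xml"))
```

The sketch checks only that a boundary parameter is present. It does not check that the boundary is absent from the message body.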

The following table shows some common Content-Type values. Other values are allowed, and stored in the logical tree.

Content-Type              Description
text/plain                Typically used for a mail or news message. text/richtext is also common.
text/xml                  Typically used with SwA (SOAP with Attachments).
application/octet-stream  Used where the message is of an unknown type and contains any kind of data as bytes.
application/xml           Used for application-specific XML data.
x-type                    Used for a non-standard content type. The type must start with the characters x-.
image/jpeg                Used for images. image/jpeg and image/gif are common image formats.
multipart/related         Used for multiple related parts in a message. Specifically used with SwA (SOAP with Attachments).
multipart/signed          Used for multiple related parts in a message, including a signature. Specifically used with S/MIME.
multipart/mixed           Used for multiple independent parts in a message.

Content-Transfer-Encoding

Optional. Many Content-Types are represented as 8-bit character or binary data, and can include XML, which typically uses UTF-8 or UTF-16 encoding. This type of data cannot be transmitted over some transport protocols, and might be encoded to 7-bit.

The Content-Transfer-Encoding header field is used to indicate the type of transformation that has been used for encoding this type of data into a 7-bit format.

Only the following values are allowed by the WS-I Basic Profile:

  • 7bit - the default
  • 8bit
  • binary
  • base64
  • quoted-printable

The values 7bit, 8bit, and binary all effectively mean that no encoding took place. A MIME-conformant mail gateway might use this value to control how it handles the message, for example by encoding it as 7bit before routing it over SMTP.

The values base64 and quoted-printable mean that the content has been encoded. The value quoted-printable means that only non-7-bit characters in the original are encoded, and is intended to yield a document which is still human-readable. This setting is most likely to be used in conjunction with a Content-Type of text/plain.
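A receiver can use the Content-Transfer-Encoding value to decide how to recover the original bytes. The following Python sketch shows one way to do this with the standard library base64 and quopri modules. The helper name decode_body is hypothetical, and the sketch accepts only the five values listed above.

```python
import base64
import quopri

def decode_body(encoding: str, body: bytes) -> bytes:
    """Hypothetical helper: undo the transformation named by the header value."""
    encoding = encoding.strip().lower() or "7bit"   # 7bit is the default
    if encoding in ("7bit", "8bit", "binary"):
        return body                                 # no encoding took place
    if encoding == "base64":
        return base64.b64decode(body)
    if encoding == "quoted-printable":
        return quopri.decodestring(body)
    raise ValueError(f"unsupported Content-Transfer-Encoding: {encoding!r}")

print(decode_body("base64", b"d2hhdA=="))
print(decode_body("quoted-printable", b"caf=C3=A9"))
```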

Content-ID

Optional. This field enables parts to be labeled, and referenced from other parts of the message. These parts are typically referenced from part 0 (the first) of the message.
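As a rough illustration, the following Python sketch parses a small two-part message with the standard library email package and indexes the parts by their Content-ID values. The message text, the Content-ID values, and the variable names are made up for this example.

```python
from email import message_from_string

# A made-up multipart/related message with two labeled parts.
RAW = (
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/related; boundary=Boundary\n"
    "\n"
    "--Boundary\n"
    "Content-Type: text/xml\n"
    "Content-ID: <part0@example.invalid>\n"
    "\n"
    "<doc/>\n"
    "--Boundary\n"
    "Content-Type: application/octet-stream\n"
    "Content-ID: <part1@example.invalid>\n"
    "\n"
    "payload\n"
    "--Boundary--\n"
)

msg = message_from_string(RAW)

# Map each Content-ID label to its part; the top-level header has no label.
parts_by_id = {
    part["Content-ID"]: part for part in msg.walk() if part["Content-ID"]
}
attachment = parts_by_id["<part1@example.invalid>"]
print(attachment.get_content_type())
```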

Content-Description

Optional. This field enables parts to be described.

MIME encodings

This section provides a basic guide to the base64 and quoted-printable encodings; refer to RFC 1521 (linked at the end of this topic) for a definitive specification of MIME encodings.

base64

The original data is broken into groups of 3 octets. Each group is then treated as 4 concatenated 6-bit groups, each of which is translated into a single digit in the base64 alphabet. The base64 alphabet is A-Z, a-z, 0-9, +, and / (with A=0 and /=63).

[Diagram: how 8-bit data is broken down into 6-bit encoded data.]

If fewer than 24 bits are available at the end of the data, the encoded data is padded using the "=" character. The maximum line length in the encoded data is 76 characters and line breaks (and all other characters not in the alphabet above) are ignored when decoding.
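The grouping and padding steps can be sketched in Python as follows. The function name b64encode_sketch is an assumption made for this example. The sketch does not insert line breaks every 76 characters, and in practice the standard library base64 module would be used instead.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def b64encode_sketch(data: bytes) -> str:
    """Illustrative encoder: 3 octets in, 4 base64 digits out."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        pad = 3 - len(chunk)                      # 0, 1, or 2 missing octets
        n = int.from_bytes(chunk + b"\x00" * pad, "big")   # one 24-bit group
        # Split the 24 bits into four 6-bit values and look each one up.
        digits = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        if pad:
            digits[-pad:] = "=" * pad             # pad the final group with "="
        out.append("".join(digits))
    return "".join(out)

print(b64encode_sketch(b"what"))
print(base64.b64encode(b"what").decode("ascii"))   # library result, for comparison
```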

Examples:

Input                         Output
Some data encoded in base64.  U29tZSBkYXRhIGVuY29kZWQgaW4gYmFzZTY0Lg==
life of brian                 bGlmZSBvZiBicmlhbg==
what                          d2hhdA==

quoted-printable

This encoding is appropriate only if most of the data comprises printable characters. Specifically, characters in the ranges 33-60 and 62-126 are typically represented by the corresponding ASCII characters. Control characters and 8-bit data must be represented by the sequence = followed by a pair of hexadecimal digits.

The standard ASCII space <SP> and horizontal tab <HT> represent themselves, unless they appear at the end of an encoded line (without a soft line break), in which case the equivalent hexadecimal format must be used (=20 and =09 respectively).

Line breaks in the data are represented by the RFC 822 line break sequence <CR><LF> and should be encoded as "=0D=0A" if binary data is being encoded.

As with base64, the maximum line length in the encoded data is 76 characters. An '=' sign at the end of an encoded line (a 'soft' line break) tells the decoder that the line is to be continued.
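These rules can be observed with the Python standard library quopri module, as in the following sketch. The sample strings are made up for this example. The exact position of the soft line breaks depends on the encoder, so the sketch prints the results rather than assuming a particular layout.

```python
import quopri

# Sample text with 8-bit data, an '=' sign, and a space at the end of the line.
text = "caf\u00e9 = coffee \n".encode("utf-8")
encoded = quopri.encodestring(text)
print(encoded)      # 8-bit octets, '=', and the trailing space appear as =XX

# A line longer than 76 characters is split with soft line breaks.
long_line = b"a" * 100
wrapped = quopri.encodestring(long_line)
print(wrapped)

# Decoding restores the original bytes in both cases.
print(quopri.decodestring(encoded) == text)
print(quopri.decodestring(wrapped) == long_line)
```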