Additional Header Fields
Two additional header fields can be used to further describe the data in a message body: the Content-ID and Content-Description header fields.
How MIME Data Is Encoded
MIME data can be encoded in two different ways. Each encoding method has its advantages and disadvantages, which are described later. The first method, Q encoding, is recommended when most of the characters to be encoded are in the ASCII character set; otherwise, the B encoding should be used. In Perl, the MIME::Base64 module handles Base64 (B) encoding and decoding, and the MIME::QuotedPrint module handles Quoted-Printable (Q).
Only a subset of the printable ASCII characters may be used in encoded-text. Space and tab characters are not allowed, so that the beginning and end of an encoded-word are obvious. The ? character is used within an encoded-word to separate the various portions of the encoded-word from one another and thus cannot appear in the encoded-text portion. Other characters are also illegal in certain contexts. For example, an encoded-word in a "phrase" preceding an address in a From header field may not contain any of the "specials" defined in RFC 822. Finally, certain other characters are disallowed in some contexts to ensure reliability for messages that pass through internetwork mail gateways.
The B encoding automatically meets these requirements. The Q encoding allows a wide range of printable characters to be used in non-critical locations in the message header (such as Subject), with fewer characters available for use in other locations.
B Base64 Encoding
The B encoding is identical to the Base64 encoding defined by RFC 1521.
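As a minimal sketch of how a B-encoded word could be assembled with MIME::Base64, the example below wraps Base64 output in the =?charset?B?encoded-text?= form, with ? separating the portions as described earlier. The helper name b_encoded_word and the sample string are assumptions made for illustration. The sketch performs no charset conversion and does not split long input across several encoded-words.

```perl
use strict;
use warnings;
use MIME::Base64;

# Hypothetical helper (not part of MIME::Base64): wrap a byte string
# as a single B-encoded word labelled with the given charset.
sub b_encoded_word {
    my ($charset, $bytes) = @_;
    # The second argument to encode_base64 is the line terminator.
    # Passing '' suppresses the trailing newline and the line wrapping,
    # so no whitespace ends up inside the encoded-word.
    return "=?$charset?B?" . encode_base64($bytes, '') . "?=";
}

# 0xE9 is e-acute in ISO-8859-1.
print b_encoded_word('ISO-8859-1', "Ol\xE9"), "\n";   # prints =?ISO-8859-1?B?T2zp?=
```

Because the Base64 alphabet contains no spaces, tabs, or ? characters, the encoded-text produced this way stays within the restrictions listed above.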
Q Quoted-Printable Encoding
The Q encoding is similar to the Quoted-Printable content-transfer-encoding defined in RFC 1521. It is designed to allow text containing mostly ASCII characters to be decipherable on an ASCII terminal without decoding.
For more information on RFC 1521 and specific information about the rules used to encode data, go to:
Data encoded in the "Q" or Quoted-Printable method follows these basic rules:
- Any 8-bit value may be represented by a = followed by two hexadecimal digits. For example, if the character set in use were ISO-8859-1, the = character would thus be encoded as =3D, and a SPACE as =20. (Uppercase should be used for hexadecimal digits A through F.)
- The 8-bit hexadecimal value 20 (for example, ISO-8859-1 SPACE) may be represented as _ (underscore, ASCII 95). (This character may not pass through some internetwork mail gateways, but its use will greatly enhance readability of Q-encoded data with mail readers that do not support this encoding.) Note that the _ always represents hexadecimal 20, even if the SPACE character occupies a different code position in the character set in use.
- 8-bit values that correspond to printable ASCII characters other than =, ?, _ (underscore), and SPACE may be represented as those characters.
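The rules above can be sketched in a few lines of Perl. The function name q_encode_text is hypothetical, and the input is assumed to be a string of single bytes already in the target character set. The characters =, ? and _ are always hex-encoded here because each has a special meaning inside an encoded-word. This is the most permissive form of the encoding; as noted earlier, some header locations allow fewer literal characters.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the three Q encoding rules to a byte string.
sub q_encode_text {
    my ($bytes) = @_;
    my $out = '';
    for my $ch (split //, $bytes) {
        my $code = ord $ch;
        if ($code == 0x20) {
            $out .= '_';                      # SPACE becomes underscore
        }
        elsif ($code > 0x20 && $code < 0x7F && $ch !~ /[=?_]/) {
            $out .= $ch;                      # other printable ASCII passes through
        }
        else {
            $out .= sprintf '=%02X', $code;   # "=" plus two uppercase hex digits
        }
    }
    return $out;
}

# 0xE9 is e-acute in ISO-8859-1.
print q_encode_text("a=b caf\xE9?"), "\n";   # prints a=3Db_caf=E9=3F
```

Most of the output remains readable without decoding, which is the stated design goal of the Q encoding.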
Encoding and Decoding MIME with libwww
Mechanisms for encoding and decoding MIME data are provided in the MIME::Base64 and MIME::QuotedPrint modules.
MIME::Base64 includes two functions: encode_base64() and decode_base64(). To use MIME::Base64 in your script, include the following line near the beginning of your script:
use MIME::Base64;
Once the module is loaded, encoding and decoding MIME strings is quite simple. Encoding is handled by passing a string of unencoded text (which can be stored in a variable) to the encode_base64 routine. Here's how it's done:
$MyEncodedMime = encode_base64($MyPlainText);
$MyEncodedMime will contain the Base64 encoded version of $MyPlainText.
Decoding is handled in the same way, using decode_base64.
Here's an example:
$MyDecodedText = decode_base64($MyEncodedMime);
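Putting the pieces together, the following sketch performs a complete round trip with both modules. The sample text and the $MyEncodedQP and $MyDecodedQP variable names are assumptions made for illustration; encode_qp and decode_qp are the corresponding functions exported by MIME::QuotedPrint.

```perl
use strict;
use warnings;
use MIME::Base64;
use MIME::QuotedPrint;

# Sample input; 0xE9 is e-acute in ISO-8859-1.
my $MyPlainText = "Caf\xE9 menu = 2 items";

# Base64 round trip. By default encode_base64 wraps its output at
# 76 characters and ends it with a newline.
my $MyEncodedMime = encode_base64($MyPlainText);
my $MyDecodedText = decode_base64($MyEncodedMime);
print "Base64: $MyEncodedMime";
print "Base64 round trip ok\n" if $MyDecodedText eq $MyPlainText;

# Quoted-Printable round trip using the same calling pattern.
my $MyEncodedQP = encode_qp($MyPlainText);
my $MyDecodedQP = decode_qp($MyEncodedQP);
print "Quoted-Printable round trip ok\n" if $MyDecodedQP eq $MyPlainText;
```

Note that Perl variable names are case-sensitive, so the name passed to decode_base64 must match the one assigned by encode_base64 exactly.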
Many popular e-mail clients, such as Netscape Mail, have standardized on Base64 encoding to attach binary files to ASCII text e-mail messages. Let's take a look at an example of an e-mail message generated by Netscape Mail that has Base64-encoded data as an attachment: