The X-Face header for e-mail and net-news messages specifies a 48×48 bi-level icon associated with the message’s sender. The icon’s foreground colour is usually displayed as black and its background colour is usually displayed as white or transparent. This document has been created by reverse-engineering the code of compface.
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL
NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”,
“MAY”, and “OPTIONAL” in this document are to be interpreted as
described in BCP 14 when, and only when, they
appear in all capitals, as shown here.
Overview of the encoder and decoder
The encoder converts a 48×48 bi-level image into an X-Face header. The decoder converts an X-Face header into the corresponding 48×48 image.
There are three steps used in the encoding process:
- A prediction algorithm predicts the value of each pixel. The predictions are logically XORed with the pixels’ actual values, producing a 48×48 prediction correctness bitmap, hereafter shortened to “PCBitmap”.
- The PCBitmap is entropy-encoded into an integer.
- The integer is written in base-94 and placed into an RFC 5322 header with the name
X-Face
.
The decoder reverses the encoding steps:
- The contents of the header are extracted and interpreted as a base-94 integer.
- The integer is entropy-decoded into the PCBitmap.
- The same prediction algorithm predicts the value of each pixel, and the predictions are logically XORed with the PCBitmap, producing the original image.
The prediction algorithm
For the purposes of the prediction algorithm:
- Background-colour pixels have a value of 0 and foreground-colour pixels a value of 1.
- Each pixel in the image has an X position and a Y position. The X position ranges from 0 (left) to 47 (right), and the Y position from 0 (top) to 47 (bottom).
- Each pixel has an index. The index of a pixel is 48Y+X.
- P(n) means the value of the pixel with the index n less than the current pixel’s.
- a<<b is equivalent to a×2ᵇ.
- | is the bitwise logical OR operator.
- The << operator takes precedence over the | operator.
Given a full or partial 48×48 image, the prediction algorithm returns a prediction based on the X, Y, index, and the values of pixels with lower indices, using the following expressions and conditions.
- Index≤49
- X=0, Y≥3
- 0
- 50≤Index≤94
- P(1)
- Index=95
- Logical AND of P(1) and P(2)
- Index=96
- P(47)
- Index=97
- The (P(48)<<2 | P(47)<<1 | P(46))th zero-indexed digit of the following sequence:
00010111
- Index=98
- The (P(49)<<4 | P(1)<<3 | P(48)<<2 | P(47)<<1 | P(46))th zero-indexed digit of the following sequence:
00000001000100110000001101111111
- 99≤Index≤142
- The (P(50)<<6 | P(2)<<5 | P(49)<<4 | P(1)<<3 | P(48)<<2 | P(47)<<1 | P(46))th zero-indexed digit of the following sequence:
00110111011100110000000000011001010101110111111111110101111110110111000000110011111100001111100101111111111111111111111111111111
- Index=143
- The (P(50)<<5 | P(2)<<4 | P(49)<<3 | P(1)<<2 | P(48)<<1 | P(47))th zero-indexed digit of the following sequence:
0000000100000001000000010001111100000011000111110011111111111111
- X=1, Y≥3
- The (P(96)<<5 | P(48)<<4 | P(95)<<3 | P(47)<<2 | P(94)<<1 | P(46))th zero-indexed digit of the following sequence:
0000010000000000000000010000000101000011001011101111111100111111
- X=2, Y≥3
- The (P(97)<<8 | P(49)<<7 | P(1)<<6 | P(96)<<5 | P(48)<<4 | P(95)<<3 | P(47)<<2 | P(94)<<1 | P(46))th zero-indexed digit of the following sequence:
00000000000000000000000000000000010100000000000011110011010111111000010000000100000101111001111100000100001000110000010111111111000000000000000000000000000000100000001100000011001100111101011100000101000000110101111100111111000101110011001111111111111111110000000010000000000000100000010000010010000000000001000101010111000001010010010100000101000000110011010110111111100111111111111100000111011011110010000001000000000101110000011011111010111010000000000100000111000111111001111100011111111111111111111111111111
- 3≤X≤46, Y≥3
- The (P(98)<<11 | P(50)<<10 | P(2)<<9 | P(97)<<8 | P(49)<<7 | P(1)<<6 | P(96)<<5 | P(48)<<4 | P(95)<<3 | P(47)<<2 | P(94)<<1 | P(46))th zero-indexed digit of the following sequence:
0000000000000000000000010000000100000000000000001110001111011111000001010001011100000101000011110000000000011011000011111101111100000000000001000000000000000000000011010000111100000011011111110000000000000000000000000000000100000000000111010100010100101111000000000000000000000000000011010000000000001010111111111111111100000000000001000000000000000101000000010011111111001111111111110001000000000001100000001100100100001111000011111111111111111111000000000000000000000000000000000001101100011111111111111111111101001111010101000000011100011111010101110100011111010111001111011111111111111111010111110001111101111111111111110111111101111111000001010000111100000001000011110000111101011111100110111101111101111111111111110101111100011101010111111111111100001111000111110000111101011111000000110001111101001111010111111111011101111111011111111111111100001101000011111111101111111111111101111011111100001111010011111101011100111111010011110111111111111111111111110110011110111111010101100010010100011111011111111001111111111111000000000000000000000000000001010101111101111111000000011101111100010100000000000000010100001111000001111010001000001001000011110000000000000000000000000000000000001111010111110001100011010111100101000111000100000000000001010001111110110111000011000000011100001111000011110000000000001111000011110001111110000100100011110000010100010101000001010000111101001111111111111000011111011111000001010000000100010000000000000000111100001111000000000000100000000101000001000000010000000001010011111111111110011111100011110100101001000000010111110101111111111111111111101101111111111111011111111111011111111111011111111111111111111111011110111111111100001111111111011101011101011111010011110111111101111111110111111111111111111111111111111111111111111111011101111101111101111111010011111110111111111111111111110111011111111111111111111111111101101111111111110000111101001111111111111111111110011101111111110000111111101111111111111101111101101111111111111111111111111111010011111111111111001101000011110100111111111111111111111101111100000000000000000000000000001011000001010000001000000010000011110000010000000000000000000000110000000001000001100000000000001111001000000000001100000000000000000000010100001111010000000000100000000000000000000000000000000001000000000000000100001100000011110000000100000000100000000000000000000000000000001000000000000000000000000001010000000001000001010000000100010101101011110000111100000000000000010001000000000000000010000000000001000110000011000010000000000000100010000000000000001111000101011111111111011111000000100000000000000000000011110111111101011111110110111111111101001111001111100000010100001111011111111111011110010101010011110000110100001111000000010000111101001111010111111001111111011111001001010000111000001101000011010100111101111111100011110000111100001111111110100000010001001111010011111111111111110111011101110100011111101101000001010000111111111111111111111101111111111111010011110110111111011000010111110000111101111111110111110101111100000111000011111001010000001101000111111111111111111111111111110000000000000010000000000000001101000110010101110000000100001101000000010000100000000001000011110100011101101100000011010000111100000010000000000000000000000000000010110100111100000000000010000000010100000000100101010000000100001111011111110000110000001111000000010000111000000000000000000000111101000001000000000000000000000100001001000000110100001111000011110111111111001111110111110000000000000000000000000000000000000100010000000000000000000000000001100010011011001111000001011100111101111111110111111101111100000000000000000001011101011111111111111111110111111111111111110100011000001001010011110101111101111111111111011101111111111111000010101000100010100111011111110111111111111111111111111111111100001111000001001101111101111111010011111111111110011111111111110000111011100110110111111111111101111111111111111111111111111111000011111110110010001111010011110111111111111111110111111111111100001111110011111101111111111111011011110111111111111111111111110000001100001100100111010000111101111111111111111111111111111111
- X=47, Y≥3
- The (P(98)<<9 | P(50)<<8 | P(2)<<7 | P(97)<<6 | P(49)<<5 | P(1)<<4 | P(96)<<3 | P(48)<<2 | P(95)<<1 | P(47))th zero-indexed digit of the following sequence:
0000000000001111000000000000100100000000000011010000000000001101000000000000111100000000010011101110010000001101000100000000111100000000000011110100010001001111000000000001111000001111000011111010111010101111010001010111111111101111111111110000111111111111000000000000100100000001000100010000000000000001000111001101110100000000000101010000000011111111000000000001000000000000111111010000000000001111010011110101111100111101111111111111111111111111010011111111111100011100111111111101111111111111100011111111111100000000000011010000000000000000000000000001010100000001000001110000000000000001000000100001111100000001000100010000010101111111000000000001111101000001010101110001111111111111000001010111011100001101010111110100110111111111010011111111111100001111111111110000000000000000000000100000010100000000000100010000010101111101000100000001010100101111111111110100000001010000000011011111110100000100000011110000011100011111000001110111111100001111101111110000110101111111000011111111111101001101011111010000111111111111
The entropy coder
For the purposes of the entropy coder:
- The PCBitmap is conceptually divided into a grid of 2×2 areas.
- The PCBitmap is also subdivided into cells several times. At level 1 of the subdivision, the PCBitmap is divided into a grid of 9 cells. At levels 2, 3, and 4, each cell at the previous level is divided into a grid of 4 sub-cells. The cells at level 4 each contain one 2×2 area. The PCBitmap itself is the super-cell to the level 1 cells.
- The cells at each level have an order. At level 1, the order is: top left, top centre, top right, centre left, centre, centre right, bottom left, bottom centre, bottom right. At levels 2, 3, and 4, the order of the 4 cells within their super-cell is: top left, top right, bottom left, bottom right.
- During the integer-PCBitmap conversion processes as described in this document, there is a variable called the current cell. It refers to one of the several cells just described.
The encoder entropy-encodes one range [rstart, rstart+rlen) into an existing entropy-coded integer i, changing it into entropy-coded integer i′, using this formula:
The decoder entropy-decodes one range [rstart, rstart+rlen) from entropy-encoded integer i′, changing it into entropy-encoded integer i, by first using the property that rstart ≤ i′ mod 256 < rstart+rlen to look up rstart and rlen, then using this formula to find i:
Two tables of ranges are used in the conversion process: the cell state table and the 2×2 area state table.
The cell state table
Subdivision level
| Other
| All contained 2×2 areas are fully correct
| All contained 2×2 areas are fully or partially incorrect
|
Level 1
| [0, 251)
| [251, 255)
| [255, 256)
|
Level 2
| [0, 200)
| [200, 255)
| [255, 256)
|
Level 3
| [0, 159)
| [159, 223)
| [223, 256)
|
Level 4
| N/A
| [131, 256)
| [0, 131)
|
The 2×2 area state table
Top left
| Top right
| Bottom left
| Bottom right
| Range
|
Incorrect
| Correct
| Correct
| Correct
| [0, 38)
|
Correct
| Incorrect
| Correct
| Correct
| [38, 76)
|
Correct
| Correct
| Incorrect
| Correct
| [76, 114)
|
Correct
| Correct
| Correct
| Incorrect
| [114, 152)
|
Incorrect
| Incorrect
| Correct
| Correct
| [152, 165)
|
Incorrect
| Correct
| Incorrect
| Correct
| [165, 178)
|
Correct
| Incorrect
| Incorrect
| Correct
| [178, 191)
|
Incorrect
| Correct
| Correct
| Incorrect
| [191, 204)
|
Correct
| Incorrect
| Correct
| Incorrect
| [204, 217)
|
Correct
| Correct
| Incorrect
| Incorrect
| [217, 230)
|
Incorrect
| Incorrect
| Incorrect
| Correct
| [230, 236)
|
Incorrect
| Incorrect
| Correct
| Incorrect
| [236, 242)
|
Incorrect
| Correct
| Incorrect
| Incorrect
| [242, 248)
|
Correct
| Incorrect
| Incorrect
| Incorrect
| [248, 253)
|
Incorrect
| Incorrect
| Incorrect
| Incorrect
| [253, 256)
|
The decoder converts an entropy-encoded integer into a PCBitmap by the following process:
- The current cell is initially the first cell of subdivision level 1.
-
Entropy-decode and interpret one range according to the cell state table at the current cell’s subdivision level.
- If “All contained 2×2 areas are fully correct”:
- Set every pixel in the current cell to be correct.
- Otherwise, if “All contained 2×2 areas are fully or partially incorrect”:
- For each 2×2 area within the current cell, in an order following a depth-first traversal, entropy-decode one range according to the 2×2 area state table, and set the 2×2 area’s pixels as indicated by the interpreted range.
- Otherwise:
- Change the current cell to the first sub-cell of the current cell.
Go to the beginning of step 2.
-
- If the current cell is not the last sub-cell of its super-cell:
- Change the current cell to the next cell at the same subdivision level.
Go to the beginning of step 2.
- Otherwise, if the current cell’s subdivision level is not level 1:
- Change the current cell to the super-cell of the current cell.
Go to the beginning of step 3.
- Otherwise:
- The process is complete.
The encoder converts a PCBitmap into an entropy-encoded integer by the following process:
- The current cell is initially the last cell of subdivision level 1.
The entropy-encoded integer is initially zero.
-
Determine the state of every 2×2 area within the current cell.
- If all 2×2 areas within the current cell are fully correct:
- Entropy-encode one range meaning “All contained 2×2 areas are fully correct” at the current cell’s subdivision level in the cell state table.
- Otherwise, if all 2×2 areas within the current cell are fully or partially incorrect:
- For each 2×2 area within the current cell, in an order following a depth-first traversal in reverse, entropy-encode one range corresponding to the state of the 2×2 area in the 2×2 area state table.
Entropy-encode one range meaning “All contained 2×2 areas are fully or partially incorrect” at the current cell’s subdivision level in the cell state table.
- Otherwise:
- Change the current cell to the last sub-cell of the current cell.
Go to the beginning of step 2.
-
- If the current cell is not the first sub-cell of its super-cell:
- Change the current cell to the previous cell at the same subdivision level.
Go to the beginning of step 2.
- Otherwise, if the current cell’s subdivision level is not level 1:
- Change the current cell to the super-cell of the current cell.
Entropy-encode one range meaning “Other” at the current cell’s subdivision level in the cell state table.
Go to the beginning of step 3.
- Otherwise:
- The process is complete.
The base-94 interpretation
The base-94 digits used are the printable ASCII characters (!
through ~
), in ASCII order. All other characters are ignored by the decoder. The encoder SHOULD insert whitespace to fold the header in accordance with RFC 5322’s line length limits.
Because there exist X-Face implementations written without RFC 2047 in mind, an encoder SHOULD avoid creating an X-Face header that can be interpreted as containing an RFC 2047 <encoded-word>.
Legitimate <encoded-word>s can be avoided by using only ASCII characters, and unintentional <encoded-word>s can be avoided by inserting whitespace in the two-character string that delimits the start or end of what would otherwise be an <encoded-word>.
The decoder MAY ignore the base-94 digits after the first 666 base-94 digits of the header body. The decoder MAY ignore any characters after the first 2048 bytes of the header body.