File notes/key_encoding.txt artifact 88669ce771 on branch trunk


This note describes how record keys are encoded.  The encoding is designed
such that memcmp() can be used to sort the keys into their proper order.

A key consists of a table number followed by a list of one or more SQL 
values.  Each SQL value in the list has one of the following types:  NULL, 
numeric, text, or binary.  Keys are compared value by value, from left to 
right, until a difference if found. The first difference determines the
key order.

The table number is a varint that identifies the table to which the key
belongs.  Table numbers always sort in ASCENDING order.

Each SQL value has a sort-order which is either ASCENDING or DESCENDING.  The
default and the usual case is ASCENDING.

To generate the encoding, the SQL values of the key are visited from left
to right.  Each SQL value generates one or more bytes that are appended
to the encoding.  If the SQL value is DESCENDING, then its encoding bytes
are inverted (ones complement) prior to being appended. The complete key 
encoding is the concatenation of the individual SQL value encodings, in 
the same order as the SQL values.

Two key encodings are only compariable if they have the same number of SQL
values and if corresponding SQL values have the same sort-order.

The first byte of a key past the table number will be in the range of
0x05..0x0f if ascending or 0xf0..0xfa if descending.  This leaves large
chunks of key space available for other uses.  For example, the three-byte
key 0x00 0x00 0x01 stores the schema cookie for the database as a 64-bit
big-endian integer.

NULL ENCODING

Each SQL value that is a NULL encodes as a single byte of 0x05.  Since
every other SQL value encoding begins with a byte greater than 0x05, this
forces NULL values to sort first.

TEXT ENCODING

Each SQL value that is TEXT begins with a single byte of 0x24 and ends
with a single byte of 0x00.  There are zero or more intervening bytes that
encode the text value.  The intervening bytes are chosen so that the
encoding will sort in the desired collating order.  The default sequence
of bytes is simply UTF8.  The intervening bytes may not contain a 0x00
character; the only 0x00 byte allowed in a text encoding is the final
byte.

Note that all key-encoded text with the BINARY collating sequence is simply
UTF8 text.  UTF8 not UTF16.  Strings must be converted to UTF8 so that
equivalent strings in different encodings compare the same and so that
the strings contain no embedded 0x00 bytes.  If a collating function is
used, it needs to work like ucol_getSortKey() in the ICU library.  In
other words, strcmp() should be sufficient for comparing two text keys.

The text encoding ends in 0x00 in order to ensure that when there are
two strings where one is a prefix of the other that the shorter string
will sort first.

BINARY ENCODING

Each SQL value that is BINARY begins with a single byte of 0x25 and
ends with a single byte of 0x00.  There are zero or more intervening bytes
that encode the binary value.  None of the intervening bytes may be zero.
Each of the intervening bytes contains 7 bits of blob content with a 1 in
the high-order bit (the 0x80 bit).  The final byte before the 0x00 contains
any left-over bits of the blob content.

The initial byte of a binary value, 0x0f, is larger than the initial
byte of a text value, 0x0e, ensuring that every binary value will sort
after every text value.

NUMERIC ENCODING

Numeric SQL values must be coded so as to sort in numeric order.  We assume
that numeric SQL values can be both integer and floating point values.

Simpliest cases first:  If the numeric value is a NaN, then the encoding
is a single byte of 0x06.  This causes NaN values to sort prior to every other
numeric value.  The only value that is less than a NaN is a NULL.

If the numeric value is a negative infinity then the encoding is a single
byte of 0x07.  Since every other numeric value except NaN has a larger 
initial byte, this encoding ensures that negative infinity will sort prior
to every other numeric value other than NaN.

If the numeric value is a positive infinity then the encoding is a single
byte of 0x23.  Every other numeric value encoding begins with a smaller
byte, ensuring that positive infinity always sorts last among numeric
values.  0x0d is also smaller than 0x0e, the initial byte of a text value,
ensuring that every numeric value sorts before every text value.

If the numeric value is exactly zero then then is encoded is a single
byte of 0x15.  Finite negative values will have initial bytes of 0x08
through 0x14 and finite positive values will have initial bytes of 0x16
through 0x22.

For all values, we compute a mantissa M and an exponent E.  The mantissa
is a base-100 representation of the value.  The exponent E determines
where to put the decimal point.

Each centimal digit of the mantissa is stored in a byte.  If the
value of the centimal digit is X (hence X>=0 and X<=99) then the
byte value will be 2*X+1 for every byte of the mantissa, except
for the last byte which will be 2*X+0.  The mantissa must be the
minimum number of bytes necessary to represent the value; trailing
X==0 digits are omitted.  This means that the mantissa will never
contain a byte with the value 0x00.

If we assume all digits of the mantissa occur to the right of the
decimal point, then the exponent E is the power of one hundred
by which one must multiply the mantissa to recover the original 
value.

Examples:

   Value               Exponent E    Significand M (in hex)
  --------             ----------    ----------------------
    1.0                    1          02
    10.0                   1          14
    99.0                   1          b4
    99.01                  1          b5 02
    99.0001                1          b5 01 02
    100.0                  2          02
    100.1                  2          03 02
    100.01                 2          03 01 02
    1234                   2          19 44
    9999                   2          c7 c6
    9999.000001            2          c7 c7 01 01 02
    9999.000009            2          c7 c7 01 01 12
    9999.00001             2          c7 c7 01 01 14
    9999.00009             2          c7 c7 01 01 b4
    9999.000099            2          c7 c7 01 01 c6
    9999.0001              2          c7 c7 01 02
    9999.001               2          c7 c7 01 14
    9999.01                2          c7 c7 02
    9999.1                 2          c7 c7 14
    10000                  3          02
    10001                  3          03 01 02
    12345                  3          03 2f 5a
    123450                 4          19 45 64
    1234.5                 3          19 45 64 
    12.345                 2          19 45 64
    0.123                  0          19 3c
    0.0123                 0          03 2e
    0.00123               -1          19 3c
    9223372036854775807   10          13 2d 43 91 07 89 6d 9b 75 0e

Values are classified as large, medium, or small according to the value
of E.  If E is 11 or more, the value is large.  For E between 0 and 10,
the value is medium.  For E less than zero, the value is small.

Large positive values are encoded as a single byte 0x22 followed by
E as a varint and then M.  Medium positive values are a single byte of
0x17+E followed by M.  Small positive values are encoded as a single
byte 0x16 followed by the ones-complement of the varint for -E followed
by M.

Small negative values are encoded as a single byte 0x14 followed by
-E as a varint and then the ones-complement of M.  Medium negative
values are encoded as a byte 0x13-E followed by the ones-complement of M.
Large negative values consist of the single byte 0x08 followed by the
ones-complement of the varint encoding of E followed by the ones-complement
of M.

SUMMARY

Each SQL value is encoded as one or more bytes.  The first byte of
the encoding, its meaning, and a terse description of the bytes that
follow is given by the following table:

  Content Type         Encoding
  ------------------   -----------------
  NULL                 0x05
  NaN                  0x06
  negative infinity    0x07
  negative large       0x08    ~E   ~M
  negative medium      0x13-E  ~M
  negative small       0x14    -E   ~M
  zero                 0x15
  positive small       0x16    ~-E  M
  positive medium      0x17+E  M
  positive large       0x22    E    M
  positive infinity    0x23
  text                 0x24    T
  binary               0x25    B