Even more compact encoding of the leap seconds list
« previous entry | next entry »
11th Jan 2017 | 20:54
Following on from my recent item about the leap seconds list I have come up with a better binary encoding which is half the size of my previous attempt. (Compressing it with deflate no longer helps.)
Here's the new binary version of the leap second list, 15 bytes displayed as hexadecimal in network byte order. I have not updated it to reflect the latest Bulletin C which was promulgated this week, because the official leap second lists have not yet been updated.
This is a string of 4-bit nybbles, upper nybble before lower nybble of each byte.
If the value of this nybble V is less than 0x8, it is treated as a pair of nybbles 0x9V. This abbreviates the common case.
Otherwise, we consider this nybble Q and the following nybble V as a pair 0xQV.
Q contains four flags,
+-----+-----+-----+-----+ | W | M | N | P | +-----+-----+-----+-----+
W is for width:
- W == 1 indicates a QV pair in the string.
- W == 0 indicates this is a bare V nybble.
M is the month multiplier:
- M == 1 indicates the nybble V is multiplied by 1
- M == 0 indicates the nybble V is multiplied by 6
NP together are NTP-compatible leap indicator bits:
- NP == 00 indicates no leap second
- NP == 01 == 1 indicates a positive leap second
- NP == 10 == 2 indicates a negative leap second
- NP == 11 == 3 indicates an unknown leap second
The latter is equivalent to the ? terminating the text version.
The event described by NP occurs a number of months after the previous event, given by
(M ? 1 : 6) * (V + 1).
That is, a 6 month gap can be encoded as M=1, V=5 or as M=0, V=0.
The "no leap second" option comes into play when the gap between leap seconds is too large to fit in 4 bits. In this situation you encode a number of "no leap second" gaps until the remaining gap fits.
The recommended way to break up long gaps is as follows. Gaps up to 16 months can be encoded in one QV pair. Gaps that are a multiple of 6 months long should be encoded as a number of 16*6 month gaps, followed by the remainder. Other gaps should be rounded down to a whole number of years and encoded as a X*6 month gap, which is followed by a gap for the remaining few months.
To align the list to a whole number of bytes, add a redundant 9 nybble to turn a bare V nybble into a QV pair.
In the current leap second list, every gap is encoded as a single V nybble, except for the 84 month gap which is encoded as QV = 0x9D, and the last 5 months encoded as QV = 0xF4.
Once the leap second lists have been updated, the latest Bulletin C will change the final nybble from 4 to A.
The code now includes decoding as well as encoding functions, and a little fuzz tester to ensure they are consistent with each other. http://dotat.at/cgi/git/leapseconds.git.