XHEX encoding
Chombit uses a proprietary scheme called expanded hexadecimal (XHEX) to compress certain 32-bit integers, such as $D000_0000, into a single byte. A value qualifies if at least seven of its eight hex digits are 0, or at least seven are F. Base addresses and bitmasks are typically encodable using XHEX.
Consider this assembled code:
$C0_0000: FC 00 A0 00 00 PUSH INT $00A0_0000
$C0_0005: 32 01 POP I:4
$C0_0007: 43 01 5A MOVE I:4, XHEX $00A0_0000
$C0_000A: 43 02 B8 MOVE I:8, XHEX $FFFF_8FFF
Normally, assigning $00A0_0000 to I:4 takes 7 bytes of machine code: a 5-byte PUSH followed by a 2-byte POP, as seen above. We can't use MOVE I:4, $00A0_0000 because that would require 6 bytes (one for the MOVE opcode, one to indicate I:4, and four bytes to represent $00A0_0000), and the Chombit hardware does not permit instructions longer than 5 bytes.
But in this case, we are allowed to use MOVE I:4, XHEX $00A0_0000. This form requires only 3 bytes, less than half the size of the PUSH and POP pair, because XHEX compresses the number $00A0_0000 into the single byte $5A.
Encoding scheme
How does it work? A 32-bit number can be represented as XHEX in two circumstances:
- Duplicated 0s: if seven of the eight hex digits have the value 0 (for example $00A0_0000); or
- Duplicated Fs: if seven of the eight hex digits have the value F (for example $FFFF_8FFF)
The interesting digit becomes the right nibble of the encoded byte. In our example $00A0_0000, the interesting digit is A.
The left nibble gives the position of the interesting digit, counted as the number of hex digits to its right. There are five 0s after A in $00A0_0000, so the left nibble is 5. Thus, $00A0_0000 becomes $5A.
What about $FFFF_8FFF? With duplicated 0s, $0000_8000 would be encoded as $38. To indicate duplicated Fs, we set the high bit by adding $80. Thus, $38 + $80 gives $B8 as the encoding for $FFFF_8FFF.
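The encoding rules above can be sketched in Python. This is only an illustration inferred from the description on this page, not the Chombit assembler's actual code; the function name xhex_encode is hypothetical. When a value has no interesting digit (all 0s or all Fs), the sketch assumes position 0 is chosen.

```python
def xhex_encode(value: int) -> int:
    """Return a one-byte XHEX encoding of a 32-bit value.

    Raises ValueError if the value is not encodable.
    Hypothetical helper: the name and tie-breaking rule are assumptions.
    """
    if not 0 <= value <= 0xFFFF_FFFF:
        raise ValueError("value must fit in 32 bits")

    # Try duplicated 0s first (flag $00), then duplicated Fs (flag $80).
    for background, flag in ((0x0000_0000, 0x00), (0xFFFF_FFFF, 0x80)):
        # A nonzero nibble in diff marks a digit that differs from the fill.
        diff = value ^ background
        positions = [p for p in range(8) if (diff >> (4 * p)) & 0xF]
        if len(positions) <= 1:
            # Position = number of hex digits to the right of the digit.
            # Assumption: use position 0 when every digit equals the fill.
            pos = positions[0] if positions else 0
            digit = (value >> (4 * pos)) & 0xF
            return flag | (pos << 4) | digit

    raise ValueError(f"${value:08X} is not XHEX-encodable")


print(f"${xhex_encode(0x00A0_0000):02X}")  # $5A
print(f"${xhex_encode(0xFFFF_8FFF):02X}")  # $B8
```

A value such as $1234_5678 has several digits that differ from both fills, so the sketch raises ValueError for it; a real assembler would presumably fall back to the PUSH and POP sequence in that case.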
Notes
Here are some more examples:
XHEX $C000_0000 = $7C
XHEX $0000_0006 = $06
XHEX $0030_0000 = $53
XHEX $FF3F_FFFF = $D3
XHEX $0000_0000 = $00
XHEX $FFFF_FFFF = $8F
Some numbers have more than one representation. For example, $8F and $FF both decode to $FFFF_FFFF, because when every digit is F, any position can serve as the interesting digit.
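Decoding can be sketched the same way. The Python below is an illustration inferred from the byte layout described on this page (high bit selects the F fill, the next three bits give the position, the right nibble is the digit); the function name xhex_decode is hypothetical and this is not taken from the Chombit hardware.

```python
def xhex_decode(byte: int) -> int:
    """Expand a one-byte XHEX code into a 32-bit value.

    Hypothetical helper, inferred from the layout described in the text.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("XHEX code must fit in one byte")

    digit = byte & 0x0F           # right nibble: the interesting digit
    pos = (byte >> 4) & 0x07      # low 3 bits of the left nibble: position
    fill_f = bool(byte & 0x80)    # high bit: duplicated Fs instead of 0s

    background = 0xFFFF_FFFF if fill_f else 0x0000_0000
    shift = 4 * pos
    # Clear the target nibble in the background, then drop the digit in.
    cleared = background & ~(0xF << shift) & 0xFFFF_FFFF
    return cleared | (digit << shift)


# List every code that expands to $FFFF_FFFF.
aliases = [b for b in range(0x100) if xhex_decode(b) == 0xFFFF_FFFF]
print(" ".join(f"${b:02X}" for b in aliases))
```

Under this sketch, xhex_decode(0x8F) and xhex_decode(0xFF) both return 0xFFFF_FFFF, as do the six codes between them that end in F. By the same reasoning, $00, $10, and so on up to $70 all expand to $0000_0000.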