XHEX encoding
Chombit uses a proprietary scheme called expanded hexadecimal (XHEX) to compress 32-bit integers such as $d000_0000 into a single byte, if almost all of their digits are 0 or f. Base addresses and bitmasks are typically encodable using XHEX.
Consider this assembled code:
$c0_0000: fc 00 a0 00 00 push int $00a0_0000
$c0_0005: 32 01 pop i:4
$c0_0007: 43 01 5a move i:4, xhex $00a0_0000
$c0_000a: 43 02 b8 move i:8, xhex $ffff_8fff
Normally, assigning $00a0_0000 to i:4 requires 7 bytes of assembly language, as a combination of push and pop instructions seen above. We can't do move i:4, $00a0_0000 because that would require 6 bytes (one for the move opcode, one to indicate i:4, and four bytes to represent $00a0_0000). The Chombit hardware does not permit instructions longer than 5 bytes.
But in this case, we are allowed to use move i:4, xhex $00a0_0000. This form only requires 3 bytes—a significant reduction in size. XHEX compresses the number $00a0_0000 into a single byte $5a.
Encoding scheme
How does it work? A 32-bit number can be represented as XHEX in two circumstances:
- Duplicated 0s: if seven of the eight hex digits have the value
0(for example$00a0_0000); or - Duplicated Fs: if seven of the eight hex digits have the value
f(for example$ffff_8fff)
The interesting digit becomes the right nibble of the encoded byte. In our example $00a0_0000, the interesting digit is a.
The left nibble indicates the position of the interesting digit. There are five 0s after a in $00a0_0000, so the left nibble is 5. Thus, $00a0_0000 becomes $5a.
What about $ffff_8fff? With duplicated 0s, $0000_8000 would be encoded as $38. To indicate duplicated Fs, we set the high bit by adding $80. Thus, $38 + $80 gives $b8 as the encoding for $ffff_8fff.
Notes
Here are some more examples:
xhex $c000_0000=$7cxhex $0000_0006=$06xhex $0030_0000=$53xhex $ff3f_ffff=$d3xhex $0000_0000=$00xhex $ffff_ffff=$8f
Some numbers have more than one representation. For example, $8f and $ff both decode to $ffff_ffff.