Caret types

A caret (^) prefix can be added to a Hybrix type, for example ^IO_DYNAMIC instead of IO_DYNAMIC. The result is called a caret type, and it instructs the garbage collector to ignore variables of that type. A variable that has a caret type is called a caret pointer.

To understand this feature, let's consider how the garbage collector works. Suppose we define a class like this:

# AN ITEM IN A LINKED LIST
CLASS LIST_ITEM
VAR ID: INT
VAR NEXT_ITEM: LIST_ITEM
END CLASS

When your program creates a new item using a statement like NEW LIST_ITEM() -> ITEM, a memory block gets allocated from the garbage collector's heap memory. Later, the garbage collector will trace the heap blocks, following every pointer to discover every reachable object; any unreachable objects ("garbage") can safely be freed ("collected").
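
The tracing step described above can be sketched in a few lines of Python. This is only a toy model of reachability, not Chombit's actual collector: the dict-based heap, the string object names, and the find_reachable helper are all assumptions made for illustration.

```python
# Toy model of the tracing ("mark") phase. The dict-based heap and object
# names are illustrative assumptions, not Chombit's real data structures.

def find_reachable(heap, roots):
    """Return the set of heap keys reachable from the root pointers."""
    reachable = set()
    pending = [r for r in roots if r is not None]
    while pending:
        key = pending.pop()
        if key in reachable:
            continue
        reachable.add(key)
        # Follow every pointer field stored in this object.
        for target in heap[key]:
            if target is not None:
                pending.append(target)
    return reachable

# Three LIST_ITEM-like objects: "a" -> "b" -> None, and nothing refers to "c".
heap = {"a": ["b"], "b": [None], "c": ["a"]}
roots = ["a"]  # e.g. a local variable holding the first item

reachable = find_reachable(heap, roots)
garbage = set(heap) - reachable
print(sorted(reachable))  # ['a', 'b']
print(sorted(garbage))    # ['c']
```

Note that "c" is garbage even though it points at a live object: what matters is whether anything reachable points to it.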

The garbage collector cannot access the compile-time type information for our LIST_ITEM class. Instead, each memory block includes a special 4-byte header. Using this header, the garbage collector can determine that a pointer (NEXT_ITEM) is located at ITEM's address plus 4 bytes (the size of ID). The header is located at ITEM's address minus 4 bytes.
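
To make the address arithmetic concrete, here is a small Python sketch. The real header encoding is not described in this section, so the format used below (low 16 bits hold the object size, high 16 bits hold a bitmask of which 4-byte words are pointers) is purely hypothetical, as are the little-endian byte order and the 4-byte pointer size.

```python
import struct

HEADER_SIZE = 4  # from the text: the header sits 4 bytes before the object

memory = bytearray(64)  # stand-in for heap memory

def allocate(address, size, pointer_mask):
    """Write a header (hypothetical format) and return the object's address."""
    header = (pointer_mask << 16) | size
    struct.pack_into("<I", memory, address, header)
    return address + HEADER_SIZE

def pointer_offsets(object_address):
    """Read the header at object_address - 4 and list pointer field offsets."""
    (header,) = struct.unpack_from("<I", memory, object_address - HEADER_SIZE)
    size = header & 0xFFFF
    mask = header >> 16
    return [offset for offset in range(0, size, 4) if mask & (1 << (offset // 4))]

# LIST_ITEM: ID at offset 0 (4 bytes), NEXT_ITEM at offset 4 (assumed 4 bytes).
item = allocate(address=16, size=8, pointer_mask=0b10)
print(item - HEADER_SIZE)     # 16: where the header lives
print(pointer_offsets(item))  # [4]: NEXT_ITEM is at ITEM's address plus 4
```

Whatever the real encoding is, the point is the same: the collector needs only the object's address to find the header, and only the header to find the pointers.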

Caret types from LOCATED variables

Now consider a variable like KERNEL::CONSOLE_GRID:

MODULE KERNEL
# THE TEXT CONSOLE USES 8 X 8 TILES, WITH ROOM FOR 40 COLS AND 28 ROWS,
# HOWEVER ITS TILEMAP GRID HAS 64 COLS X 32 ROWS = 2048 PAIRS.
# (THERE ARE 24 UNUSED COLUMNS ON THE RIGHT AND 4 UNUSED ROWS ON THE BOTTOM.)
INSET CONSOLE_GRID: ^PAIR[SIZE 2048] LOCATED AT $10_0000 # ..$10_0FFF
. . .
END MODULE

This variable is not part of the heap. It starts at address $10_0000, the beginning of the RAM segment. If the garbage collector looked for a header at address $10_0000 minus 4, it would read from invalid memory and crash the program. It is therefore essential that CONSOLE_GRID never ends up in a regular variable.

The caret type provides this guarantee: the garbage collector will not analyze CONSOLE_GRID, because GPUSH instructions are not emitted for caret types. In addition, the type system prevents a caret pointer from ever being assigned to a regular pointer.

LOCATED outside the RAM segment

You may notice that none of the IO variables have a caret. Here's one example:

MODULE IO
. . .
INSET TILESET_A_ADDRESSES: INT[SIZE 1024] LOCATED AT $D0_1000 # ..$D0_1FFF
. . .
END MODULE

When the garbage collector is tracing the heap blocks, it will automatically ignore any pointer outside the RAM segment, in other words any memory address that does not start with $1. For example, DATA definitions are stored in the ROM segment (whose addresses start with $C), therefore they are never analyzed. As a result, it is generally safe to store those pointers in a regular variable without a caret. The type system considers the LOCATED AT memory address: if it starts with $1 then a caret is required; otherwise a caret is not allowed.
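
The segment test and the resulting LOCATED AT rule can be modeled in Python as follows. This is a sketch under one assumption: addresses are 24 bits wide (six hex digits, as in $10_0000 and $D0_1000), so "starts with $1" means the top hex digit is 1. The function names are hypothetical.

```python
# Assumption: addresses are 24 bits wide, written as six hex digits,
# so "starts with $1" means the top hex digit is 1.

def in_ram_segment(address):
    """True if a 24-bit address has 1 as its leading hex digit."""
    return 0 <= address <= 0xFFFFFF and (address >> 20) == 0x1

def located_at_needs_caret(address):
    """Mirror the LOCATED AT rule: caret required in RAM, not allowed elsewhere."""
    return in_ram_segment(address)

print(in_ram_segment(0x10_0000))          # True: KERNEL::CONSOLE_GRID
print(in_ram_segment(0xD0_1000))          # False: IO::TILESET_A_ADDRESSES
print(in_ram_segment(0x10_0000 - 4))      # False: a header lookup would leave RAM
print(located_at_needs_caret(0xD0_1000))  # False
```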

Important: Suppose your program allocates a tile image as an array and stores its address in IO::TILESET_A_ADDRESSES. Later, if the array variable goes out of scope, the garbage collector may free the array, even though IO::TILESET_A_ADDRESSES is still referencing its memory. The result would be memory corruption, where the tile picture might appear scrambled.

To avoid this, it is the program's responsibility to use variables to keep objects "alive" while in use by IO devices. To make this more obvious, the IO pointers are always represented as INT memory addresses rather than pointer types.

Interior pointers

Caret types can also arise for objects allocated in heap memory. Consider the IO_CHANNEL_EFFECT class:

CLASS IO_DYNAMIC # 5 BYTES
VAR CONTROL: BYTE
VAR LOW: PAIR
VAR HIGH: PAIR
END CLASS

CLASS IO_CHANNEL_EFFECT # SIZE 30
INSET DISTORTION_GAIN: IO_DYNAMIC
INSET DELAY_SAMPLES: IO_DYNAMIC
INSET DELAY_FEEDBACK: IO_DYNAMIC
INSET DRY_LEVEL: IO_DYNAMIC
INSET PREDELAY_LEVEL: IO_DYNAMIC
INSET POSTDELAY_LEVEL: IO_DYNAMIC
END CLASS

As discussed in the previous section, INSET causes each IO_DYNAMIC field to be directly embedded in the IO_CHANNEL_EFFECT object (instead of being pointers to separately allocated objects). Suppose we want to make a function whose parameter is an IO_DYNAMIC:

MODULE MAIN
FUNC RESET_DYNAMIC(DYNAMIC: ^IO_DYNAMIC)
0 -> DYNAMIC.CONTROL
0 -> DYNAMIC.LOW
0 -> DYNAMIC.HIGH
END FUNC

FUNC START()
VAR EFFECT: IO_CHANNEL_EFFECT
NEW IO_CHANNEL_EFFECT() -> EFFECT
MAIN::RESET_DYNAMIC(EFFECT.DELAY_FEEDBACK)
END FUNC
END MODULE

When we call RESET_DYNAMIC(), its DYNAMIC parameter is a pointer into the middle of the memory block for EFFECT, because that is where the IO_DYNAMIC is embedded. This is called an interior pointer. Its address is in heap memory, but it is not the start of a normal heap block. If the garbage collector looked for a memory block header at the address of EFFECT.DELAY_FEEDBACK minus 4 bytes, it would read bytes from DELAY_SAMPLES instead, leading to memory corruption.
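
The offsets involved can be checked with a short Python calculation. It takes the 5-byte size of IO_DYNAMIC from the class comment and assumes the INSET fields are laid out in declaration order with no padding, which is consistent with the stated total size of 30. The layout and field_containing names are illustrative only.

```python
# Field size from the IO_DYNAMIC comment ("5 BYTES"); fields are assumed to
# be laid out in declaration order with no padding, which matches "SIZE 30".
IO_DYNAMIC_SIZE = 5
HEADER_SIZE = 4

fields = ["DISTORTION_GAIN", "DELAY_SAMPLES", "DELAY_FEEDBACK",
          "DRY_LEVEL", "PREDELAY_LEVEL", "POSTDELAY_LEVEL"]

# Map each INSET field to the byte range it occupies inside IO_CHANNEL_EFFECT.
layout = {name: range(i * IO_DYNAMIC_SIZE, (i + 1) * IO_DYNAMIC_SIZE)
          for i, name in enumerate(fields)}

def field_containing(offset):
    """Name of the field that owns a byte offset, or None if outside the object."""
    for name, span in layout.items():
        if offset in span:
            return name
    return None

interior = layout["DELAY_FEEDBACK"].start  # offset of the interior pointer
bogus_header = interior - HEADER_SIZE      # where a header lookup would land
print(interior, bogus_header, field_containing(bogus_header))
# 10 6 DELAY_SAMPLES
```

DELAY_SAMPLES occupies offsets 5 through 9, so a 4-byte read starting at offset 6 lies entirely inside it.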

To avoid this problem, interior pointers always produce caret types. Why are the INSET fields not simply declared as caret types, though? Because a caret type also represents a liveness proof, which must be checked each time a pointer is promoted to a caret type.

Liveness proof

Consider this code:

MODULE MAIN
FUNC RESET_DYNAMIC(DYNAMIC: ^IO_DYNAMIC)
0 -> DYNAMIC.CONTROL
0 -> DYNAMIC.LOW
0 -> DYNAMIC.HIGH
END FUNC

FUNC START()
VAR EFFECT: IO_CHANNEL_EFFECT
NEW IO_CHANNEL_EFFECT() -> EFFECT

VAR DELAY_FEEDBACK: ^IO_DYNAMIC
# ⚠️ ERROR: "IO_DYNAMIC CANNOT BE PROMOTED TO ^IO_DYNAMIC IN THIS CONTEXT"
EFFECT.DELAY_FEEDBACK -> DELAY_FEEDBACK

# (IO_CHANNEL_EFFECT MIGHT GET FREED HERE)
NULL -> EFFECT
ENGINE::INIT()

MAIN::RESET_DYNAMIC(DELAY_FEEDBACK)
END FUNC
END MODULE

The promotion from IO_DYNAMIC to ^IO_DYNAMIC in EFFECT.DELAY_FEEDBACK -> DELAY_FEEDBACK looks very similar to MAIN::RESET_DYNAMIC(EFFECT.DELAY_FEEDBACK) from the previous example, but there is an important difference. Both rely on the local variable EFFECT to keep the IO_CHANNEL_EFFECT object alive. Here, however, a later statement might modify that variable (NULL -> EFFECT), and another statement (ENGINE::INIT()) might trigger a garbage collection. If that happened, MAIN::RESET_DYNAMIC(DELAY_FEEDBACK) could receive a pointer to freed memory, and its writes would corrupt the heap. This was impossible in the previous example, because the EFFECT local variable couldn't change during the call to RESET_DYNAMIC(), and the caret pointer's lifetime was limited to that one statement.

To guarantee memory safety, the compiler only allows a pointer to become a caret type if it can prove that it's impossible for the memory to get collected during the lifetime of the caret type. The compiler's analysis is currently very simplistic; for example, removing NULL -> EFFECT will not eliminate the compiler error. More sophisticated analysis may be implemented in the future.
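
The failure that the compiler is guarding against can be simulated with a toy Python model. It is deliberately simplified: collect() only checks direct references from regular variables (no transitive tracing), and the heap, roots, and carets dictionaries are hypothetical stand-ins for the runtime's state.

```python
# Toy model: roots maps regular variable names to heap object ids. Caret
# pointers live in a separate dict because the collector never traces them.

def collect(heap, roots):
    """Free every object that no regular (non-caret) variable refers to."""
    live = {obj for obj in roots.values() if obj is not None}
    for obj in list(heap):
        if obj not in live:
            del heap[obj]

heap = {"effect#1": "IO_CHANNEL_EFFECT"}
roots = {"EFFECT": "effect#1"}
carets = {"DELAY_FEEDBACK": ("effect#1", 10)}  # (owner, byte offset)

roots["EFFECT"] = None  # NULL -> EFFECT
collect(heap, roots)    # e.g. triggered inside ENGINE::INIT()

owner, _ = carets["DELAY_FEEDBACK"]
print(owner in heap)    # False: the caret pointer now dangles
```

Because the caret pointer is invisible to the collector, only the compile-time liveness check stands between this code and a write to freed memory.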

To summarize, caret types represent two guarantees about a pointer:

  1. Protection against the garbage collector inadvertently analyzing the caret pointer.
  2. A liveness proof that the object memory can't get collected during the lifetime of the caret pointer.

Rules for caret types

The compiler enforces the following rules for caret types:

  • Caret pointers can never be assigned to non-caret pointers.
  • Non-caret pointers can be promoted to caret pointers only if the compiler can prove that it is safe. Function call arguments and LOCATED AT variables are generally safe.
  • The caret (^) prefix can only be applied to CLASS or array types. It cannot be applied to non-pointer types such as INT or BOOL.
  • Caret pointers can only exist as local variables or expressions; in other words, they must be stored on the Chombit stack. Caret pointers cannot be stored in module variables or class member variables.
  • A function's return value cannot be a caret pointer.
  • If the caret is applied to an array type, it must be a fixed-size array.
  • Class members cannot be declared with a caret type (but they can become a caret type in expressions where the compiler can prove that it is safe).
  • A LOCATED AT variable must be declared using a caret type if the address is in the RAM segment; otherwise it must not have a caret.