Skip to main content

Generics

🚧 This language feature is still experimental.

  • The compiler may accept invalid generic code without reporting errors.
  • The compiler may report errors for valid generic code.

Full validation will be implemented in a future release.

Hybrix functions accept parameters that range over data values. Generic programming allows a class to have type parameters which range over data types. The table below contrasts these two worlds:

Function parametersType parameters
Example declarationfunc f(x: int)class list<t>
Parameter ranges overdata valuesdata types
Example argument123string
What they are used fordetermining program behaviorchecking program correctness
When that happensruntimecompile time

Example without generics​

A queue is a common data structure where elements are added (sometimes called "enqueued") and then removed ("dequeued"), with a first-in-first-out ordering. Here's an example queue that stores text string values:

class queue_node
var item: string
var next: queue_node

constructor(item: string)
.item <- item
.next <- null
end constructor
end class

class queue
var head: queue_node
var tail: queue_node

func is_empty(): bool
return .head = null
end func

func add(item: string)
var node: queue_node
node <- new queue_node(item)

if .tail = null then
.head <- node
.tail <- node
else
.tail.next <- node
.tail <- node
end if
end func

func remove(): string
if .head = null then
kernel::fail("queue is empty")
end if

var node: queue_node
node <- .head

.head <- node.next
if .head = null then
.tail <- null
end if

return node.item
end func
end class

Example usage:

module main
func example()
var queue: queue
queue <- new queue()

# Add 3 strings
queue.add("bill")
queue.add("jane")
queue.add("bob")

# Now remove them
console::print(queue.remove())
console::print(queue.remove())
console::print(queue.remove())
end func
end module

Example with generics​

Now, suppose we wanted to also make a queue that stores actor objects. We would need to define a new class actor_queue and actor_queue_node, with essentially identical code. That is wasteful. Can we share the code?

One idea might be to make a base class queue_item, with child classes for storing actor and text. This would work, except that the remove() function would return an instance of base class. We would need to perform a type cast (result as actor or result as text) to recover the real item. Besides being awkward, it's wasteful to perform this runtime check when we know that the actor queue can only contain actor objects.

Generics solve this problem elegantly by introducing a type parameter t that can be either an actor or a string:

# "t" will be our type parameter
class queue_node<t>
var item: t
var next: queue_node<t>

constructor(item: t)
.item <- item
.next <- null
end constructor
end class

class queue<t>
var head: queue_node<t>
var tail: queue_node<t>

func is_empty(): bool
return .head = null
end func

func add(item: t)
var node: queue_node<t>
node <- new queue_node<t>(item)

if .tail = null then
.head <- node
.tail <- node
else
.tail.next <- node
.tail <- node
end if
end func

func remove(): t
if .head = null then
kernel::fail("queue is empty")
end if

var node: queue_node<t>
node <- .head

.head <- node.next
if .head = null then
.tail <- null
end if

return node.item
end func
end class

Example usage:

module main
func example()
# Here, "string" becomes the type argument for "t"
var queue: queue<string>
queue <- new queue<string>()

# Add 3 strings
queue.add("bill")
queue.add("jane")
queue.add("bob")

# Now remove them
console::print(queue.remove())
console::print(queue.remove())
console::print(queue.remove())

# Now let's try "actor" as the type argument for "t"
var actors: queue<actor>
actors <- new queue<actor>()

actors.add(new actor())
actors.add(new actor())
actors.add(new actor())
end func
end module

Important points:

  • Type parameters are specified in angle brackets (< and >). For example, class map<key, value> defines two parameters key and value.
  • In the above example, queue<actor> and queue<string> share a single implementation. Hybrix generics are modeled as type erasure, which means that type arguments have no effect on the emitted assembly language. If you examine the disassembly, you will find that queue.add("bill") and actors.add(new actor()) are jumping to the same queue.add() address.
  • Type parameters currently cannot have constraints. (In a future release, we will support simple constraints like class map<key, value extends actor>.)
  • Type arguments must always be pointer types such as classes and arrays. You cannot use a primitive type such as int, pair, or bool.

Guidance for primitive types​

If you need a primitive type such as int, pair, or bool, you must manually duplicate the code. This is why the Hybrix framework provides list_int alongside list<t>.

This restriction may seem a bit onerous. Other languages such as Rust perform "monomorphization", where the compiler automatically emits different implementations of the same class for each parameterization. For a small system like Hybrix, that could quickly bloat a program, in a way that is not obvious from looking at the source code. Requiring the engineer to manually copy and paste the code makes this cost more obvious, and usually suggests meaningful optimizations. Since Hybrix only has a few primitive types, in practice it's not too painful.

Question: Why can't int be an argument?

Although int has the same 32-bit storage as a pointer, its emitted assembly language is a bit different. For pointers, the compiler emits gpush and gpop instructions for garbage collector tracking.

Generic programming is a deep topic. Some languages such as TypeScript and C++ provide generic features so sophisticated as to be Turing complete, able to represent arbitrary calculations as a type check. Hybrix's generics sit at the other extreme, providing just enough abstraction to implement essential containers such as list<t> or queue<t>, while still ensuring that every line of source code can be easily matched to its emitted assembly language.