Generics
🚧 This language feature is still experimental.
- The compiler may accept invalid generic code without reporting errors.
- The compiler may report errors for valid generic code.
Full validation will be implemented in a future release.
Hybrix functions accept parameters that range over data values. Generic programming allows a class to have type parameters which range over data types. The table below contrasts these two worlds:
| Function parameters | Type parameters | |
|---|---|---|
| Example declaration | func f(x: int) | class list<t> |
| Parameter ranges over | data values | data types |
| Example argument | 123 | string |
| What they are used for | determining program behavior | checking program correctness |
| When that happens | runtime | compile time |
Example without generics​
A queue is a common data structure where elements are added (sometimes called "enqueued") and then removed ("dequeued"), with a first-in-first-out ordering. Here's an example queue that stores text string values:
class queue_node
var item: string
var next: queue_node
constructor(item: string)
.item <- item
.next <- null
end constructor
end class
class queue
var head: queue_node
var tail: queue_node
func is_empty(): bool
return .head = null
end func
func add(item: string)
var node: queue_node
node <- new queue_node(item)
if .tail = null then
.head <- node
.tail <- node
else
.tail.next <- node
.tail <- node
end if
end func
func remove(): string
if .head = null then
kernel::fail("queue is empty")
end if
var node: queue_node
node <- .head
.head <- node.next
if .head = null then
.tail <- null
end if
return node.item
end func
end class
Example usage:
module main
func example()
var queue: queue
queue <- new queue()
# Add 3 strings
queue.add("bill")
queue.add("jane")
queue.add("bob")
# Now remove them
console::print(queue.remove())
console::print(queue.remove())
console::print(queue.remove())
end func
end module
Example with generics​
Now, suppose we wanted to also make a queue that stores actor objects. We would need to define a new class actor_queue and actor_queue_node, with essentially identical code. That is wasteful. Can we share the code?
One idea might be to make a base class queue_item, with child classes for storing actor and text. This would work, except that the remove() function would return an instance of base class. We would need to perform a type cast (result as actor or result as text) to recover the real item. Besides being awkward, it's wasteful to perform this runtime check when we know that the actor queue can only contain actor objects.
Generics solve this problem elegantly by introducing a type parameter t that can be either an actor or a string:
# "t" will be our type parameter
class queue_node<t>
var item: t
var next: queue_node<t>
constructor(item: t)
.item <- item
.next <- null
end constructor
end class
class queue<t>
var head: queue_node<t>
var tail: queue_node<t>
func is_empty(): bool
return .head = null
end func
func add(item: t)
var node: queue_node<t>
node <- new queue_node<t>(item)
if .tail = null then
.head <- node
.tail <- node
else
.tail.next <- node
.tail <- node
end if
end func
func remove(): t
if .head = null then
kernel::fail("queue is empty")
end if
var node: queue_node<t>
node <- .head
.head <- node.next
if .head = null then
.tail <- null
end if
return node.item
end func
end class
Example usage:
module main
func example()
# Here, "string" becomes the type argument for "t"
var queue: queue<string>
queue <- new queue<string>()
# Add 3 strings
queue.add("bill")
queue.add("jane")
queue.add("bob")
# Now remove them
console::print(queue.remove())
console::print(queue.remove())
console::print(queue.remove())
# Now let's try "actor" as the type argument for "t"
var actors: queue<actor>
actors <- new queue<actor>()
actors.add(new actor())
actors.add(new actor())
actors.add(new actor())
end func
end module
Important points:
- Type parameters are specified in angle brackets (
<and>). For example,class map<key, value>defines two parameterskeyandvalue. - In the above example,
queue<actor>andqueue<string>share a single implementation. Hybrix generics are modeled as type erasure, which means that type arguments have no effect on the emitted assembly language. If you examine the disassembly, you will find thatqueue.add("bill")andactors.add(new actor())are jumping to the samequeue.add()address. - Type parameters currently cannot have constraints. (In a future release, we will support simple constraints like
class map<key, value extends actor>.) - Type arguments must always be pointer types such as classes and arrays. You cannot use a primitive type such as
int,pair, orbool.
Guidance for primitive types​
If you need a primitive type such as int, pair, or bool, you must manually duplicate the code. This is why the Hybrix framework provides list_int alongside list<t>.
This restriction may seem a bit onerous. Other languages such as Rust perform "monomorphization", where the compiler automatically emits different implementations of the same class for each parameterization. For a small system like Hybrix, that could quickly bloat a program, in a way that is not obvious from looking at the source code. Requiring the engineer to manually copy and paste the code makes this cost more obvious, and usually suggests meaningful optimizations. Since Hybrix only has a few primitive types, in practice it's not too painful.
Question: Why can't
intbe an argument?Although
inthas the same 32-bit storage as a pointer, its emitted assembly language is a bit different. For pointers, the compiler emitsgpushandgpopinstructions for garbage collector tracking.
Generic programming is a deep topic. Some languages such as TypeScript and C++ provide generic features so sophisticated as to be Turing complete, able to represent arbitrary calculations as a type check. Hybrix's generics sit at the other extreme, providing just enough abstraction to implement essential containers such as list<t> or queue<t>, while still ensuring that every line of source code can be easily matched to its emitted assembly language.