Data types
Computer programs can be thought of as having two parts:
- Code tells the computer what to do, what operations to perform. Generally code does not change while your program is running, just like how the recipe for a cake does not change while we are baking the cake.
- Data represents the information that gets processed by the code. The data may change while your program is running. For example, every time a video game character collects a coin, the COUNT of coins will increase.
Here is some code that calculates the area of a rectangle, given its width and height:
MODULE RECTANGLE
    FUNC GET_AREA(WIDTH: INT, HEIGHT: INT): INT
        VAR AREA: INT
        # MULTIPLY WIDTH BY HEIGHT AND STORE THE RESULT IN AREA
        WIDTH * HEIGHT -> AREA
        RETURN AREA
    END FUNC
END MODULE
This code tells the computer to multiply WIDTH by HEIGHT to find the answer.
The data is stored in the variables WIDTH, HEIGHT, and AREA. Data is stored as bytes in the computer's memory. The code declares the variables that give access to the data (for example, VAR AREA: INT), but a declaration is not the data itself. To see what's in a variable, you can use the Hybrix debugger to inspect it while your program is running.
Tip: To become a great software engineer, you must learn to picture data easily in your mind without needing a debugger. Often, understanding data is more important than understanding code: if you understand how the streets connect in your city, you can easily write instructions to drive to any destination.
List of Hybrix types
Data has different forms, called data structures. In the above example, all of the variables are 32-bit integers. Hybrix calls this data type INT. Hybrix supports other sizes of integers, as well as text ("strings"), and even custom data structures that you can define (such as the MONSTER class below). The custom structures are made from a small set of basic types, sometimes called primitive types.
BYTE (one byte, unsigned)
The computer's memory is a big list of bytes. In the computer's electronics, a byte has eight switches that can be on or off. Hybrix's BYTE gives us access to one byte, which we interpret to be a number from 0 to 255, because eight on/off switches can form 2^8 = 256 different patterns. The values must be whole numbers (0, 1, 2, 3, and so forth), never fractions like 3.14 and never negative numbers like -1. "Unsigned" means the value cannot be negative.
Other interpretations are possible. For example, we could have defined BYTE to be a number from -128 to 127, or it could have been eight individual numbers that are each 0 or 1. But Hybrix does neither. Hybrix's interpretation becomes clear when performing operations on a BYTE. For example, if B is a BYTE, you will find that B < 0 is never true: a BYTE never behaves like a negative number.
The decimal numbers 0 to 255 are written as $00 to $FF in hexadecimal notation.
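The difference between the two interpretations can be sketched in a few lines of Python. This is only an illustration, not Hybrix code, and the helper names are hypothetical. The same eight bits give 255 under the unsigned reading that BYTE uses, and -1 under the signed reading that BYTE does not use.

```python
def as_unsigned_byte(bits: int) -> int:
    """Read the low 8 bits as a number from 0 to 255 (the BYTE interpretation)."""
    return bits & 0xFF


def as_signed_byte(bits: int) -> int:
    """Read the same 8 bits as a number from -128 to 127 (NOT how BYTE works)."""
    value = bits & 0xFF
    return value - 256 if value >= 128 else value


def as_hex(bits: int) -> str:
    """Write a byte in the $00 to $FF notation used in this document."""
    return "$" + format(bits & 0xFF, "02X")


pattern = 0b11111111  # all eight switches on
print(as_unsigned_byte(pattern))  # 255
print(as_signed_byte(pattern))    # -1
print(as_hex(pattern))            # $FF
```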
PAIR (two bytes, signed)
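This data type is made from two bytes, which we interpret to be an integer from -32,768 to 32,767, because two bytes can form 2^16 = 65,536 different patterns, half of which are assigned to negative numbers (two's complement). "Signed" means the value may be negative.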
In hexadecimal, a PAIR is written with four digits $0000 to $FFFF. Note that the PAIR -1 is usually written in hexadecimal as $FFFF (not -$0001), and -32,767 is usually written as $8001 (not -$7FFF). This is because hexadecimal notation is generally used when we care about the machine representation of bytes, whereas decimal is generally used for math operations.
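The hexadecimal spellings above can be checked with a short Python sketch of 16-bit two's complement. This is an illustration only, not Hybrix code, and the function names are hypothetical.

```python
def pair_to_hex(value: int) -> str:
    """Show the machine representation of a signed 16-bit value as $0000 to $FFFF."""
    if not -32768 <= value <= 32767:
        raise ValueError("value does not fit in a PAIR")
    # Masking with 0xFFFF keeps the low 16 bits, which is the two's complement pattern.
    return "$" + format(value & 0xFFFF, "04X")


def hex_to_pair(bits: int) -> int:
    """Interpret a 16-bit pattern as a signed number."""
    bits &= 0xFFFF
    return bits - 0x10000 if bits >= 0x8000 else bits


print(pair_to_hex(-1))       # $FFFF
print(pair_to_hex(-32767))   # $8001
print(hex_to_pair(0x8000))   # -32768
```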
The Chombit CPU also uses the terminology of BYTE, PAIR, and INT; however, the principle that BYTE and TRIO are unsigned whereas PAIR and INT are signed does not apply to Chombit. CPU registers simply store bits. It is CPU instructions that determine whether those bits are interpreted as signed or unsigned numbers. For example, AND is unsigned, whereas MULTIPLY is signed, and ADD actually calculates both (representing signed and unsigned outcomes using separate CPU flags).
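To picture how a single ADD can serve both interpretations, here is a Python sketch of a generic 32-bit addition that reports one flag for the unsigned outcome and one for the signed outcome. The flag names "carry" and "overflow" are the conventional ones on many CPUs and are an assumption here; this document does not name Chombit's flags, and the function is hypothetical.

```python
MASK32 = 0xFFFFFFFF
SIGN_BIT = 0x80000000


def add32(a_bits: int, b_bits: int):
    """Add two 32-bit patterns; return (result bits, carry, overflow)."""
    a_bits &= MASK32
    b_bits &= MASK32
    total = a_bits + b_bits
    result = total & MASK32
    # Carry: the UNSIGNED sum did not fit in 32 bits.
    carry = total > MASK32
    # Overflow: both inputs have the same sign bit, but the result's sign differs,
    # so the SIGNED sum did not fit in 32 bits.
    overflow = ((a_bits ^ result) & (b_bits ^ result) & SIGN_BIT) != 0
    return result, carry, overflow


# $FFFFFFFF + 1: unsigned 4,294,967,295 + 1 does not fit, but signed -1 + 1 = 0 is fine.
print(add32(0xFFFFFFFF, 1))   # (0, True, False)
# $7FFFFFFF + 1: unsigned is fine, but signed 2,147,483,647 + 1 does not fit.
print(add32(0x7FFFFFFF, 1))   # (2147483648, False, True)
```

The result bits are identical under both readings; only the flags tell the program which interpretation went out of range.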
TRIO (three bytes, unsigned)
This data type is made from three bytes, which we interpret to be an integer from 0 to 16,777,215, because three bytes can form 2^24 = 16,777,216 different patterns. Like BYTE, it is unsigned, so the value cannot be negative.
Because Chombit does not provide a 3-byte CPU register, this data type cannot be used for regular program variables. TRIO is mainly used when discussing 24-bit memory addresses, as well as special situations where memory is saved by discarding the fourth byte of a memory address.
In hexadecimal, a TRIO is written with six digits $000000 to $FFFFFF. You can use _ to make it more readable, for example $12_3456 instead of $123456.
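As an illustration of saving memory by storing only three bytes of an address, here is a Python sketch (not Hybrix code; the function names are hypothetical). The byte order chosen below, lowest byte first, is an assumption made for the example, since this document does not state how Chombit orders bytes in memory. Python happens to accept the same _ separator in hexadecimal literals.

```python
def address_to_trio(address: int) -> bytes:
    """Store a 24-bit address in exactly three bytes (lowest byte first: an assumption)."""
    if not 0 <= address <= 0xFFFFFF:
        raise ValueError("address does not fit in three bytes")
    return address.to_bytes(3, "little")


def trio_to_address(trio: bytes) -> int:
    """Rebuild the address from its three stored bytes."""
    if len(trio) != 3:
        raise ValueError("a TRIO is exactly three bytes")
    return int.from_bytes(trio, "little")


address = 0x12_3456                      # same value as 0x123456
stored = address_to_trio(address)
print([format(b, "02X") for b in stored])     # ['56', '34', '12']
print(format(trio_to_address(stored), "06X")) # 123456
```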
INT (four bytes, signed)
This data type is made from four bytes, which we interpret to be an integer from -2,147,483,648 to 2,147,483,647, because four bytes can form 2^32 = 4,294,967,296 different patterns, half of which are assigned to negative numbers (two's complement).
The Chombit microprocessor is a 32-bit CPU, which means that it reads and writes chunks of data up to 4 bytes in size. In other words, INT is the largest integer type that we can work with easily. INT is the "native" integer size for a 32-bit CPU.
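The ranges of all four integer types follow from one rule: n bytes give 2^(8n) patterns, and a signed type assigns half of them to negative numbers. The Python sketch below (an illustration, not Hybrix code; the function name is hypothetical) computes each range, treating BYTE and TRIO as unsigned and PAIR and INT as signed, as their headings state.

```python
def type_range(num_bytes: int, signed: bool):
    """Return (smallest, largest) value for an integer type of the given size."""
    patterns = 2 ** (8 * num_bytes)
    if signed:
        # Two's complement: half the patterns are negative, half are zero or positive.
        return (-(patterns // 2), patterns // 2 - 1)
    return (0, patterns - 1)


for name, size, signed in [("BYTE", 1, False), ("PAIR", 2, True),
                           ("TRIO", 3, False), ("INT", 4, True)]:
    low, high = type_range(size, signed)
    print(f"{name}: {low:,} to {high:,}")
```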
BOOL
This type has only two possible values, TRUE or FALSE. It is formally called a "boolean" value, named after the English logician George Boole. Hybrix stores BOOL using a single byte, with 1 meaning TRUE, and 0 meaning FALSE. This can be considered wasteful, because a BOOL really needs only a single bit (one on/off switch), whereas a byte has eight bits. However, computer memory is organized into bytes, not bits, so your program would run more slowly if it had to access multiple BOOL values that were packed into a single byte. Thus, the design wastes some memory to make your program run faster. If your program needs to store a huge number of boolean values, you can still pack them into bytes yourself.
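Packing booleans into bytes can be sketched as follows in Python (an illustration, not Hybrix code; the function names and the choice of bit order are hypothetical). Note the extra divide, shift, and mask needed on every access, which is the slowdown described above.

```python
def pack_bools(flags: list) -> bytes:
    """Pack a list of booleans into bytes, eight per byte, lowest bit first."""
    packed = bytearray((len(flags) + 7) // 8)   # round up to whole bytes
    for index, flag in enumerate(flags):
        if flag:
            packed[index // 8] |= 1 << (index % 8)
    return bytes(packed)


def get_bool(packed: bytes, index: int) -> bool:
    """Read one boolean back: find its byte, shift its bit down, mask it off."""
    return ((packed[index // 8] >> (index % 8)) & 1) == 1


flags = [True, False, True, False, False, False, False, False, True, True]
packed = pack_bools(flags)
print(len(flags), "booleans stored in", len(packed), "bytes")  # 10 booleans stored in 2 bytes
print(get_bool(packed, 2), get_bool(packed, 3))                # True False
```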
TYPE aliases
We mentioned that you can also make your own types. The simplest way is to give a new name for an existing type, by writing TYPE <ALIAS> IS <EXISTING TYPE>. For example, our GET_AREA() function above does not clearly indicate the units of measurement. Let's define two new types called INCHES and SQUARE_INCHES:
# MAKE A NEW TYPE CALLED "INCHES" FOR MEASURING DISTANCES:
TYPE INCHES IS INT
# ...BUT AREA IS MEASURED USING SQUARE INCHES, NOT REGULAR INCHES:
TYPE SQUARE_INCHES IS INT
MODULE RECTANGLE
    FUNC GET_AREA(WIDTH: INCHES, HEIGHT: INCHES): SQUARE_INCHES
        VAR AREA: SQUARE_INCHES
        WIDTH * HEIGHT -> AREA
        RETURN AREA
    END FUNC
END MODULE
The variables WIDTH, HEIGHT, and AREA are still stored using INT types.
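A loosely similar idea exists in other languages. In the Python sketch below (an analogy only, not Hybrix code; the names are hypothetical), the aliases are new names for the same underlying integer type, so they document intent without changing how the values are stored.

```python
# Two aliases for the built-in integer type.
Inches = int
SquareInches = int


def get_area(width: Inches, height: Inches) -> SquareInches:
    """Multiply two lengths in inches to get an area in square inches."""
    return width * height


area = get_area(3, 4)
print(area)               # 12
print(Inches is int)      # True: the alias is just another name for int
print(type(area) is int)  # True: the result is still stored as a plain integer
```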
The [KERNEL] framework file uses TYPE to declare a "string" to be an array of bytes:
TYPE STRING IS BYTE[]
Arrays
An "array" is a data type that can hold a list of zero or more elements. In our STRING example above, adding [] to BYTE makes BYTE[], which is a list of bytes. For a string, each byte will store one letter of the word.
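To picture a string as a list of bytes, here is a Python sketch (an illustration, not Hybrix code). It assumes each letter is stored using its ASCII code; this document does not state which character encoding Hybrix uses.

```python
text = "HELLO"

# One byte per letter, assuming ASCII codes.
string_bytes = list(text.encode("ascii"))

print(string_bytes)                                   # [72, 69, 76, 76, 79]
print(["$" + format(b, "02X") for b in string_bytes]) # ['$48', '$45', '$4C', '$4C', '$4F']
print(len(string_bytes))                              # 5
```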
To learn more about arrays, see the Arrays section.
Classes
We mentioned that you can define your own data structures. This is done by defining a CLASS, for example:
CLASS MONSTER
    # THE MONSTER'S NAME IS A STRING
    VAR NAME: STRING # ...WHICH IS ANOTHER NAME FOR BYTE[]
    # THE MONSTER'S (X,Y) COORDINATES ON THE GAME MAP
    VAR LOCATION_X: INT
    VAR LOCATION_Y: INT
    # THE MONSTER'S HEALTH
    VAR HEALTH: PAIR
    # THE MONSTER CAN BE FRIENDS WITH OTHER MONSTERS, STORED IN THIS ARRAY
    VAR FRIENDS: MONSTER[]
END CLASS
We have defined a custom type called MONSTER, made up of 5 smaller data values called "member variables." Note that the FRIENDS variable uses the MONSTER type with [] to make an array of other monsters.
To learn more about classes, see the Classes section.
Function pointers
(This is an advanced topic.) Later we will learn about functions, which are code (instructions of what to do) not data (information to be processed). In Hybrix, there is a special kind of variable called a function pointer. It does not store the code of the function, just a 4-byte "pointer" that identifies which function to call.
Function pointer types look like FUNC() (no parameters or return value) or FUNC(X: INT, Y: INT): INT (receives two parameters and returns an INT). Here is a simple example:
# DECLARE "ACTION" TO BE ANY FUNCTION WITH NO PARAMETERS OR RETURN VALUE
TYPE ACTION IS FUNC()
Here is an example of how we can use our new ACTION type:
MODULE MAIN
    FUNC DO_THREE_TIMES(A: ACTION)
        # BECAUSE "A" POINTS TO "WRITE_HELLO",
        # THIS WILL WRITE HELLO 3 TIMES:
        A()
        A()
        A()
    END FUNC
    FUNC WRITE_HELLO()
        CONSOLE::PRINT("HELLO{N}")
    END FUNC
    FUNC START()
        # THE EXTRA PARENTHESES ARE REQUIRED TO SHOW THAT YOU REALLY
        # INTEND TO MAKE A POINTER, NOT TO CALL THE FUNCTION.
        MAIN::DO_THREE_TIMES( (MAIN::WRITE_HELLO) )
    END FUNC
END MODULE
Note: To keep things simple, the FUNC() notation can only be used to make TYPE aliases. For example, you are not allowed to write FUNC DO_THREE_TIMES(A: FUNC()) directly.
Making function pointers:
- a pointer to a member function of a module: ( MY_MODULE::MY_FUNC )
- a pointer to a member function of a class: ( MY_CLASS::MY_FUNC )
  Note that we write ( MY_CLASS::MY_FUNC ), not ( MY_OBJECT.MY_FUNC ), because Hybrix function pointers do not include the SELF information.
- a pointer to a class constructor: ( NEW MY_CLASS )
- a pointer to an array constructor: ( NEW MY_CLASS[] )
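The idea of a function pointer that carries no SELF information has a rough analogy in Python, sketched below (this is not Hybrix code, and the class and names are hypothetical). A function taken from the class itself is not tied to any object, so the object must be supplied separately when calling it. How Hybrix supplies the object when such a pointer is called is not covered in this section.

```python
class Monster:
    def __init__(self, name: str):
        self.name = name

    def greet(self) -> str:
        return "HELLO FROM " + self.name


# Taken from the class, not from an object: no particular monster is attached.
pointer = Monster.greet

blob = Monster("BLOB")
slime = Monster("SLIME")

# The same pointer works with any monster, because the object is passed in explicitly.
print(pointer(blob))   # HELLO FROM BLOB
print(pointer(slime))  # HELLO FROM SLIME
```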