Chapter 3 Type System

Table of Contents
Overview
Intrinsic Types
Type Aliases
Backend Preludes
Literal Type Inference
References and Aliasing
Parameter Passing
Constructors
Closures (Lambdas)
Error Handling
Union Types
Generics
Lifetimes and Memory Regions

Overview

Scaly's type system is designed for maximum portability across backends. The compiler itself has no built-in knowledge of primitive types. Instead, all types — including integers, floats, and booleans — are defined in backend-specific prelude files.

Intrinsic Types

An intrinsic type is a type whose implementation is provided by the backend, not by Scaly code. Intrinsic types are declared using the intrinsic keyword:


define i32 intrinsic
define i64 intrinsic
define f64 intrinsic
    

The compiler does not interpret these definitions — it simply records that these types exist and are intrinsic. The backend (Emitter) is responsible for mapping intrinsic types to the target platform's representation.

This design lets Scaly target any backend for which a prelude and an Emitter mapping exist.

Type Aliases

Type aliases provide human-friendly names for intrinsic or compound types:


define bool i1
define char i32       ; Unicode scalar value
define int i64        ; Platform word size
define size_t u64
    

The alias and its target are interchangeable — bool and i1 refer to the same type.
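One way to picture alias resolution is a lookup table that follows alias links until it reaches an intrinsic. The Python sketch below is an illustration only, not Scaly compiler code. The `PRELUDE` table and the `resolve` and `same_type` helpers are hypothetical names, and `u64` is assumed to be declared intrinsic elsewhere in the prelude.

```python
# Hypothetical model of a prelude: each name maps either to the marker
# "intrinsic" or to the name it aliases. Not actual compiler code.
PRELUDE = {
    "i1": "intrinsic",
    "i32": "intrinsic",
    "i64": "intrinsic",
    "u64": "intrinsic",   # assumption: declared by the backend prelude
    "bool": "i1",
    "char": "i32",
    "int": "i64",
    "size_t": "u64",
}

def resolve(name, prelude=PRELUDE):
    """Follow alias links until an intrinsic type is reached."""
    seen = set()
    while prelude[name] != "intrinsic":
        if name in seen:
            raise ValueError(f"alias cycle at {name!r}")
        seen.add(name)
        name = prelude[name]
    return name

def same_type(a, b, prelude=PRELUDE):
    """Two names denote the same type if they resolve to the same intrinsic."""
    return resolve(a, prelude) == resolve(b, prelude)
```

Under this model, `same_type("bool", "i1")` holds because both names resolve to `i1`.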

Backend Preludes

Each backend provides a prelude file that defines the intrinsic types and their aliases for that target. The prelude is implicitly loaded before any user code.

Example 3-1. LLVM Prelude (excerpt)


; Intrinsic types (LLVM native)
define i1 intrinsic
define i8 intrinsic
define i16 intrinsic
define i32 intrinsic
define i64 intrinsic
define f32 intrinsic
define f64 intrinsic
define ptr intrinsic

; Human-friendly aliases
define bool i1
define char i32
define int i64
define size_t u64
      

Example 3-2. JavaScript Prelude (excerpt)


; Intrinsic types (JavaScript native)
define number intrinsic
define string intrinsic
define boolean intrinsic

; Aliases for compatibility
define int number
define float number
define bool boolean
      

Literal Type Inference

Numeric and other literals do not have an inherent type. Their type is inferred from context:


function double(x: i32) returns i32 { x * 2 }

double(42)           ; 42 inferred as i32 from parameter type

let y: i64 100       ; 100 inferred as i64 from annotation
let z y + 50         ; 50 inferred as i64 to match y
    

If the type cannot be inferred, the compiler requires an explicit annotation:


let x 42             ; ERROR: cannot infer type for integer literal
let x: i32 42        ; OK: type explicitly annotated
    

Scaly does not support type suffixes on literals (such as 42i32). This keeps literals clean and encourages explicit type annotations where they matter.
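The inference rule can be modeled as "a literal takes whatever type its context expects, and it is an error if the context expects nothing". The Python sketch below is a simplified illustration, not the compiler's algorithm. The names `infer_literal`, `infer_binary`, and `InferenceError` are hypothetical, and the rejection of mismatched typed operands is an assumption of the sketch.

```python
class InferenceError(Exception):
    """Raised when no context supplies a type for a literal."""

def infer_literal(literal, expected_type=None):
    """Type of a literal, taken entirely from the type its context expects."""
    if expected_type is None:
        raise InferenceError(f"cannot infer type for literal {literal!r}")
    return expected_type

def infer_binary(left_type, right_type):
    """Type of `left op right`; None marks an operand that is a bare literal."""
    if left_type is None and right_type is None:
        raise InferenceError("cannot infer type: both operands are literals")
    if left_type is None:
        return right_type      # the literal adopts the other operand's type
    if right_type is None:
        return left_type
    if left_type != right_type:
        # Assumption: mismatched typed operands are rejected in this sketch.
        raise TypeError(f"operand types differ: {left_type} vs {right_type}")
    return left_type
```

Here `infer_literal(42, "i32")` mirrors `double(42)`, and `infer_binary("i64", None)` mirrors `y + 50`.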

References and Aliasing

Scaly does not have reference types. There is no ref[T] or equivalent. This follows the ParaSail philosophy: no pointers or aliasing at the language level.

Values flow through functions. When a function receives a parameter, it gets read-only access (implemented as a pointer under the hood, but not exposed in the type system). When a function returns, it returns a value.

Low-Level Pointers

For low-level implementation of data structures like Page, HashMap, or List, the type pointer[T] is available. This is an escape hatch for implementors, not for everyday code.


; Low-level list node (internal implementation)
define Node[T] {
    data: T
    next: pointer[Node[T]]    ; raw pointer for linked structure
}
      

Nullable Values

For nullable values, use Option[T] — a proper sum type with Some(value) and None variants:


function find(list: List[T], predicate: function(T) returns bool) returns Option[T] {
    ; returns Some(item) if found, None otherwise
}

let result find(items, \x: x > 10)
choose result {
    when Some(value): process(value)
    when None: handle_not_found()
}
      

The compiler optimizes Option[T] to a simple nullable pointer — no space overhead for the tag.

Parameter Passing

All function parameters are borrowed — functions receive read-only access to their arguments. The caller retains ownership.


define Point { x: i32, y: i32 }

function distance(a: Point, b: Point) returns f64 {
    ; a and b are read-only views
    ; cannot modify them
    ...
}

let origin Point(0, 0)
let target Point(3, 4)
distance(origin, target)    ; origin and target unchanged
    

Implementation

The implementation of parameter passing depends on the execution context, but the semantics remain identical:

  • Same thread/stack: A pointer is passed. No copying occurs. The function reads through the pointer.

  • Different thread/GPU/remote: The entire data tree is copied to the target execution context. The function still has read-only access — same semantics, different mechanism.

This design means code doesn't change based on where it executes. A function that works locally works identically when distributed.
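The two mechanisms can be sketched in Python as "share the object" versus "deep-copy the object". This is a model of the idea, not Scaly's runtime. `pass_argument` and `manhattan` are hypothetical helpers, and a frozen dataclass stands in for read-only access.

```python
import copy
from dataclasses import dataclass

@dataclass(frozen=True)      # frozen: the callee cannot modify the fields
class Point:
    x: int
    y: int

def pass_argument(value, same_stack):
    """Return what the callee receives for a borrowed parameter."""
    if same_stack:
        return value                 # same thread/stack: shared, no copy
    return copy.deepcopy(value)      # other context: the data tree is copied

def manhattan(a, b):
    """A read-only computation that behaves the same either way."""
    return abs(a.x - b.x) + abs(a.y - b.y)
```

The callee computes the same result whether it receives the shared object or the copy, which is the property the text describes.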

Mutation via Procedures

To modify data, use a procedure instead of a function. Procedures can declare parameters as mutable:


procedure move(p: mutable Point, dx: i32, dy: i32) {
    set p.x: p.x + dx
    set p.y: p.y + dy
}

var position Point(0, 0)
move(position, 5, 3)    ; position is now (5, 3)
      

The distinction between functions (pure, read-only) and procedures (may mutate) is explicit in the code. Readers immediately know which calls might have side effects.

Constructors

Constructors create instances of types. Scaly provides both implicit and explicit constructors.

Implicit Constructors

If all members of a type are public, an implicit constructor is generated that takes all fields as parameters in declaration order:


define Point { x: i32, y: i32 }

let p Point(10, 20)    ; implicit constructor
      

Explicit Constructors

Use init for explicit constructors when you need:

  • Private members (implicit constructor unavailable)

  • Default values for some fields

  • Different construction signatures


define Point
(
    x: i32
    y: i32
)
{
    init(value: i32) {      ; convenience constructor
        set this.x: value
        set this.y: value
    }
}

let p1 Point(10, 20)    ; implicit constructor
let p2 Point(5)         ; explicit init, equivalent to Point(5, 5)
      

The this Prefix

The this. prefix is optional when unambiguous, but recommended for clarity:


init(x: i32, y: i32) {
    set this.x: x    ; clear: field x gets parameter x
    set this.y: y
}
      

When a parameter name shadows a field name, use this. to disambiguate. Avoid set x: x; it is confusing even where the compiler can resolve it.

Complete Initialization

All fields must be initialized by any constructor — implicit or explicit. The compiler enforces this. There are no implicit default values (no automatic 0, false, or null):


define Point { x: i32, y: i32 }

init(x: i32) {
    set this.x: x
    ; ERROR: field 'y' not initialized
}
      

Constructor Return

init implicitly returns this, enabling direct binding:


let p Point(10, 20)    ; init returns the new Point
      

Constructed Value Lifetime

The lifetime of a constructed value is inferred from context:

  • In a block: Local lifetime (current block)

  • Last statement / return position: Caller lifetime (return page)

  • Explicit annotation: As specified


function example() returns Point {
    let temp Point(1, 2)     ; local - dies at block end
    return Point(3, 4)       ; caller - inferred from return position
}
      

Page-Parameterized Constructors (init#)

Some types are value types (can live on the stack) but need to allocate internal data on a page. The init# syntax supports this pattern:


define String(data: pointer[char], length: size_t)
{
    ; init# takes an implicit page parameter as first argument
    init#(page, text: pointer[const_char])
    {
        let len strlen(text)
        set data: page.allocate(len + 1, 1) as pointer[char]
        memcpy(data, text, len)
        set length: len
    }
}
      

At call sites, the lifetime modifier determines which page is passed:

  • String#("hello") — passes caller's page (rp)

  • String$("world") — passes local page

  • String^hashMap("key") — passes the named page

The init# pattern is distinct from regular heap allocation. The struct itself can live on the stack — only its internal data needs a page. This is ideal for types like String, Array, or other containers where the wrapper is small but the content may be large.

Note: init$, init^, and init! are reserved for future use; declaring one currently produces an error.

Closures (Lambdas)

Closures are anonymous functions defined with backslash syntax:


\x: x * 2                    ; single parameter
\x y: x + y                  ; multiple parameters
\: 42                        ; no parameters
    

Capturing Variables

Closures can capture variables from their enclosing scope. Captured variables are treated as implicit borrowed parameters — read-only access, same as explicit function parameters:


let multiplier 10
let scale \x: x * multiplier    ; captures multiplier (read-only)

scale(5)    ; returns 50
      

Closures Are Pure

Closures cannot mutate captured variables. They are pure like functions, not imperative like procedures:


var count 0
let bad \: { set count: count + 1 }   ; ERROR: cannot mutate capture
      

Closure Lifetime

Since captures are borrowed, a closure cannot outlive its captured variables. The compiler enforces this through lifetime checking:


function makeCounter() returns (function() returns i32) {
    var count 0
    return \: count    ; ERROR: closure outlives captured 'count'
}
      

Implementation

Each closure is an anonymous struct containing its captures, with a call method implementing the body. Closures are monomorphized like other generic types.
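As a rough picture of that lowering, the closure `\x: x * multiplier` from the capture example could correspond to a small struct with one field per capture and a call method. The Python sketch below is an analogy, not generated code; the class name `ScaleClosure` is hypothetical.

```python
from dataclasses import dataclass

# Models the struct generated for the closure  \x: x * multiplier
@dataclass(frozen=True)          # frozen: captures are read-only
class ScaleClosure:
    multiplier: int              # one field per captured variable

    def call(self, x):
        """The closure body, reading captures through self."""
        return x * self.multiplier

multiplier = 10
scale = ScaleClosure(multiplier)   # capture happens at construction
result = scale.call(5)
```

Attempting to assign `scale.multiplier` raises an error, which loosely corresponds to the rule that closures cannot mutate captures.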

Error Handling

Scaly uses explicit error handling. There are no exceptions or stack unwinding — errors are values returned from functions or procedures.

The throws Clause

Functions or procedures that can fail declare their error type with throws:


procedure parse(input: String) returns AST throws ParseError {
    if invalid(input) {
        throw InvalidSyntax(position, "expected expression")
    }
    ...
}
      

Under the hood, this is equivalent to returning Result[AST, ParseError], but with dedicated syntax for clarity.
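The "errors are values" model can be illustrated with a minimal result type in Python. This is a sketch of the concept, not of Scaly's Result[AST, ParseError]. `Ok`, `Err`, the toy `parse`, and `process` are hypothetical, and the rule "blank input is invalid" is invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ok:
    value: object

@dataclass(frozen=True)
class Err:
    error: object

@dataclass(frozen=True)
class InvalidSyntax:
    position: int
    message: str

def parse(text):
    """Toy parser: returns Ok(ast) or Err(error) and never raises."""
    if not text.strip():
        return Err(InvalidSyntax(0, "expected expression"))
    return Ok(("ast", text.strip()))

def process(text):
    """Pass an error through to the caller unchanged, else transform."""
    result = parse(text)
    if isinstance(result, Err):
        return result                    # propagate the error value
    return Ok(len(result.value[1]))      # stand-in for a real transform
```

No stack unwinding is involved: the failure travels back through ordinary return values.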

Single Error Type

A function can throw only one error type. To represent multiple error kinds, use a union:


define FileError union {
    NotFound { path: String }
    PermissionDenied { path: String }
    IoError { message: String }
}

function readFile(path: String) returns String throws FileError {
    ...
}
      

This ensures a clear error signature and enables the try/when pattern for handling specific variants.

The try/when Pattern

Handle errors with try and when clauses:


try let ast parse(input)
    when InvalidSyntax(pos, msg): reportError(pos, msg)
    when UnexpectedEof: reportError(0, "unexpected end of file")
      

If not all error variants are covered, an else clause is required:


try let ast parse(input)
    when InvalidSyntax(pos, msg): reportError(pos, msg)
    else panic("unhandled error")
      

Error Propagation

Use else throw to re-throw errors to the caller:


function process(input: String) returns Data throws ParseError {
    try let ast parse(input)
        else throw    ; re-throws ParseError to caller

    transform(ast)
}
      

When the callee's error type matches the caller's declared error type exactly, the error propagates unchanged, with no per-variant handling. This simplifies deeply nested code such as parsers and visitors.

Error Lifetime

Thrown values must have thrown lifetime (!). The compiler infers this from throw position. If inference fails, annotate explicitly:


throw ParseError.InvalidSyntax!(pos, msg)   ; explicit thrown lifetime

The caller provides the exception page where the error will be stored.

Union Types

A union type (also called sum type or tagged union) can hold one of several variants. Each variant can have its own fields:


define Shape union {
    Circle { radius: f64 }
    Rectangle { width: f64, height: f64 }
    Triangle { a: f64, b: f64, c: f64 }
}

let s Shape.Circle(5.0)
    

Memory Layout

A union is stored as a tag plus the largest variant:


Shape = { tag: u8, data: [size of largest variant] }
      

Unions can contain other unions. Nested unions contribute their full size (tag + data) when computing the parent union's size.
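The size rule can be worked through numerically. The Python sketch below assumes a one-byte tag (the u8 shown in the layout) and ignores alignment and padding, which the text does not specify. `union_size` is a hypothetical helper, and the nested union at the end is invented for illustration.

```python
TAG_SIZE = 1   # the tag is shown as u8 in the layout
F64 = 8        # bytes in an f64

def union_size(variant_sizes):
    """Size of a union: tag plus the largest variant (padding ignored)."""
    return TAG_SIZE + max(variant_sizes, default=0)

# Shape: Circle has 1 f64, Rectangle has 2, Triangle has 3
shape = union_size([1 * F64, 2 * F64, 3 * F64])

# A hypothetical union with one variant holding a Shape and one empty
# variant: the nested union contributes its full size (tag + data).
outer = union_size([shape, 0])
```

Under these assumptions Shape occupies 25 bytes and the outer union 26.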

Pattern Matching with choose

Use choose-when to match on variants:


choose s
    when Circle(r): computeCircleArea(r)
    when Rectangle(w, h): w * h
    when Triangle(a, b, c): heronArea(a, b, c)

If not all variants are covered, an else clause is required:


choose s
    when Circle(r): computeCircleArea(r)
    else 0.0    ; handles Rectangle and Triangle
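The coverage rule, which applies to choose and to try/when alike, amounts to a set difference between the declared variants and the handled ones. A minimal Python sketch, with hypothetical helper names, might look like this:

```python
SHAPE_VARIANTS = ["Circle", "Rectangle", "Triangle"]

def missing_variants(variants, handled):
    """Variants (in declaration order) that no `when` clause covers."""
    handled = set(handled)
    return [v for v in variants if v not in handled]

def requires_else(variants, handled):
    """An else clause is required exactly when some variant is uncovered."""
    return bool(missing_variants(variants, handled))
```

For the example above, only `Circle` is handled, so `Rectangle` and `Triangle` are reported as missing and the else clause is required.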

Option Type

Option[T] is a union for nullable values:


define Option[T] union {
    Some { value: T }
    None
}

function find(list: List[T], pred: function(T) returns bool) returns Option[T] {
    ...
}

choose find(items, \x: x > 10) {
    when Some(value): process(value)
    when None: handle_not_found()
}
      

Option Optimization

For non-pointer types, Option[T] optimizes to a pointer:

  • None = null pointer

  • Some(value) = pointer to value

Note that if T is itself a pointer type, this optimization yields a pointer to a pointer, and the inner pointer may itself be null.

Generics

Scaly supports generic types with type parameters in square brackets:


define List[T] {
    ...
}

define HashMap[K, V] {
    ...
}

let numbers List[i32]()
let names List[String]()
    

Monomorphization

Generics are implemented via monomorphization: each concrete instantiation becomes a completely separate type at compile time. List[i32] and List[String] share no code at runtime — each has its own specialized implementation.

Benefits:

  • No runtime overhead — no type descriptors or vtables

  • Full optimization — the compiler sees concrete types

  • No boxing — primitives stay primitives

Trade-offs:

  • Larger binaries — each instantiation duplicates code

  • Longer compile times — more code to generate
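Monomorphization can be pictured as a table of instantiations: the first use of a given set of type arguments creates a specialized type, and later uses reuse it. The Python sketch below models only that bookkeeping, not code generation; `instantiate` and `_instances` are hypothetical names.

```python
_instances = {}   # (generic name, type arguments) -> specialized type

def instantiate(generic_name, *type_args):
    """Return the specialized type for one instantiation, creating it once."""
    key = (generic_name, type_args)
    if key not in _instances:
        name = f"{generic_name}[{', '.join(type_args)}]"
        # Each instantiation is a separate type that shares nothing
        # with the other instantiations of the same generic.
        _instances[key] = type(name, (), {"type_args": type_args})
    return _instances[key]

list_i32 = instantiate("List", "i32")
list_string = instantiate("List", "String")
```

The growth of `_instances` with every distinct instantiation is the source of the binary-size and compile-time trade-offs listed above.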

Name Mangling

The Planner generates a unique mangled name for each instantiation, following Itanium C++ ABI conventions:


List[i32]         → _ZN4ListIiE...
List[String]      → _ZN4ListI6StringE...
HashMap[String, i32] → _ZN7HashMapI6StringiE...
      

These names are compatible with c++filt for debugging.
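The visible part of those names follows a simple pattern: `_ZN`, a length-prefixed type name, then the type arguments between `I` and `E`. The Python sketch below reproduces only the prefixes shown above; whatever follows in the real names (the `...`) is not modeled. `mangle_prefix` and `encode` are hypothetical helpers, not the Planner's API, and the builtin table holds only the one code that appears in the examples.

```python
BUILTIN_CODES = {"i32": "i"}   # only the code visible in the examples

def encode(name):
    """Encode a type name: a builtin code or a length-prefixed identifier."""
    return BUILTIN_CODES.get(name) or f"{len(name)}{name}"

def mangle_prefix(generic_name, type_args):
    """Build the _ZN<name>I<args>E prefix of a mangled instantiation."""
    args = "".join(encode(arg) for arg in type_args)
    return f"_ZN{len(generic_name)}{generic_name}I{args}E"
```

For instance, `mangle_prefix("HashMap", ["String", "i32"])` yields `_ZN7HashMapI6StringiE`, matching the third example.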

Lifetimes and Memory Regions

Scaly uses Region-Based Memory Management (RBMM). Instead of garbage collection or manual malloc/free, values are allocated in memory regions (pages) that are deallocated in bulk when their owning scope exits.
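The core idea, allocation into a region followed by one bulk release when the scope exits, can be sketched with a Python context manager. This is a toy model rather than Scaly's page implementation: the `Page` class here is hypothetical and tracks Python objects instead of raw memory.

```python
class Page:
    """Toy region: objects are added one by one, then released together."""

    def __init__(self):
        self.objects = []
        self.released = False

    def allocate(self, obj):
        if self.released:
            raise RuntimeError("page already released")
        self.objects.append(obj)
        return obj

    def release(self):
        self.objects.clear()     # everything on the page goes at once
        self.released = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()           # the owning scope exits: bulk deallocation
        return False

with Page() as page:
    page.allocate("parser")
    page.allocate("ast")
    live_inside = len(page.objects)
```

No per-object free is ever issued; leaving the `with` block plays the role of the owning scope exiting.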

Stack vs Page Allocation

The presence or absence of a lifetime suffix on a constructor call determines whether the value is stack-allocated or page-allocated:

No suffix = Stack allocation

The value is allocated on the stack frame. Fast, automatic cleanup when the function returns. Cannot outlive the current function.

Lifetime suffix = Page allocation

The value is allocated on a memory page. Can outlive the current block depending on which lifetime is used.


let stack_point Point(10, 20)     ; stack-allocated (by value)
let page_point Point$(10, 20)     ; page-allocated (local page)
      

Note: The lifetime suffix comes before the constructor parameters, not after. This follows the general pattern: Type + Generics + Lifetime + Parameters.

Lifetime Kinds

Local ($)

Value is allocated on a local page that lives until the end of the current block. The page is lazily allocated on first use and automatically deallocated when the block exits.

Caller (#)

Value is allocated on the caller's return page. It survives the function return and is managed by the caller. Used for returning heap-allocated values.

Thrown (!)

Value is allocated on the exception page. Used for error values that may be thrown and caught by the caller.

Reference (^name)

Value is allocated on the same page as the named variable. The named variable must itself be page-allocated (not stack-allocated). Used when adding values to collections.
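The four kinds, plus the no-suffix case, amount to a dispatch from suffix to page. The Python sketch below models that choice. `select_page` is a hypothetical helper, pages are represented by plain strings, and `named_pages` is an assumed table from page-allocated variable names to their pages.

```python
STACK = None   # marker: no page involved, the value lives on the stack

def select_page(suffix, local_page, caller_page, exception_page,
                named_pages=None):
    """Map a constructor's lifetime suffix to the page it allocates on."""
    fixed = {"$": local_page, "#": caller_page, "!": exception_page}
    if suffix == "":
        return STACK
    if suffix in fixed:
        return fixed[suffix]
    if suffix.startswith("^"):
        name = suffix[1:]
        page = (named_pages or {}).get(name)
        if page is None:
            # Stack-allocated variables have no entry in named_pages.
            raise ValueError(f"cannot use ^{name}: {name} is not page-allocated")
        return page
    raise ValueError(f"unknown lifetime suffix {suffix!r}")
```

A `^name` suffix whose variable has no page is rejected, which corresponds to the requirement that the named variable be page-allocated.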

Local Lifetime ($)

Local lifetime allocates on a page scoped to the current block. The page is lazily allocated (only when needed) and automatically freed when the block exits:


function process(input: String) {
    if condition {
        let parser Parser$(input)    ; allocated on local page
        parser.parse()
    }                                 ; page deallocated here

    ; parser and its page are gone
}
      

Important: Local lifetime ($) is forbidden on return types. A function cannot return a value with local lifetime because the local page is deallocated before the caller receives the value.


; ERROR: local lifetime ($) not allowed on return types
function bad() returns Point$ { ... }

; OK: use caller lifetime (#) or by-value return
function good() returns Point# { ... }
function also_good() returns Point { ... }
      

Caller Lifetime (#)

Values that must survive a function return use caller lifetime. The value is allocated on the caller's page, not the function's local page:


function createParser(input: String) returns pointer[Parser] {
    Parser#(input)    ; allocated on caller's page
}
      

The compiler may infer caller lifetime for values in return position when the return type specifies it.

Thrown Lifetime (!)

Error values use thrown lifetime. The value is allocated on a special exception page provided by the caller:


function parse(input: String) returns AST throws ParseError {
    if invalid(input) {
        throw ParseError!("invalid syntax")   ; thrown lifetime
    }
    ...
}
      

The compiler infers thrown lifetime for values in throw position.

Reference Lifetime (^name)

When adding a value to a collection, the value must be allocated on the same page as the collection. Use reference lifetime with the collection's name:


let items Array$()                ; items is page-allocated
let item Car^items("red")         ; item allocated on same page as items
items.add(item)                   ; safe - same lifetime
      

Validation: The compiler verifies that the referenced variable (items in this example) is page-allocated. Referencing a stack-allocated variable is an error:


let stack_car Car("blue")         ; stack-allocated (no suffix)
let other Car^stack_car("red")    ; ERROR: cannot use ^stack_car
                                  ; stack_car is not page-allocated
      

This prevents dangling references: you cannot tie a value's lifetime to a stack variable that will be destroyed when the function returns.

By-Value Returns

Functions can return values by value (no lifetime annotation). This is the simplest approach for small types:


function createPoint(x: i32, y: i32) returns Point {
    Point(x, y)    ; constructed and returned by value
}
      

By-value return avoids page allocation entirely. The value is constructed directly in the caller's stack frame or register.

Lifetime Inference

The compiler infers lifetimes where possible:

  • No suffix on constructor = stack allocation (by value)

  • Throw position implies thrown lifetime (!)

  • Reference lifetime (^container) must always be explicit

  • $, # must be explicit on constructor calls