Chapter 8. Memory Management

Table of Contents

Regions
Pages
Root Page Allocation
Object Allocation
Extension pages
Exclusive pages
Oversized pages

Scaly uses region-based memory management where objects conceptually live on the stack. No heap is used, and therefore, neither garbage collection nor reference counting is required. Objects are accessed via references within well-defined constraints which ensure safe parallel execution of the code.

Regions

With Scaly’s region-based memory management, objects are not allocated from a global heap, but from a region. A region is formed each time program execution enters a block in which at least one fresh object is assigned to a reference which is defined in that block. An example:

function f() {
    let a = new A()
    let b = createB()
    {
        let c = new C()
        {
            c.d = new D()
        }
    }
}

When a thread enters f, the outermost block of that function allocates a region in which two objects are created: one object of type A to which the reference a points is created with new, and then one object of type B to which the reference b points, is created by a function createB() (which might create the object by itself using new or call another function which creates the object).

Then, the next block is entered, and a new region is allocated in which an object of type C is created and then assigned to the reference c.

The innermost block of the function, however, creates no region because no object is assigned to a reference which is defined in that block. The new D object is created in C's region instead since it is assigned to d which is a member of d.

When the block which contains c is left, its region is recycled, and both the object to which c points and the object to which its member reference d points cease to exist.

When the thread leaves f, the region which was created by its outermost block is recycled, too, and the objects to which a and b pointed, vanish as well.

References like a, b, and c which are defined in a block are called root references. A root reference cannot be returned by a function because the region in which the object lives to which that reference points is already recycled since its block is left before the function returns.

References like d are called local references. A local reference points to an object which lives in the region of a containing object as a member, array element, or dictionary key or value. A local reference can only be returned from a function if it is contained in an object which was created before the function was entered.

The bootstrapping compiler which is currently the only (known) implementation of Scaly is written in C++. It uses a special form of the C++ new operator which is called placement new to create new objects in the appropriate region. The runtime library which comes with the compiler provides the necessary mechanisms for managing the memory in regions.

Pages

This compiler would compile the above function to a C++ function to something like the following:

void f() {
    _Region _region; _Page* _p = _region.get();
    A* a = new(_p) A();
    B* b = createB(_p);
    {
        _Region _region; _Page* _p = _region.get();
        C* c = new(_p) C();
        {
            c->d = new(c->getPage()) D();
        }
    }
}

When the function is entered, a _Region object is created for its block to provide memory for allocating objects which are rooted in this block. The memory of a region is divided in pages of constant size (typically 4 or 8 kB). Therefore, a _Page object _p is obtained from that region. Then a new object of type A is created in _p using the placement variant of new.

All classes implement the placement new operator by getting memory from the page in a way similar to this code (some error handling removed for brevity):

void* Object::operator new(size_t size, _Page* page) {
    return page->allocateObject(size); }

Next, the createB function is used to create a new B object in our current page _p. Since createB needs to know in which page the object is to be created, it is passed to that function. The function createB can use placement new to create the object or use another function, passing the pointer to the page to it.

Then, the next block is entered, and a new region is created and a page is obtained from that page which is used to create a new C object.

In the innermost block, a D object is created in the page of c. For that reason, c is asked to provide its page for creating the object in. All classes implement a function _getPage() for this purpose:

_Page* Object::_getPage() {
    return _Page::getPage(this);
}

Since pages are always aligned and have a size of a multiple of 2, the page of an address can be calculated easily by setting the relevant lower bits of the address to zero:

_Page* _Page::getPage(void* address) {
    return (_Page*) (((intptr_t)address) & ~(intptr_t)(_pageSize - 1));
}

Root Page Allocation

We have seen that each region provides a page upon block entry. The _Region class allocates its page out of a large chunk of memory which is thread local:

__thread _Page* __CurrentPage = 0;

At thread start, before regions kick in, this memory is allocated:

posix_memalign((void**)&__CurrentPage, _pageSize, _pageSize * _maxStackPages);

The _Region class itself actually is only a very small helper. At block entry, it shifts the __CurrentPage pointer up and initializes a _Page object at this position:

_Region::_Region(){
    __CurrentPage = (_Page*)(((char*)__CurrentPage) + _pageSize);
    __CurrentPage->reset();
}

At block exit, the extension Pages and exclusive Pages of the page (we will come to this later) are deallocated, and the __CurrentPage pointer is shifted down:

_Region::~_Region() {
    __CurrentPage->deallocateExtensions();
    __CurrentPage = (_Page*)(((char*)__CurrentPage) - _pageSize);
}

This way, the memory area accessed by the __CurrentPage pointer during thread execution forms a logical stack, and the pages which live in this area are called root pages (as opposed to extension pages which we will cover in a later section of this chapter). Allocating a fresh region is done by shifting a thread-local pointer.

Object Allocation

As we have seen, the _Region class itself only shifts the current root page pointer up and down the root page stack as thread execution enters and leaves blocks which declare root references.

Most of the memory management logic is performed by the _Page class. This class provides memory for allocating objects and, if required, extends the available memory by allocating a new page if its own memory is exhausted.

A page is a memory area which is aligned to multiples of the page size which is usually 4 or 8 kB. This memory area is controlled by a _Page object which lives at the start address of the page and contains (among a few other things) an offset to the next object to be allocated, counted from the page start address:

int nextObjectOffset;

If the space fits for a particular object to allocate, allocation is done by adding to nextObjectOffset and returning the current location. Here a code snippet of the allocateObject method of _Page which is used by the placement new operator of Object:

void* location = getNextObject();
void* nextLocation = align((char*)location + size);
if (nextLocation <= (void*) getNextExclusivePageLocation()) {
    setNextObject(nextLocation);
    return location; }

Here are two of the three helpers used (getNextExclusivePageLocation() will be explained later):

void* _Page::getNextObject() {
    return (char*)this + nextObjectOffset; }
void _Page::setNextObject(void* object) {
    nextObjectOffset = (char*)object - (char*)this; }

With this code, allocating an object is done taking the following steps:

  1. The location of the next object to be allocated is calculated by adding nextObjectOffset to the start of the page
  2. The upper boundary of the object is calculated by adding the object size to location and aligning the result
  3. If the upper boundary is within limits, nextObjectOffset is calculated, and location is returned as the new object address.

Thus, object allocation typically boils down to adding to an offset and checking it.

Extension pages

An extension page is allocated and used if the current page cannot provide enough memory to allocate an object of the desired size. If the size required exceeds the maximum size which can be stored in a fresh page, an oversized page is allocated directly from the operating system via posix_memalign and is registered with the current page as an exclusive page. We will discuss exclusive pages in one of the following sections. If the size required does fit in a fresh page, an extension page is allocated by the following method of the _Page class:

_Page* _Page::allocateExtensionPage() {
    _Page* extensionPage = __CurrentTask->getExtensionPage();
    if (!extensionPage)
        return 0;
    extensionPage->reset();
    *getExtensionPageLocation() = extensionPage;
    currentPage = extensionPage;
    return extensionPage;
}

The getExtensionPage() method of the _Task class which is called first, allocates a page out of a thread-local page pool whose start-up size is 4096 pages (one chunk) and which can extend itself by adding more chunks. The pages in a chunk are bit mapped in an allocation map of that chunk, and allocating a page in a chunk is a quick constant-time action.

Then, after the extension page is initialized with placement new, its address is stored at a memory location at the very end of the page to be accessed at deallocation time:

_Page** _Page::getExtensionPageLocation() {
    return ((_Page**) ((char*) this + _pageSize)) - 1; }

Before allocateExtensionPage() returns, the extension page location is stored also in the currentPage member as a shortcut to a page where allocation space should be available. This member is updated also if a further allocation attempt results in an object which lives in an even newer extension page.

This way, allocation can start from a root page forming a chain of extension pages if many objects are allocated in the region of that root page. When this region is left, the deallocateExtensions() method deallocates all extension pages by freeing them in the thread local page pool.

Exclusive pages

We have seen that a region can grow by extending its root page with an extension page, which can in turn be extended by another extension page, and so on. When thread execution leaves the region, all extension pages are deallocated from the page pool. But while our region is alive, we have seen no way for deleting objects from a region — but that is what we need for mutable object data, because otherwise we would leak memory by abandoning objects pointed to by mutable references if we set those references to fresh objects.

For this reason, fresh objects which are assigned to mutable references, get their own (root) page, a so-called exclusive page. An exclusive page can be easily deallocated if its leading object is deleted while its region remains active.

Here are the relevant code lines which allocate and register an exclusive page:

_Page* exclusivePage = __CurrentTask->getExtensionPage();
exclusivePage->reset();
*getNextExclusivePageLocation() = exclusivePage;
exclusivePages++;

The number of active exclusive pages is stored in the int member exclusivePages of the page, and the pointer to the next exclusive page address is calculated by the following helper method:

_Page** _Page::getNextExclusivePageLocation() {
    return getExtensionPageLocation() - exclusivePages - 1; }

Exclusive pages are allocated and initialized in the same way as an extension page, but they are registered not at the very end of the current page but in the next lower addresses. Several exclusive pages can be allocated and registered with a page, while their locations are stored in an area which grows downward from the end of that page (while the space allocated by objects grows from the start of the page).

If an exclusive page is deallocated, it is unregistered with the page holding it by shifting all lower exclusive page addresses up if applicable and decrementing exclusivePages.

Oversized pages

String and array buffers can have sizes that exceed the net space of a freshly allocated page. If we are to store such a thing, we allocate the necessary space directly from the operating system:

_Page* page;
posix_memalign((void**)&page, _pageSize, size + sizeof(_Page));
page->reset();
page->currentPage = nullptr;
*getNextExclusivePageLocation() = page;
exclusivePages++;
return ((char*)page) + sizeof(_Page);

Thus, we initialize the memory area as a page, mark it as oversized by setting currentPage to nullptr, register the page as an exclusive page and return the memory address directly after the page object data.