Table of Contents
Scaly uses region-based memory management where objects conceptually live on the stack. No heap is used, and therefore, neither garbage collection nor reference counting is required. Objects are accessed via references within well-defined constraints which ensure safe parallel execution of the code.
With Scaly’s region-based memory management, objects are not allocated from a global heap, but from a region. A region is formed each time program execution enters a block in which at least one fresh object is assigned to a reference which is defined in that block. An example:
function f() { let a = new A() let b = createB() { let c = new C() { c.d = new D() } } }
When a thread enters f
, the outermost block of that function allocates a region in which
two objects are created: one object of type A
to which the reference a
points is created with
new
, and then one object of type B
to which the reference b
points, is created by a function
createB()
(which might create the object by itself using new
or call another function
which creates the object).
Then, the next block is entered, and a new region is allocated in which an object of type C
is created and then assigned to the reference c
.
The innermost block of the function, however, creates no region because no object is assigned to a
reference which is defined in that block. The new D
object is created in C
's region instead
since it is assigned to d
which is a member of d
.
When the block which contains c
is left, its region is recycled, and both the object to which c
points and the object to which its member reference d
points cease to exist.
When the thread leaves f
, the region which was created by its outermost block is recycled, too,
and the objects to which a
and b
pointed, vanish as well.
References like a
, b
, and c
which are defined in a block are called root references.
A root reference cannot be returned by a function because the region in which the object lives
to which that reference points is already recycled since its block is left
before the function returns.
References like d
are called local references. A local reference
points to an object which lives in the region of a containing object as a member, array element,
or dictionary key or value. A local reference can only be returned from a function if it is
contained in an object which was created before the function was entered.
The bootstrapping compiler which is currently the only (known) implementation of Scaly is written
in C++. It uses a special form of the C++ new
operator which is called placement new
to create new objects in the appropriate region. The runtime library which comes with the compiler
provides the necessary mechanisms for managing the memory in regions.
This compiler would compile the above function to a C++ function to something like the following:
void f() { _Region _region; _Page* _p = _region.get(); A* a = new(_p) A(); B* b = createB(_p); { _Region _region; _Page* _p = _region.get(); C* c = new(_p) C(); { c->d = new(c->getPage()) D(); } } }
When the function is entered, a _Region
object is created for its block to provide memory for
allocating objects which are rooted in this block. The memory of a region is divided in pages
of constant size (typically 4 or 8 kB). Therefore, a _Page
object _p
is obtained
from that region. Then a new object of type A
is created in _p
using the placement
variant of new
.
All classes implement the placement new operator by getting memory from the page in a way similar to this code (some error handling removed for brevity):
void* Object::operator new(size_t size, _Page* page) { return page->allocateObject(size); }
Next, the createB
function is used to create a new B
object in our current page _p
.
Since createB
needs to know in which page the object is to be created,
it is passed to that function. The function createB
can use placement new to create the object
or use another function, passing the pointer to the page to it.
Then, the next block is entered, and a new region is created and a page is obtained from that page
which is used to create a new C
object.
In the innermost block, a D
object is created in the page of c
. For that reason, c
is asked
to provide its page for creating the object in. All classes implement a function _getPage()
for
this purpose:
_Page* Object::_getPage() { return _Page::getPage(this); }
Since pages are always aligned and have a size of a multiple of 2, the page of an address can be calculated easily by setting the relevant lower bits of the address to zero:
_Page* _Page::getPage(void* address) { return (_Page*) (((intptr_t)address) & ~(intptr_t)(_pageSize - 1)); }
We have seen that each region provides a page upon block entry. The _Region
class allocates its
page out of a large chunk of memory which is thread local:
__thread _Page* __CurrentPage = 0;
At thread start, before regions kick in, this memory is allocated:
posix_memalign((void**)&__CurrentPage, _pageSize, _pageSize * _maxStackPages);
The _Region
class itself actually is only a very small helper.
At block entry, it shifts the __CurrentPage
pointer up and initializes a _Page
object
at this position:
_Region::_Region(){ __CurrentPage = (_Page*)(((char*)__CurrentPage) + _pageSize); __CurrentPage->reset(); }
At block exit, the extension Pages and exclusive Pages of the page (we will come to this later)
are deallocated, and the __CurrentPage
pointer is shifted down:
_Region::~_Region() { __CurrentPage->deallocateExtensions(); __CurrentPage = (_Page*)(((char*)__CurrentPage) - _pageSize); }
This way, the memory area accessed by the __CurrentPage
pointer during thread execution
forms a logical stack, and the pages which live in this area are called root pages
(as opposed to extension pages which we will cover in a later section of this chapter).
Allocating a fresh region is done by shifting a thread-local pointer.
As we have seen, the _Region
class itself only shifts the current root page pointer up and
down the root page stack as thread execution enters and leaves blocks which declare root references.
Most of the memory management logic is performed by the _Page
class. This class provides memory
for allocating objects and, if required, extends the available memory by allocating a new page
if its own memory is exhausted.
A page is a memory area which is aligned to multiples of the page size which is usually 4 or 8 kB.
This memory area is controlled by a _Page
object which lives at the start address of the page
and contains (among a few other things) an offset to the next object to be allocated,
counted from the page start address:
int nextObjectOffset;
If the space fits for a particular object to allocate, allocation is done by adding to
nextObjectOffset
and returning the current location. Here a code snippet of the
allocateObject
method of _Page
which is used by the placement new operator of Object
:
void* location = getNextObject(); void* nextLocation = align((char*)location + size); if (nextLocation <= (void*) getNextExclusivePageLocation()) { setNextObject(nextLocation); return location; }
Here are two of the three helpers used (getNextExclusivePageLocation()
will be explained later):
void* _Page::getNextObject() { return (char*)this + nextObjectOffset; }
void _Page::setNextObject(void* object) { nextObjectOffset = (char*)object - (char*)this; }
With this code, allocating an object is done taking the following steps:
location
of the next object to be allocated is calculated by adding nextObjectOffset
to the start of the page
location
and aligning the result
nextObjectOffset
is calculated, and location
is returned as the new object address.
Thus, object allocation typically boils down to adding to an offset and checking it.
An extension page is allocated and used if the current page cannot provide enough memory
to allocate an object of the desired size. If the size required exceeds the maximum size
which can be stored in a fresh page, an oversized page is allocated directly from the operating
system via posix_memalign
and is registered with the current page as an exclusive page. We will discuss exclusive pages in one
of the following sections. If the size required does fit in a fresh page,
an extension page is allocated by the following method of the _Page
class:
_Page* _Page::allocateExtensionPage() { _Page* extensionPage = __CurrentTask->getExtensionPage(); if (!extensionPage) return 0; extensionPage->reset(); *getExtensionPageLocation() = extensionPage; currentPage = extensionPage; return extensionPage; }
The getExtensionPage()
method of the _Task
class which is called first, allocates a page out of
a thread-local page pool whose start-up size is 4096 pages (one chunk) and which can extend itself
by adding more chunks. The pages in a chunk are bit mapped in an allocation map of that chunk, and
allocating a page in a chunk is a quick constant-time action.
Then, after the extension page is initialized with placement new, its address is stored at a memory location at the very end of the page to be accessed at deallocation time:
_Page** _Page::getExtensionPageLocation() { return ((_Page**) ((char*) this + _pageSize)) - 1; }
Before allocateExtensionPage()
returns, the extension page location is stored also in the
currentPage
member as a shortcut to a page where allocation space should be available.
This member is updated also if a further allocation attempt results in an object which lives in
an even newer extension page.
This way, allocation can start from a root page forming a chain of extension pages if many objects
are allocated in the region of that root page. When this region is left,
the deallocateExtensions()
method deallocates all extension pages
by freeing them in the thread local page pool.
We have seen that a region can grow by extending its root page with an extension page, which can
in turn be extended by another extension page, and so on. When thread execution leaves the region,
all extension pages are deallocated from the page pool. But while our region is alive,
we have seen no way for deleting objects from a region — but that is what we need
for mutable
object data, because otherwise we would leak memory by abandoning objects
pointed to by mutable
references if we set those references to fresh objects.
For this reason, fresh objects which are assigned to mutable
references, get their own (root)
page, a so-called exclusive page. An exclusive page can be easily deallocated
if its leading object is deleted while its region remains active.
Here are the relevant code lines which allocate and register an exclusive page:
_Page* exclusivePage = __CurrentTask->getExtensionPage(); exclusivePage->reset(); *getNextExclusivePageLocation() = exclusivePage; exclusivePages++;
The number of active exclusive pages is stored in the int
member exclusivePages
of the page,
and the pointer to the next exclusive page address is calculated by the following helper method:
_Page** _Page::getNextExclusivePageLocation() { return getExtensionPageLocation() - exclusivePages - 1; }
Exclusive pages are allocated and initialized in the same way as an extension page, but they are registered not at the very end of the current page but in the next lower addresses. Several exclusive pages can be allocated and registered with a page, while their locations are stored in an area which grows downward from the end of that page (while the space allocated by objects grows from the start of the page).
If an exclusive page is deallocated, it is unregistered with the page holding it
by shifting all lower exclusive page addresses up if applicable and decrementing exclusivePages
.
String and array buffers can have sizes that exceed the net space of a freshly allocated page. If we are to store such a thing, we allocate the necessary space directly from the operating system:
_Page* page; posix_memalign((void**)&page, _pageSize, size + sizeof(_Page)); page->reset(); page->currentPage = nullptr; *getNextExclusivePageLocation() = page; exclusivePages++; return ((char*)page) + sizeof(_Page);
Thus, we initialize the memory area as a page, mark it as oversized by setting currentPage
to nullptr
, register the page as an exclusive page and return the memory address
directly after the page object data.