Step 06 · Allocation strategies
Heap::alloc is a thin wrapper over std::malloc. That's the
simplest correct allocator, but real runtimes layer specialised ones
on top because the alloc fast-path runs every few instructions.
Bump allocation
arena: [................................]
^ free
free += bytes
- Cost: one add, one compare. ~5 cycles.
- Used by: every copying GC (because the from-space is wiped each collection, the bump pointer resets).
Free lists
Maintain per-size buckets of freed objects:
size 16: → block → block → block → ...
size 32: → block → ...
Allocation: pop head of the right bucket. Sweep: push freed objects onto the right bucket. Pros: reuses fragmented space. Cons: slower than bump; harder to reason about live size.
Thread-Local Allocation Buffers (TLABs)
In a multi-threaded VM, every thread reserves a chunk of the global heap and bump-allocates inside it. No atomics on the fast path.
Thread 1 TLAB: [................xxxxxxxx] xxxx = live
Thread 2 TLAB: [............xxxxxxxxxxx]
Global heap: [TLAB 1 ][TLAB 2 ][...]
When a thread's TLAB is exhausted it locks the global heap, claims a new one, and continues. The lock-free fast path is critical to multi-threaded performance.
Generational allocation
nursery (small, fast): bump alloc → frequent minor GC
tenured (large, slow): mark-sweep or mark-compact → rare major GC
Most objects die young → minor GC is fast and cheap. Survivors get promoted to the tenured space.
This requires a write barrier: when an old object's field is set to a young object, record the cross-generation pointer so the minor GC can find young roots without scanning the entire old generation.
inline void writeBarrier(Object* parent, Value child) {
if (child.isObject() && parent->isOld() && child.asObject()->isYoung())
rememberedSet.add(parent);
}
cp-14's runtime doesn't implement any of these — but the linked-list
object table and explicit roots make it easy to swap in any of them
later. The right abstraction is "all alloc paths go through one
function" — which we have.