Step 07 · Beyond mark-sweep
The collector zoo, roughly in order of complexity:
Reference counting
Every object keeps a counter; increment on assign, decrement on overwrite/scope exit. Free when count hits 0.
- Pros: deterministic — finalisers run promptly. Memory footprint predictable.
- Cons: cycles leak (need a backup tracing GC). Atomic refcounts in multi-threaded code are expensive.
- Users: CPython (with cycle-detector backup), Swift, Rust's
Rc<T>.
Mark-sweep (cp-14)
What we built. Simple, non-relocating, fragmenting.
Mark-compact
After mark, slide survivors to one end of the heap (or use a forwarding-pointer Cheney pass). Eliminates fragmentation. Pointers need updating — easier with precise stack maps; impossible with conservative scanning.
Copying (Cheney)
Two equally-sized spaces: from-space and to-space. Allocation bump- allocates in from-space. On GC, walk roots and copy live objects into to-space, leaving forwarding pointers in from-space. Flip spaces. From-space is now garbage; bump pointer resets.
- Pros: O(live) work, no fragmentation, blazing-fast alloc.
- Cons: half the heap is always wasted; relocates objects.
- Users: most young generations (because young set is small).
Generational
Combine: copying for the nursery, mark-compact for the tenured. The Hotspot, V8, and SpiderMonkey GCs are all generational variants of this idea.
Concurrent / incremental
Run mark and/or sweep on a separate thread concurrently with the mutator. Needs:
- Write barriers (Yuasa/Dijkstra) so the marker doesn't lose objects that change mid-flight.
- Read barriers for relocating concurrent collectors (Shenandoah, ZGC).
Sub-millisecond GC pauses on multi-GB heaps are now standard.
Region-based / arena
Allocate from a context-bound arena; free the entire arena at once. No collector needed; tracks a region per request/task. Used by Rust allocators, Apache, OCaml's minor heap variant, Zig, server runtimes.
What to choose
- Educational implementation: mark-sweep (cp-14).
- Embeddable, single-threaded: mark-sweep or refcount (CPython).
- Server, GB-scale heap, low-pause: concurrent generational (Hotspot G1, ZGC).
- Soft-realtime / latency-critical: region-based or Shenandoah/ZGC.
- Embedded / no malloc: arena + free list.
Stack maps and codegen integration
cp-17's capstone wires this runtime into the cp-12 JIT. The minimal
addition: every call-site stack-map descriptor, plus a "safepoint
poll" inserted into loop headers. We do that on top of LLVM's
@llvm.gcroot / Statepoints intrinsics rather than reinventing the
metadata format.