Step 07 · Beyond mark-sweep

The collector zoo, roughly in order of complexity:

Reference counting

Every object keeps a counter; increment on assign, decrement on overwrite/scope exit. Free when count hits 0.

  • Pros: deterministic — finalisers run promptly. Memory footprint predictable.
  • Cons: cycles leak (need a backup tracing GC). Atomic refcounts in multi-threaded code are expensive.
  • Users: CPython (with cycle-detector backup), Swift, Rust's Rc<T>.

Mark-sweep (cp-14)

What we built. Simple, non-relocating, fragmenting.

Mark-compact

After mark, slide survivors to one end of the heap (or use a forwarding-pointer Cheney pass). Eliminates fragmentation. Pointers need updating — easier with precise stack maps; impossible with conservative scanning.

Copying (Cheney)

Two equally-sized spaces: from-space and to-space. Allocation bump- allocates in from-space. On GC, walk roots and copy live objects into to-space, leaving forwarding pointers in from-space. Flip spaces. From-space is now garbage; bump pointer resets.

  • Pros: O(live) work, no fragmentation, blazing-fast alloc.
  • Cons: half the heap is always wasted; relocates objects.
  • Users: most young generations (because young set is small).

Generational

Combine: copying for the nursery, mark-compact for the tenured. The Hotspot, V8, and SpiderMonkey GCs are all generational variants of this idea.

Concurrent / incremental

Run mark and/or sweep on a separate thread concurrently with the mutator. Needs:

  • Write barriers (Yuasa/Dijkstra) so the marker doesn't lose objects that change mid-flight.
  • Read barriers for relocating concurrent collectors (Shenandoah, ZGC).

Sub-millisecond GC pauses on multi-GB heaps are now standard.

Region-based / arena

Allocate from a context-bound arena; free the entire arena at once. No collector needed; tracks a region per request/task. Used by Rust allocators, Apache, OCaml's minor heap variant, Zig, server runtimes.

What to choose

  • Educational implementation: mark-sweep (cp-14).
  • Embeddable, single-threaded: mark-sweep or refcount (CPython).
  • Server, GB-scale heap, low-pause: concurrent generational (Hotspot G1, ZGC).
  • Soft-realtime / latency-critical: region-based or Shenandoah/ZGC.
  • Embedded / no malloc: arena + free list.

Stack maps and codegen integration

cp-17's capstone wires this runtime into the cp-12 JIT. The minimal addition: every call-site stack-map descriptor, plus a "safepoint poll" inserted into loop headers. We do that on top of LLVM's @llvm.gcroot / Statepoints intrinsics rather than reinventing the metadata format.