cp-06 — Bytecode Compiler (AST → Stack-VM Chunks)
Status: ✅ Implemented.
Replaces the tree-walking model with compile-then-run: AST → flat array of bytecodes ("chunk"). The chunks are executed in cp-07.
What's Built
- Op enum: a 32-instruction bytecode ISA covering stack manipulation, globals/locals access, arithmetic/logic/comparison, control flow (JUMP, JUMP_IF_FALSE, LOOP), I/O, plus reserved opcodes (CALL, RETURN, CLOSURE, upvalues) that cp-07 will activate.
- Chunk: bytecode array + deduplicated constants pool + parallel line table.
- Compiler: AST visitor (both ExprVisitor<void> and StmtVisitor) that emits bytecode while tracking lexical locals as stack slots.
- disassembler: human-readable dump for debugging and unit testing.
- mlc CLI: mlc file.ml compiles a file and prints the chunk; mlc alone reads stdin.
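The Chunk described above (bytecode array, deduplicated constants pool, parallel line table) can be sketched roughly as follows. This is a minimal illustration, not the code in src/cpp: the `Value` alias, the member names, and the linear-scan deduplication are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <variant>
#include <vector>

// Hypothetical value type; the real one in src/cpp may differ.
using Value = std::variant<double, bool, std::string>;

struct Chunk {
    std::vector<std::uint8_t> code;  // flat bytecode stream
    std::vector<int> lines;          // lines[i] is the source line of code[i]
    std::vector<Value> constants;    // deduplicated constants pool

    // Append one byte and record its source line in the parallel table.
    void write(std::uint8_t byte, int line) {
        code.push_back(byte);
        lines.push_back(line);
    }

    // Return the pool index of v, reusing an existing equal entry if present.
    // A linear scan is enough for a sketch; a hash map would scale better.
    std::size_t addConstant(const Value& v) {
        for (std::size_t i = 0; i < constants.size(); ++i) {
            if (constants[i] == v) return i;
        }
        constants.push_back(v);
        return constants.size() - 1;
    }
};
```

Keeping `lines` the same length as `code` costs memory but makes the line lookup for any byte offset a single index operation, which is convenient for a disassembler.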
Architecture
source → Lexer → Parser → Resolver → TypeChecker → Compiler → Chunk
                                                                │
                                                                └─→ Disassembler → text
The frontend (lex/parse/resolve/typecheck) is unchanged from cp-05; we re-use it. The interpreter was deleted. The new backend stages are Compiler and Disassembler. The tree-walker's Environment chain is gone — locals are stack slots, globals live in a (future) runtime hash table keyed by name strings interned in the constants pool.
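To make "locals are stack slots" concrete, here is a hedged sketch of compile-time slot resolution. The names `Local`, `LocalTable`, `declare`, and `resolve` are hypothetical and chosen for illustration; the actual Compiler may track scopes differently. A name that fails to resolve would be compiled as a global lookup by name.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

struct Local {
    std::string name;
    int depth;  // scope depth at which the local was declared
};

struct LocalTable {
    std::vector<Local> locals;  // index in this vector == stack slot
    int scopeDepth = 0;         // 0 means global scope

    void beginScope() { ++scopeDepth; }

    // Leave a scope and forget its locals; returns how many slots to pop.
    std::size_t endScope() {
        std::size_t popped = 0;
        while (!locals.empty() && locals.back().depth == scopeDepth) {
            locals.pop_back();
            ++popped;
        }
        --scopeDepth;
        return popped;
    }

    void declare(const std::string& name) {
        locals.push_back({name, scopeDepth});
    }

    // Search innermost-first so shadowing works.
    // std::nullopt means "not a local": treat the name as a global.
    std::optional<std::size_t> resolve(const std::string& name) const {
        for (std::size_t i = locals.size(); i-- > 0;) {
            if (locals[i].name == name) return i;
        }
        return std::nullopt;
    }
};
```

Because the slot index is known at compile time, a local access needs only a small integer operand, while a global access carries a constants-pool index for the name string.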
Reading Order
- CONCEPTS.md: stack machines, bytecode design, operand encoding, why this is faster than tree-walking.
- steps/01-instruction-set-design.md
- steps/02-the-chunk.md
- steps/03-emit-helpers-and-jumps.md
- steps/04-locals-vs-globals.md
- steps/05-control-flow.md
- steps/06-short-circuit-logic.md
- steps/07-disassembler-and-testing.md
- src/cpp/: actual code.
Build & Run
cd src/cpp
cmake -S . -B build -G "Unix Makefiles"
cmake --build build -j
ctest --test-dir build --output-on-failure
Then disassemble a program:
echo 'let n = 10; print n * (n + 1) / 2;' | ./build/mlc
Outcomes
After reading the code and steps you can:
- Design a bytecode instruction set from first principles, justifying every operand width.
- Compile a typed AST to a flat, executable form using a single forward pass.
- Encode if/else, while, and short-circuit &&/|| using only conditional jumps.
- Resolve identifier references to stack slots (locals) vs hash lookups (globals).
- Disassemble chunks for debugging and assert on the byte stream in unit tests.
- Articulate the trade-offs between stack VMs (this) and register VMs (Lua, Dalvik).
- Identify what's deferred to cp-07 (call frames, closures, CALL/RETURN, GC) and why each requires a runtime.
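Encoding if/else and while with only conditional jumps in a single forward pass usually relies on back-patching: emit the jump with a placeholder operand, compile the body, then fill in the distance. The sketch below assumes a 2-byte big-endian jump operand and made-up opcode values; neither is confirmed by this README, and `emitJump`/`patchJump` are illustrative names.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical opcode values; the real numbering in src/cpp may differ.
enum : std::uint8_t { OP_JUMP = 0x10, OP_JUMP_IF_FALSE = 0x11 };

// Emit a jump with a 2-byte placeholder operand.
// Returns the offset of the operand so it can be patched later.
std::size_t emitJump(std::vector<std::uint8_t>& code, std::uint8_t op) {
    code.push_back(op);
    code.push_back(0xff);
    code.push_back(0xff);
    return code.size() - 2;
}

// Overwrite the placeholder with the distance from the byte just past
// the operand to the current end of the code.
void patchJump(std::vector<std::uint8_t>& code, std::size_t operandAt) {
    std::size_t distance = code.size() - operandAt - 2;
    if (distance > 0xffff) {
        throw std::length_error("jump too far for a 16-bit operand");
    }
    code[operandAt] = static_cast<std::uint8_t>((distance >> 8) & 0xff);
    code[operandAt + 1] = static_cast<std::uint8_t>(distance & 0xff);
}
```

For an if statement, the compiler would call `emitJump` with the conditional opcode after the condition, compile the then-branch, and call `patchJump` once it knows where the branch ends.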
Limitations (revisited in cp-07)
- No execution. We compile, we disassemble, we stop. The VM is cp-07's job.
- No function bodies, calls, or return. Closures need call frames and upvalues, both runtime concepts.
- Constants are capped at 256 per chunk (1-byte index). cp-07 will add CONSTANT_LONG with a 3-byte index for chunks that need more.
- No source spans for error reporting beyond line numbers. cp-15 expands this.
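As a rough illustration of what a 3-byte constant index involves, the sketch below splits an index into three operand bytes and reassembles it. CONSTANT_LONG is not implemented in this checkpoint, so the function names and the big-endian byte order here are assumptions, not the planned design.

```cpp
#include <array>
#include <cstdint>

// Split an index into three bytes, most significant first.
// Only the low 24 bits are kept, so valid indices are 0 .. 2^24 - 1.
std::array<std::uint8_t, 3> encodeIndex24(std::uint32_t index) {
    return {static_cast<std::uint8_t>((index >> 16) & 0xff),
            static_cast<std::uint8_t>((index >> 8) & 0xff),
            static_cast<std::uint8_t>(index & 0xff)};
}

// Reassemble the index from its three operand bytes.
std::uint32_t decodeIndex24(const std::array<std::uint8_t, 3>& b) {
    return (static_cast<std::uint32_t>(b[0]) << 16) |
           (static_cast<std::uint32_t>(b[1]) << 8) |
           static_cast<std::uint32_t>(b[2]);
}
```

A 1-byte operand addresses 256 constants; three bytes raise that to 16,777,216 at the cost of two extra bytes per instruction, which is why a separate long-form opcode is used only when needed.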