Step 1 — Instruction Set Design
Goal
Design a bytecode instruction set for MiniLang that:
- Is easy to emit from the AST (one or two opcodes per node).
- Is easy to execute with a tight dispatch loop.
- Encodes operands compactly (typical instruction = 1–3 bytes).
- Leaves headroom for cp-07's call frames, closures, and upvalues.
The output of this step is opcode.hpp: a single enum and its opName() lookup.
Stack Machine vs Register Machine
Two dominant designs:
| Stack machine | Register machine | |
|---|---|---|
| Examples | JVM, .NET CLR, CPython, WASM | Lua 5+, Dalvik, V8 Ignition |
| Operand encoding | Implicit (top of stack) | Explicit (register indices) |
Instruction count for a = b + c | 4 (GetB, GetC, Add, SetA) | 1 (Add a, b, c) |
| Compiler complexity | Low | High (register allocation) |
| Per-op dispatch overhead | Higher (more ops) | Lower |
| Total dispatch overhead | Comparable in practice | Comparable in practice |
We pick stack because (a) compilation is dead simple — every expression node "leaves its result on the stack", and (b) the visitor pattern compiles cleanly to stack opcodes (you don't need to thread destination registers through the visitor).
Byte-Aligned, Variable-Length Encoding
Each instruction is one opcode byte followed by zero or more inline operand bytes:
+--------+--------+--------+
| OPCODE | (operands…) |
+--------+--------+--------+
Operand widths used in cp-06:
- 1 byte for constant-pool indices (max 256 constants per chunk).
- 1 byte for local-slot indices (max 256 locals per function).
- 2 bytes (big-endian) for jump offsets (max ±32 KB jump distance).
The trade-off: variable length means the disassembler has to know each opcode's operand size, but the byte stream is smaller than fixed-width 32-bit encoding (Lua's 4-byte instructions waste bytes on RETURN, POP, etc.).
The cp-06 ISA at a glance
The 32 opcodes break into six functional groups:
Stack/literal: Constant Nil True False Pop
Variables: DefGlobal GetGlobal SetGlobal GetLocal SetLocal
Arithmetic: Add Sub Mul Div Mod Neg
Logic/compare: Not Eq Ne Lt Le Gt Ge
Control flow: Jump JumpIfFalse Loop
I/O: Print
Reserved (cp-07): Call Return Closure GetUpvalue SetUpvalue CloseUpvalue
Notice:
- No dedicated
And/Or. Short-circuit logic compiles to control flow (JumpIfFalse+Pop). See step 6. Loopis just an unconditional backward jump with au16operand (forward jumps are signed via the encoding pattern; backward jumps are explicit because the bytecode is being emitted forward and the target is already known).- No typed arithmetic ops.
Addworks on both numbers (sum) and strings (concat) — the type check at compile time guarantees the runtime knows which to do. Other VMs (JVM withiadd/fadd/…) split by type for performance; we trade that for simplicity. - Reserved opcodes are emitted by the disassembler but never produced by the cp-06 compiler. cp-07 wires them up.
Why 32 Opcodes and Not 200?
Crafting Interpreters' Lox has ~30 opcodes. Python has ~120, Java has ~200. Each extra opcode is one more branch in your dispatch switch — and as opcodes proliferate, micro-ops dilute the hot ones. Modern VMs (V8) actually go the other direction: a few big "polymorphic" opcodes that internally specialise.
For learning, fewer opcodes means less to memorise. cp-08 (TAC IR) and cp-09 (SSA) re-explore the design space; cp-13 (MLIR) lets you define your own dialects.
Self-Check
After this step you should be able to:
- Pick an instruction set design (stack vs register, variable vs fixed) and defend it.
- Predict, for a source like
a + b * c, the exact opcode sequence. - Explain why we don't have
OpAnd. - List which opcodes are reserved and what runtime concept each needs.