Step 1 — Operand Stack, Values, and the Dispatch Loop
Goal
Build the minimum VM that can execute a flat sequence of opcodes acting on a
stack of Values. At the end of this step:
push(Constant 3); push(Constant 4); Add; Print;
prints 7.
The Operand Stack
A stack virtual machine evaluates expressions by pushing intermediates to a LIFO buffer and popping them when consumed. This mirrors how a tree-walker recurses, but flattens the call sequence into a linear instruction stream.
class VM {
std::vector<Value> stack_;
void push(Value v) { stack_.push_back(std::move(v)); }
Value pop() { Value v = stack_.back(); stack_.pop_back(); return v; }
Value& peek(size_t off=0) { return stack_[stack_.size()-1-off]; }
};
Important properties:
- The stack is
vector<Value>(notarray<Value, 256>). Real VMs pre-allocate, butvectoris simpler and correct. - Every instruction has a known stack effect — e.g.
Addis-2 / +1. The compiler tracks this implicitly; bugs here manifest as crashes deep in unrelated code because the wrong value is on top.
The Value Union
A VM Value is a tagged sum, NOT a pointer. Cache-friendly, no allocation for
primitives:
enum class ValueKind : uint8_t { Nil, Bool, Number, Str, Fn };
struct Value {
ValueKind kind;
union {
bool b;
double n;
};
std::string s; // outside union — has a non-trivial destructor
FunctionPtr fn; // shared_ptr — Fn case
};
Aside — why
std::stringoutside the union? C++ unions can hold non-trivial members but you must placement-new and manually destruct them. That's a real win for production VMs (every cache miss matters) but adds 40 lines of error-prone code. cp-07 chooses clarity; cp-12 swaps in a tagged 64-bitNaN-boxedpayload for the JIT.
The Dispatch Loop
The dispatch loop is the single hottest piece of code in any interpreter. It reads an opcode, jumps to its handler, executes, repeats:
for (;;) {
auto op = Op(readByte());
switch (op) {
case Op::Constant: push(readConstant()); break;
case Op::Add: {
Value b = pop(), a = pop();
push(Value::makeNum(a.n + b.n));
break;
}
case Op::Print: {
(*out) << pop().toString() << "\n";
break;
}
case Op::Return: return Status::Ok;
// …
}
}
Why a switch?
GCC and Clang compile a dense switch over a uint8_t to a jump table:
one indirect branch per dispatch. It is hard to beat without going to inline
assembly.
What about computed goto?
Computed goto (&&Op_Add labels in a static const void* table[256]) reduces
branch predictor pressure by giving each handler its own predictor entry —
each opcode's "what comes next?" is a separate prediction. Speedups are real
(15–30%) but the code is GNU-only and harder to read. We use switch here
and revisit computed goto when we profile in cp-12.
Reading Operands
Each opcode may consume immediate bytes (operands) from the instruction stream:
uint8_t readByte() { return *ip_++; }
uint16_t readShort() { uint16_t hi = *ip_++; uint16_t lo = *ip_++; return (hi<<8)|lo; }
Value readConstant() { return chunk()->constants[readByte()]; }
The instruction pointer (ip_) is a const uint8_t* into the current chunk's
code buffer. Advancing it past the buffer end is undefined behavior — the
compiler is responsible for terminating every code path with Return.
Stack-Balance Discipline
Every opcode handler in cp-07 obeys a stack discipline:
| Opcode | Pops | Pushes |
|---|---|---|
| Constant | 0 | 1 |
| Add/Sub/... | 2 | 1 |
| Neg/Not | 1 | 1 |
| 1 | 0 | |
| Pop | 1 | 0 |
| Jump | 0 | 0 |
| JumpIfFalse | 0 | 0 |
| Call N | N+1 | 1 |
| Return | 1 | 0 (then push result into caller frame) |
When you introduce a new opcode, document its stack effect first; the compiler tracks scope depth and locals based on this contract.
Try It
cd src/cpp && cmake --build build -j
echo 'print 2 + 3 * 4;' | ./build/mlr --dump
You should see the disassembly of the script chunk followed by 14.
Pitfalls
- Popping in the wrong order. For binary ops the right-hand side is on
top:
b = pop(); a = pop();. Reverse it and1 - 2becomes1. push(pop() + …)evaluation order. C++ does not specify argument evaluation order — sequence the pops into named locals before pushing.- Re-reading from
stack_afterpush.vector::push_backcan reallocate, invalidating references. Always copyValues out before pushing.