Step 4 — Globals (Hash) vs. Locals (Slots) at Runtime
Two Worlds, One Stack
The compiler already decides per identifier whether it is local (resolved during compilation to a slot index) or global (resolved at runtime by name). Step 4 implements the runtime half.
| Kind | Storage | Access cost |
|---|---|---|
| Local | stack_[slotBase + slot] | O(1), 1 load |
| Global | unordered_map<string,Value> | O(1) avg, hash |
The compiler emits GetLocal slot / SetLocal slot for locals (resolved at
compile time), and GetGlobal nameIx / SetGlobal nameIx / DefGlobal nameIx
for globals — where nameIx is an index into the chunk's constant pool whose
value is a Value::makeStr(name).
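The compile-time decision described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual compiler: the Local record, resolveLocal, nameConstant, and emitGet are hypothetical names, the constant pool is simplified to a vector of strings, and a one-byte operand is assumed for both opcodes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical compiler-side record for one declared local.
struct Local {
    std::string name;
    int depth;  // scope depth at which the local was declared
};

enum class Op : uint8_t { GetLocal, GetGlobal };

struct Emitted {
    Op op;
    uint8_t operand;  // slot for GetLocal, constant-pool index for GetGlobal
};

// Search innermost-first so inner declarations shadow outer ones.
// Returns the slot index, or -1 if the name is not a local.
int resolveLocal(const std::vector<Local>& locals, const std::string& name) {
    for (int i = static_cast<int>(locals.size()) - 1; i >= 0; --i) {
        if (locals[i].name == name) return i;
    }
    return -1;
}

// Return the constant-pool index of `name`, adding it if absent.
uint8_t nameConstant(std::vector<std::string>& constants, const std::string& name) {
    for (std::size_t i = 0; i < constants.size(); ++i) {
        if (constants[i] == name) return static_cast<uint8_t>(i);
    }
    // Assumption: a one-byte operand limits the pool to 256 entries.
    assert(constants.size() < 256);
    constants.push_back(name);
    return static_cast<uint8_t>(constants.size() - 1);
}

// Locals win; anything not found as a local is treated as a global by name.
Emitted emitGet(const std::vector<Local>& locals,
                std::vector<std::string>& constants,
                const std::string& name) {
    int slot = resolveLocal(locals, name);
    if (slot >= 0) return Emitted{Op::GetLocal, static_cast<uint8_t>(slot)};
    return Emitted{Op::GetGlobal, nameConstant(constants, name)};
}
```

Note that an unknown name is not a compile error in this sketch: it simply becomes a GetGlobal, and the VM reports the miss at runtime.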
VM Side
case Op::DefGlobal: {
    Value name = readConstant();
    globals_[name.s] = pop();
    break;
}
case Op::GetGlobal: {
    Value name = readConstant();
    auto it = globals_.find(name.s);
    if (it == globals_.end())
        throw RuntimeError(currentLine(),
                           "undefined variable '" + name.s + "'");
    push(it->second);
    break;
}
case Op::SetGlobal: {
    Value name = readConstant();
    auto it = globals_.find(name.s);
    if (it == globals_.end())
        throw RuntimeError(currentLine(),
                           "undefined variable '" + name.s + "'");
    it->second = peek(); // assignment is an expression; leaves value on stack
    break;
}
Why SetGlobal errors if the variable doesn't exist
This distinguishes declaration from assignment: let x = 1; and
var x = 1; are declarations, while x = 2; is an assignment. Without this
check, a typo would silently create a new global, which is exactly the
JavaScript footgun we don't want.
DefGlobal, in contrast, unconditionally inserts. If the user redeclares an
existing global with another let, the resolver has already complained.
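The difference between the two opcodes can be seen in isolation with a small stand-in for the globals table. This is a sketch under assumptions: the Globals class, its method names, and the use of double for Value and std::runtime_error for RuntimeError are all illustrative substitutes, not the VM's real types.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_map>

using Value = double;  // stand-in for the VM's Value type

class Globals {
public:
    // DefGlobal semantics: insert or overwrite unconditionally.
    void define(const std::string& name, Value v) { table_[name] = v; }

    // SetGlobal semantics: the name must already exist.
    void assign(const std::string& name, Value v) {
        auto it = table_.find(name);
        if (it == table_.end())
            throw std::runtime_error("undefined variable '" + name + "'");
        it->second = v;
    }

    // GetGlobal semantics: a miss is an error, never an insertion.
    Value get(const std::string& name) const {
        auto it = table_.find(name);
        if (it == table_.end())
            throw std::runtime_error("undefined variable '" + name + "'");
        return it->second;
    }

    bool contains(const std::string& name) const {
        return table_.find(name) != table_.end();
    }

private:
    std::unordered_map<std::string, Value> table_;
};
```

With this shape, define("count", 1) followed by assign("cuont", 2) throws instead of quietly creating a second variable, and the table still contains only count afterwards.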
Why store names, not numeric ids?
Three reasons:
- REPL friendliness. In an interactive session, each entered statement is a separate compilation. Numeric ids would not survive across compilations.
- Dynamic globals. Future built-ins (print, clock, FFI bindings) inject themselves into globals_ by name without coordinating with the compiler.
- Cheap. String hashing on short identifiers is a few ns; the access pattern is dominated by cache misses in the hash table, not the hash itself.
Production VMs such as V8 and LuaJIT avoid repeating the lookup by caching its result at the access site (inline caches, or ICs), so subsequent accesses skip the hash. cp-15 covers ICs.
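The REPL argument can be illustrated with two separately compiled chunks that share one globals table. In this sketch, Chunk, defGlobal, and getGlobal are hypothetical helpers, the constant pool holds only names, and values are plain doubles. The point is that the same name sits at different constant-pool indices in each chunk, yet both resolve to the same global.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Simplified chunk: only the name constants matter for this illustration.
struct Chunk {
    std::vector<std::string> names;
};

using GlobalTable = std::unordered_map<std::string, double>;

// DefGlobal nameIx: look up the name in this chunk's pool, then insert.
void defGlobal(GlobalTable& globals, const Chunk& chunk, uint8_t nameIx, double v) {
    globals[chunk.names.at(nameIx)] = v;
}

// GetGlobal nameIx: the index is chunk-local, the name is what is shared.
double getGlobal(const GlobalTable& globals, const Chunk& chunk, uint8_t nameIx) {
    const std::string& name = chunk.names.at(nameIx);
    auto it = globals.find(name);
    if (it == globals.end())
        throw std::runtime_error("undefined variable '" + name + "'");
    return it->second;
}
```

If the first REPL line compiles to a chunk whose pool is {"x"} and the second to a chunk whose pool is {"y", "x"}, then defGlobal on index 0 of the first and getGlobal on index 1 of the second refer to the same entry. A numeric id assigned per compilation would not line up this way without extra bookkeeping.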
Locals — the entire implementation
case Op::GetLocal: {
    uint8_t slot = readByte();
    push(stack_[frame.slotBase + slot]);
    break;
}
case Op::SetLocal: {
    uint8_t slot = readByte();
    stack_[frame.slotBase + slot] = peek();
    break;
}
Two array accesses, zero hashing. This is why locals exist as a separate notion: much of the performance gap between a "scripting" VM and a "systems" VM comes down to whether identifier resolution is a slot read or a hash probe.
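To see how slotBase makes slot indices frame-relative, here is a tiny stand-alone model of the two handlers. MiniVM and its members are illustrative names only; the real VM keeps slotBase on a call frame, and values here are plain doubles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct MiniVM {
    std::vector<double> stack;
    std::size_t slotBase = 0;  // index of slot 0 for the current frame

    void push(double v) { stack.push_back(v); }

    double peek() const {
        assert(!stack.empty());
        return stack.back();
    }

    double pop() {
        assert(!stack.empty());
        double v = stack.back();
        stack.pop_back();
        return v;
    }

    // GetLocal: copy the slot's value onto the top of the stack.
    void getLocal(uint8_t slot) { push(stack[slotBase + slot]); }

    // SetLocal: store the top of the stack into the slot, leaving it on the stack.
    void setLocal(uint8_t slot) { stack[slotBase + slot] = peek(); }
};
```

For example, if two values belonging to the caller occupy indices 0 and 1 and slotBase is set to 2, then slot 0 of the callee is stack index 2. The compiler never needs to know the absolute position.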
Stack Discipline on Block Exit
When a block scope closes:
void endScope() {
    // Discard every local declared at the current (innermost) depth.
    while (!locals().empty() && locals().back().depth >= scopeDepth()) {
        emit(Op::Pop);
        locals().pop_back();
    }
    --scopeDepth();
}
This emits one Pop instruction for every local going out of scope. At
runtime, executing those Pops shrinks the stack back to the size it had at
beginScope, restoring the invariant that stack depth = number of live
locals + temporaries currently on top.
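The bookkeeping can be exercised on its own with a small tracker that records emitted instructions. This is a sketch: ScopeTracker, declare, and the code vector are hypothetical stand-ins for the compiler's real state, and only the Pop opcode is modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

enum class Op : uint8_t { Pop };

struct Local {
    std::string name;
    int depth;  // scope depth at which the local was declared
};

struct ScopeTracker {
    std::vector<Local> locals;
    std::vector<Op> code;  // emitted instructions, in order
    int scopeDepth = 0;

    void beginScope() { ++scopeDepth; }

    void declare(const std::string& name) {
        locals.push_back(Local{name, scopeDepth});
    }

    // Emit one Pop per local of the closing scope, then leave the scope.
    void endScope() {
        assert(scopeDepth > 0);
        while (!locals.empty() && locals.back().depth >= scopeDepth) {
            code.push_back(Op::Pop);
            locals.pop_back();
        }
        --scopeDepth;
    }
};
```

Declaring a in an outer block and b, c in a nested block, then closing the nested block, emits exactly two Pops and leaves a as the only tracked local, so a keeps its slot index.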
Functions on Globals
Top-level functions live in globals_ like any other value. Function calls
do:
GetGlobal "fact" ; pushes the Fn value
Constant 5 ; pushes the arg
Call 1
Recursion works because GetGlobal happens each time — by the time fact
calls itself, the global table already contains it.
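The late-binding behaviour can be mimicked in plain C++ by storing callables in a name-keyed table and looking the callee up on every call. This is an analogy rather than the VM's mechanism: Fn, FnTable, lookup, and defineFact are illustrative names, and std::function stands in for the Fn value.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <unordered_map>

using Fn = std::function<long(long)>;
using FnTable = std::unordered_map<std::string, Fn>;

// Stand-in for GetGlobal: resolve the name at the moment of the call.
Fn lookup(const FnTable& globals, const std::string& name) {
    auto it = globals.find(name);
    if (it == globals.end())
        throw std::runtime_error("undefined variable '" + name + "'");
    return it->second;
}

// Define fact so that its body refers to itself only through the table.
// The table must outlive the stored function, since it is captured by reference.
void defineFact(FnTable& globals) {
    globals["fact"] = [&globals](long n) -> long {
        if (n <= 1) return 1;
        return n * lookup(globals, "fact")(n - 1);
    };
}
```

The body never holds a direct reference to itself. It only holds the name, which is why the definition does not need to exist until the first call actually runs.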
Mutable vs Immutable
The compiler tracks isConst on each Local/FunctionState::locals[i] and
emits a compile-time diagnostic for writes to let-bound variables. The VM is
uniform: it has no notion of const at runtime. This is the standard
tradeoff: catch errors as early as possible, at compile time rather than at
runtime.
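A compile-time check of this kind might look like the following sketch. The Local layout, the checkAssignment helper, and the diagnostic text are assumptions for illustration; the source only states that isConst is tracked per local.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Local {
    std::string name;
    bool isConst;  // true for let-bound, false for var-bound
};

// Returns a diagnostic message if the assignment must be rejected,
// or an empty string if the compiler may go on to emit SetLocal.
std::string checkAssignment(const std::vector<Local>& locals,
                            const std::string& name) {
    // Innermost declaration wins, matching normal shadowing rules.
    for (int i = static_cast<int>(locals.size()) - 1; i >= 0; --i) {
        if (locals[i].name != name) continue;
        if (locals[i].isConst)
            return "cannot assign to let-bound variable '" + name + "'";
        return "";
    }
    return "";  // not a local; this sketch leaves globals to other checks
}
```

Because the check runs before any bytecode is emitted, SetLocal itself stays a plain store with no flag to test.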
Pitfalls
- Forgetting to pop locals in endScope. The stack grows monotonically through the program; nested blocks would corrupt parent locals' indices.
- SetGlobal accepting unknown names. Silent globals are a tooling nightmare. Always require DefGlobal first.
- Using [] on globals_ in GetGlobal. operator[] creates default-constructed entries on miss. Use find and report the error.
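The last pitfall is easy to reproduce with std::unordered_map directly. In this sketch, Table, lookupWithBrackets, and lookupWithFind are illustrative names and values are plain doubles.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

using Table = std::unordered_map<std::string, double>;

// Wrong for GetGlobal: operator[] inserts a value-initialized entry
// when the key is absent, so a failed read mutates the table.
double lookupWithBrackets(Table& globals, const std::string& name) {
    return globals[name];
}

// Right for GetGlobal: find never mutates, and the caller decides
// how to report a miss.
bool lookupWithFind(const Table& globals, const std::string& name, double& out) {
    auto it = globals.find(name);
    if (it == globals.end()) return false;
    out = it->second;
    return true;
}
```

After one bracket lookup of a misspelled name, the table has grown by one entry and later find calls for that name succeed, which hides the original typo.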