Step 3 — Compiling Functions as Nested Compilers
Goal
Extend the cp-06 single-chunk compiler so that fn foo(...) { ... } emits a
separate Function with its own chunk and its own local-variable bookkeeping
— while staying able to resume compiling the outer code afterwards.
The Mental Model
A function body is just another little program. When the parser hands the
compiler a FnDeclStmt, the compiler temporarily switches its target from the
current chunk to a fresh chunk owned by a new Function. When the body
finishes, the compiler:
- Emits Nil; Return (so a body without an explicit return does the right thing; see step 5 for control-flow specifics).
- Pops back to the outer compiler state.
- Records the new Function as a constant in the outer chunk's constant pool.
- Emits Closure <const-ix> at the outer cursor, which loads the function value onto the operand stack.
- Stores that value as a global or as a new local in the outer scope.
Crucially, the outer compiler doesn't need to know anything about the inner body; it just sees a single opaque value in its constant pool.
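The five steps above can be traced end to end in a toy model. The sketch below is not the tutorial's Compiler: ToyCompiler, compileEmptyFn, compileDemo, and the two separate constant vectors are hypothetical simplifications (the real chunk has a single constant pool of Values). It only shows the target switch and the bytes that land in each chunk for an empty global function.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, stripped-down stand-ins for the tutorial's types.
enum class Op : uint8_t { Nil, Return, Closure, DefGlobal };

struct Function;
using FunctionPtr = std::shared_ptr<Function>;

struct Chunk {
    std::vector<uint8_t> code;
    std::vector<FunctionPtr> fnConstants;    // toy pool: function constants only
    std::vector<std::string> nameConstants;  // toy pool: global names only
};

struct Function {
    std::string name;
    Chunk chunk;
};

struct ToyCompiler {
    std::vector<FunctionPtr> targets;  // top of the stack is the current target

    Chunk& chunk() { return targets.back()->chunk; }
    void emit(Op op) { chunk().code.push_back(static_cast<uint8_t>(op)); }
    void emit(uint8_t byte) { chunk().code.push_back(byte); }

    // Compiles the equivalent of `fn <name>() {}` declared at global scope.
    void compileEmptyFn(const std::string& name) {
        auto fn = std::make_shared<Function>();
        fn->name = name;
        targets.push_back(fn);  // switch the target to the fresh chunk

        emit(Op::Nil);          // step 1: implicit `return nil`
        emit(Op::Return);

        targets.pop_back();     // step 2: back to the outer chunk

        chunk().fnConstants.push_back(fn);  // step 3: record as a constant
        auto fnIx = static_cast<uint8_t>(chunk().fnConstants.size() - 1);

        emit(Op::Closure);      // step 4: load the function value
        emit(fnIx);

        chunk().nameConstants.push_back(name);  // step 5: bind it as a global
        auto nameIx = static_cast<uint8_t>(chunk().nameConstants.size() - 1);
        emit(Op::DefGlobal);
        emit(nameIx);
    }
};

// Builds a script that contains one empty function and returns the script.
FunctionPtr compileDemo() {
    ToyCompiler c;
    auto script = std::make_shared<Function>();
    script->name = "<script>";
    c.targets.push_back(script);
    c.compileEmptyFn("answer");
    assert(c.targets.size() == 1);  // the compiler is back at the outer target
    return script;
}
```

After compileDemo(), the script's chunk holds four bytes (Closure, its constant index, DefGlobal, the name index), while the two bytes Nil and Return live only in the inner function's chunk.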
State
class Compiler {
    struct Local { std::string name; int depth; bool isConst; };
    struct FunctionState {
        FunctionPtr fn;
        std::vector<Local> locals;
        int scopeDepth = 0;
        bool isScript;
    };
    std::vector<FunctionState> states_;
    Chunk& chunk() { return states_.back().fn->chunk; }
    std::vector<Local>& locals() { return states_.back().locals; }
    int& scopeDepth() { return states_.back().scopeDepth; }
};
The whole "current compilation context" is the top of states_. Push to
enter a function, pop to leave.
void pushFunction(std::string name, int arity, bool isScript) {
    auto fn = std::make_shared<Function>();
    fn->name = std::move(name);
    fn->arity = arity;
    FunctionState fs;
    fs.fn = fn;
    fs.isScript = isScript;
    // Reserve slot 0 for the function value itself (matches the VM's call ABI).
    fs.locals.push_back({"", 0, true});
    states_.push_back(std::move(fs));
}
FunctionPtr popFunction() {
    auto fn = states_.back().fn;
    states_.pop_back();
    return fn;
}
That reserved slot 0 is the link to step 2 — the runtime puts the callable there, and the compiler must not accidentally allocate it to a user variable.
Compiling a FnDeclStmt
void visit(FnDeclStmt& s) override {
    pushFunction(s.name, s.params.size(), /*isScript=*/false);
    for (auto& p : s.params) addLocal(p, /*isConst=*/false, s.line);
    beginScope();
    for (auto& stmt : s.body) stmt->accept(*this);
    endScope();
    emit(Op::Nil);
    emit(Op::Return);
    auto fn = popFunction();
    // Outer scope: load the function value as a constant, then bind it.
    uint8_t ix = makeConstant(Value::makeFn(fn));
    emit(Op::Closure); emit(ix);
    if (scopeDepth() == 0) {
        uint8_t nameIx = makeConstant(Value::makeStr(s.name));
        emit(Op::DefGlobal); emit(nameIx);
    } else {
        addLocal(s.name, /*isConst=*/true, s.line);
    }
}
A few things worth noting:
- We pass isConst=true for the binding itself but isConst=false for the parameters: assigning to a parameter inside its function body is legal.
- The body opens its own block scope so endScope() cleans up any lets declared inside; the parameters are above this scope and persist for the entire function (correctly).
- Op::Closure is currently a synonym for Op::Constant. We give it a distinct opcode so cp-12 can graft upvalue handling on without touching every call site.
Why addLocal(p, ...) Just Works
The cp-06 local table is indexed by insertion order, which matches the
runtime slot numbering. Because we reserved slot 0 in pushFunction,
the first parameter ends up at slot 1, the second at slot 2, … exactly what
the call ABI delivers.
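The slot arithmetic can be checked in isolation. The sketch below assumes a simplified table: LocalTable, its addLocal signature (no line argument, depth fixed at 0), resolveLocal, and parameterSlots are hypothetical names for illustration, not the cp-06 API.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Local { std::string name; int depth; bool isConst; };

// Hypothetical stand-in for one function's local table.
struct LocalTable {
    std::vector<Local> locals;

    LocalTable() {
        // Slot 0 is reserved for the callee; the empty name never matches
        // a real identifier.
        locals.push_back({"", 0, true});
    }

    // Appends a local and returns the slot it occupies at run time.
    int addLocal(const std::string& name, bool isConst) {
        assert(!name.empty());  // the empty name belongs to slot 0 only
        locals.push_back({name, 0, isConst});
        return static_cast<int>(locals.size()) - 1;
    }

    // Returns the slot for `name`, or -1 if it is not a local.
    // Searches newest-first and never looks at the reserved slot 0.
    int resolveLocal(const std::string& name) const {
        for (int i = static_cast<int>(locals.size()) - 1; i >= 1; --i)
            if (locals[i].name == name) return i;
        return -1;
    }
};

// Slots assigned to the parameters of `fn add(a, b) { ... }`.
std::vector<int> parameterSlots() {
    LocalTable t;
    int a = t.addLocal("a", /*isConst=*/false);
    int b = t.addLocal("b", /*isConst=*/false);
    return {a, b};
}
```

Because the reserved entry is pushed before any parameter, insertion order alone produces slots 1 and 2 for a and b; no offset needs to be applied at lookup time.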
Forbidding Closure Capture (for now)
Without an upvalue system, inner can't see outer's local a:
fn outer(a) {
    fn inner() { return a; }   // ← capture
    return inner();
}
The compiler must detect this at compile time and refuse, rather than emit broken bytecode. Helper:
bool isOuterLocal(const std::string& name) {
    for (int i = (int)states_.size() - 2; i >= 0; --i) {
        const auto& ls = states_[i].locals;
        for (int j = (int)ls.size() - 1; j >= 1; --j)
            if (ls[j].name == name) return true;
    }
    return false;
}
IdentExpr and AssignExpr consult isOuterLocal after their normal local
lookup misses but before they fall back to globals. If true, they emit a
diagnostic pointing the user to cp-12.
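The lookup order described here (current locals, then enclosing locals, then globals) can be sketched as one standalone function. Resolution, LocalList, resolve, and insideInner are hypothetical names; the real visitors emit bytecode or a diagnostic rather than returning an enum.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Resolution { Local, CapturedOuterLocal, Global };

struct Local { std::string name; int depth; bool isConst; };
using LocalList = std::vector<Local>;

// states.back() is the function being compiled; earlier entries enclose it.
// Index 0 of every list is the reserved callee slot and is skipped.
Resolution resolve(const std::vector<LocalList>& states, const std::string& name) {
    assert(!states.empty());

    // 1. Normal local lookup in the current function.
    const LocalList& current = states.back();
    for (int j = (int)current.size() - 1; j >= 1; --j)
        if (current[j].name == name) return Resolution::Local;

    // 2. A hit in any enclosing function would be a capture: report it.
    for (int i = (int)states.size() - 2; i >= 0; --i) {
        const LocalList& ls = states[i];
        for (int j = (int)ls.size() - 1; j >= 1; --j)
            if (ls[j].name == name) return Resolution::CapturedOuterLocal;
    }

    // 3. Otherwise fall back to a global lookup at run time.
    return Resolution::Global;
}

// Models the compiler state while inside `inner` from the example:
// <script> encloses outer(a), which encloses inner().
std::vector<LocalList> insideInner() {
    LocalList script = {{"", 0, true}};
    LocalList outer  = {{"", 0, true}, {"a", 0, false}};
    LocalList inner  = {{"", 0, true}};
    return {script, outer, inner};
}
```

With the state from insideInner(), resolving a yields CapturedOuterLocal, which is the case that should become a compile-time diagnostic, while an unknown name falls through to Global.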
<script> Is a Function Too
Result compile(Program& p) {
    pushFunction("<script>", 0, /*isScript=*/true);
    for (auto& s : p.statements) s->accept(*this);
    emit(Op::Nil); emit(Op::Return);
    auto script = popFunction();
    return Result{script, diagnostics_};
}
Everything composes. No special case for top-level — the VM just calls
<script> like any other function.
Compiling CallExpr
void visit(CallExpr& e) override {
    e.callee->accept(*this);                       // pushes <fn>
    for (auto& a : e.args) a->accept(*this);       // pushes args
    if (e.args.size() > 255)
        error(e.line, "too many arguments to a single call (>255)");
    emit(Op::Call);
    emit(uint8_t(e.args.size()));
}
When Op::Call N executes, the stack holds the callee followed by its N
arguments, which is exactly the layout callValue expects; this is how the
static side and the runtime side cooperate.
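To make the layout concrete, here is a small sketch of where the callee sits relative to the top of the stack. It is an illustration only: Stack, calleeIndex, and stackBeforeCall are hypothetical, and the real callValue from step 2 is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical operand stack; strings stand in for runtime values.
using Stack = std::vector<std::string>;

// Index of the callee when `Op::Call argc` executes: directly below the
// argc arguments that were pushed after it.
std::size_t calleeIndex(const Stack& stack, std::size_t argc) {
    assert(stack.size() >= argc + 1);
    return stack.size() - argc - 1;
}

// Stack contents produced by the code emitted for `add(1, 2)`,
// just before Op::Call 2 runs.
Stack stackBeforeCall() {
    Stack s;
    s.push_back("<fn add>");  // e.callee->accept(*this)
    s.push_back("1");         // first argument
    s.push_back("2");         // second argument
    return s;
}
```

If the VM uses the callee's index as the base of the new frame, the callee lands in slot 0 and the arguments in slots 1 through N, which lines up with the slot reserved in pushFunction.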
Compiling ReturnStmt
void visit(ReturnStmt& s) override {
    if (states_.back().isScript)
        error(s.line, "'return' outside a function");
    if (s.value) s.value->accept(*this);
    else         emit(Op::Nil);
    emit(Op::Return);
}
Pitfalls
- Forgetting the reserved slot 0. Parameters get the wrong slot numbers.
- Calling pushFunction after starting to emit a prelude. The fresh FunctionState's chunk is empty by design; emit nothing into it before the body.
- Holding a Chunk& or locals reference across pushFunction/popFunction. states_.push_back can reallocate the vector, which invalidates references into a FunctionState, and a cached Chunk& keeps pointing at the previous function's chunk. Always go through the chunk()/locals() accessors.