02 — The AST

The AST is the contract between the parser and every downstream phase. Get it right and the interpreter, type checker, resolver, and codegen all become straightforward tree walks. Get it wrong and every phase carries workarounds.

Two-tier design: Stmt + Expr

MiniLang separates the grammar cleanly into statements (things with side effects but no value) and expressions (things that produce a value). Some languages blur this line (expressions as statements, last- expression-is-the-return-value, etc.); MiniLang keeps them distinct so the AST shape guides the interpreter.

Statement nodes

struct LetStmt  { std::string name; bool immutable; ExprPtr init; int line; };
struct AssignStmt{ std::string name; ExprPtr value; int line; };
struct IfStmt   { ExprPtr cond; StmtPtr then; StmtPtr else_; int line; };
struct WhileStmt{ ExprPtr cond; StmtPtr body; int line; };
struct ReturnStmt{ ExprPtr value; int line; };
struct PrintStmt { ExprPtr value; int line; };
struct ExprStmt  { ExprPtr expr; int line; };
struct BlockStmt { std::vector<StmtPtr> body; int line; };

Expression nodes

struct NumberExpr { double value; int line; };
struct StringExpr { std::string value; int line; };
struct BoolExpr   { bool value; int line; };
struct NilExpr    { int line; };
struct VarExpr    { std::string name; int line; };
struct UnaryExpr  { TokKind op; ExprPtr operand; int line; };
struct BinaryExpr { TokKind op; ExprPtr left; ExprPtr right; int line; };
struct CallExpr   { ExprPtr callee; std::vector<ExprPtr> args; int line; };
struct FnExpr     { std::vector<std::string> params; StmtPtr body; int line; };

FnExpr is an anonymous function literal (fn(x, y) { ... }). Named functions (fn foo(x) { ... }) desugar into let foo = fn(x) { ... } at parse time.

Ownership with unique_ptr

Every node owns its children:

using ExprPtr = std::unique_ptr<Expr>;
using StmtPtr = std::unique_ptr<Stmt>;

No parent pointers, no shared ownership. The AST is a tree (DAG-free), so a single-owner, depth-first-owned hierarchy matches its shape exactly. Destruction is recursive and automatic when the root goes out of scope.

The Visitor pattern

Every downstream phase (interpreter, resolver, type checker) needs to walk the AST without modifying the node types. The Visitor pattern provides that extension point:

// Returns T per expression node.
template<typename T>
struct ExprVisitor {
    virtual T visitNumber(NumberExpr&) = 0;
    virtual T visitVar(VarExpr&) = 0;
    virtual T visitBinary(BinaryExpr&) = 0;
    // ... one method per expression node kind
};

Each expression node implements:

template<typename T>
T Expr::accept(ExprVisitor<T>& v);

The interpreter inherits ExprVisitor<Value>, the type-checker inherits ExprVisitor<TypePtr>, the printer inherits ExprVisitor<std::string>. Adding a new pass doesn't change any AST file.

Line numbers on every node

Every node stores an int line. This is the source line where the node began. Passes that emit errors use it:

throw RuntimeError("[line " + std::to_string(node.line) + "] ...");

A production compiler would store a full source span (start/end offset); a line number is sufficient for teaching and for cp-03–cp-05 diagnostics.

Design check: where do FnStmt and FnExpr differ?

fn foo(x) { ... } is syntactic sugar for let foo = fn(x) { ... }. At parse time the parser sees the fn keyword followed by an identifier, converts it to a LetStmt wrapping a FnExpr, and the rest of the compiler never needs a FnDecl node. This simplification:

  • Keeps the namespace of a function as a variable binding (consistent with let semantics).
  • Makes closures and first-class functions automatic — a fn literal is just a value.
  • Means the resolver treats function declarations identically to variable declarations (both create a binding; both allow shadowing at new scope).