Step 6 — Extension Exercises
Goal: consolidate the concepts by extending the evaluator in small, focused ways. Each extension is self-contained — pick one or more. Solutions live nowhere in this repo on purpose; struggle is the point.
Exercise 1 — Add % (Modulo)
Difficulty: ★☆☆☆☆ (1 minute)
Add % with the same precedence as * and /.
Hints:
- Add
PercenttoTokenKind. - Add the lexer case (
'%'). - Add the parser check inside
parseTerm's while-loop condition. - Add the evaluator case: use
std::fmod(not%— operands aredouble).
Verify:
10 % 3 = 1
10 % 3 % 2 = 1 (left-assoc)
Exercise 2 — Add ^ (Exponentiation, Right-Associative)
Difficulty: ★★☆☆☆
Add ^ with higher precedence than * and right-associativity.
Hints:
- Add a new precedence level between
termandfactor: call itpower. - Grammar:
term = power { ('*' | '/') power } power = factor [ '^' power ] ; recursion on the right ⇒ right-assoc - Note the
[ ... ]— an exponent may be absent. Thepowerbody recurses intopower, not loops.
Verify:
2 ^ 3 = 8
2 ^ 3 ^ 2 = 512 (i.e. 2 ^ (3 ^ 2) = 2 ^ 9 — NOT (2^3)^2 = 64)
2 * 3 ^ 2 = 18 (^ binds tighter than *)
Exercise 3 — AST Pretty-Printer
Difficulty: ★★☆☆☆
Add a Printer : ExprVisitor<std::string> that returns a string representation. Two flavors:
(a) S-expression:
2 * (3 + 4) → (* 2 (+ 3 4))
(b) Indented tree:
BinaryExpr(*)
├── NumberExpr(2)
└── BinaryExpr(+)
├── NumberExpr(3)
└── NumberExpr(4)
This exercise proves the Visitor pattern: no AST classes change, just a new visitor. Production debug tools (-ast-dump in Clang) do exactly this.
Hint: accept is currently ExprVisitor<double>-only. Either:
- Templatize
accept(template<typename R> R accept(ExprVisitor<R>&)), or - Add a separate
accept_string(ExprVisitor<std::string>&), or - Use a non-visitor
print(Expr&)function with aswitchon a tag (simpler).
Exercise 4 — Source Locations in Errors
Difficulty: ★★★☆☆
Make errors report the column they occurred at:
> 1 + ?
^
parse error: expected number, '(' or '-' (got ERROR)
Hints:
- Add
std::size_t postoToken. - Lexer records
pos_at the start of each token. ParseErrorcarries the position. The catch site prints the source line + a^at that column.
This is a tiny taste of cp-15 (Diagnostics), where we'd add file:line:col, fix-it hints, and color.
Exercise 5 — Constant Folding Pass
Difficulty: ★★★☆☆
Write an Optimizer : ExprVisitor<ExprPtr> that returns a new AST with constants folded:
BinaryExpr(+, NumberExpr(2), NumberExpr(3))
→ NumberExpr(5)
If both children are NumberExpr, compute the result; otherwise return a new BinaryExpr with optimized children.
Verify by:
- Printing the AST before and after.
- Confirming the evaluator still returns the same number.
This is your first compiler optimization pass. The same pattern appears in LLVM's -instcombine, JVM's C1 constant folding, and Clang's EvaluatedExprVisitor.
Exercise 6 — Disallow Trailing Garbage
Difficulty: ★☆☆☆☆
The parser already does this (parse() checks for Eof). Confirm with:
./eval "1 + 2 3"
# parse error: unexpected token '3' after expression
Now: make the error message highlight where the garbage starts — combine with Exercise 4.
Exercise 7 — REPL History and Last Result
Difficulty: ★★☆☆☆
Make _ in the REPL refer to the previous result:
> 1 + 2
3
> _ * 4
12
Hint: the REPL keeps a lastResult variable. The lexer recognizes _ as a special token. The parser allows it as a factor.
Exercise 8 — Bench Tree-Walking vs std::function
Difficulty: ★★★★☆ (more about C++ than compilers)
Build the tree once, evaluate it 1M times. Measure with <chrono>:
- via the Visitor (
accept+ virtual); - via a tagged switch (replace virtuals with
kindswitch); - via
std::variant+std::visit.
Predict which is fastest. Verify. Discuss why.
(Spoiler: tagged switch usually wins by ~2× on hot trees because the branch predictor can latch on. Virtual dispatch loses to indirect-branch mispredicts. This is the exact reason bytecode VMs win over tree-walkers.)
Done?
When you've internalized the concepts (even without doing every extension):
- Mark this lab complete.
- Move to
../cp-03-minilang-frontend/— where the language grows to include variables, control flow, functions, and we switch to Pratt parsing.