03 — Pratt Parsing: Expression Precedence Without Grammar Rules
Recursive-descent parsers grow one function per precedence tier:
parseExpr → parseAddSub → parseMulDiv → parseUnary → parsePrimary.
With 5 levels that's fine. With 15 it becomes a maintenance nightmare
where adding ** (exponentiation) requires touching every existing tier
function to get the associativity and slot correct.
Pratt parsing collapses all expression precedence levels into one function controlled by a numeric binding-power table.
Binding powers
Each operator has a left-binding power (how tightly it binds to the left)
and a right-binding power (how tightly it binds to the right). For left-
associative operators rbp = lbp. For right-associative rbp = lbp - 1.
| Operator | Left BP | Right BP | Assoc |
|---|---|---|---|
= | 1 | 1 | right |
or | 3 | 4 | left |
and | 5 | 6 | left |
== != | 7 | 8 | left |
< <= > >= | 9 | 10 | left |
+ - | 11 | 12 | left |
* / % | 13 | 14 | left |
unary - ! | — | 15 | — |
call ( | 17 | 18 | left |
Higher numbers bind tighter. The loop condition while (lbp(peek) > minBp)
continues consuming operators as long as the next one binds tighter than
the caller's minimum.
The Pratt loop
ExprPtr parseExpr(int minBp = 0) {
auto left = parsePrimary(); // nud: no left context yet
while (lbp(peek()) > minBp) {
Token op = advance();
auto right = parseExpr(rbp(op)); // led: right-denotation recursion
left = makeBinary(op, move(left), move(right));
}
return left;
}
parsePrimary handles prefix forms (numbers, identifiers, (expr),
unary -, !, fn). The loop handles infix forms by taking the current
left side and calling parseExpr(rbp) for the right.
Parsing function calls in Pratt
Function call f(x, y) is an infix operator with a ( left-denotation:
// Inside the while loop, when op.kind == LPAREN:
std::vector<ExprPtr> args;
while (peek().kind != RPAREN) {
args.push_back(parseExpr(0));
if (!match(COMMA)) break;
}
expect(RPAREN);
left = makeCall(move(left), move(args));
No special parseCallExpr function — it's handled by the left-binding
power of ( being high (17/18), making call tighter than any arithmetic.
Why associativity matters
Given a = b = c:
- Right-associativity (rbp = lbp): parses as
a = (b = c)— assignment chains, each one stores the innermost result first. - Left-associativity (rbp = lbp + 1): would parse as
(a = b) = c— trying to assign to a temporary, which is wrong.
Given a - b - c:
- Left-assoc:
(a - b) - c— correct for subtraction. - Right-assoc:
a - (b - c)— wrong.
The binding power table encodes this precisely: no per-operator branches in the Pratt loop.
Testing the expression parser in isolation
Before connecting it to the statement parser, write a small test driver:
std::string src = "1 + 2 * 3 - -4";
Lexer lex(src);
Parser p(lex.scanAll());
auto e = p.parseExpr(0);
// Pretty-print: "(- (+ 1 (* 2 3)) (- 4))" — left-associative and
// unary applied before binary
If the parenthesisation matches your expectations, the binding-power table is correct.