03 — Pratt Parsing: Expression Precedence Without Grammar Rules

Recursive-descent parsers grow one function per precedence tier: parseExpr → parseAddSub → parseMulDiv → parseUnary → parsePrimary. With 5 levels that's fine. With 15 it becomes a maintenance nightmare where adding ** (exponentiation) requires touching every existing tier function to get the associativity and slot correct.

Pratt parsing collapses all expression precedence levels into one function controlled by a numeric binding-power table.

Binding powers

Each operator has a left-binding power (how tightly it binds to the left) and a right-binding power (how tightly it binds to the right). For left- associative operators rbp = lbp. For right-associative rbp = lbp - 1.

OperatorLeft BPRight BPAssoc
=11right
or34left
and56left
== !=78left
< <= > >=910left
+ -1112left
* / %1314left
unary - !15
call (1718left

Higher numbers bind tighter. The loop condition while (lbp(peek) > minBp) continues consuming operators as long as the next one binds tighter than the caller's minimum.

The Pratt loop

ExprPtr parseExpr(int minBp = 0) {
    auto left = parsePrimary();          // nud: no left context yet
    while (lbp(peek()) > minBp) {
        Token op = advance();
        auto right = parseExpr(rbp(op));  // led: right-denotation recursion
        left = makeBinary(op, move(left), move(right));
    }
    return left;
}

parsePrimary handles prefix forms (numbers, identifiers, (expr), unary -, !, fn). The loop handles infix forms by taking the current left side and calling parseExpr(rbp) for the right.

Parsing function calls in Pratt

Function call f(x, y) is an infix operator with a ( left-denotation:

// Inside the while loop, when op.kind == LPAREN:
std::vector<ExprPtr> args;
while (peek().kind != RPAREN) {
    args.push_back(parseExpr(0));
    if (!match(COMMA)) break;
}
expect(RPAREN);
left = makeCall(move(left), move(args));

No special parseCallExpr function — it's handled by the left-binding power of ( being high (17/18), making call tighter than any arithmetic.

Why associativity matters

Given a = b = c:

  • Right-associativity (rbp = lbp): parses as a = (b = c) — assignment chains, each one stores the innermost result first.
  • Left-associativity (rbp = lbp + 1): would parse as (a = b) = c — trying to assign to a temporary, which is wrong.

Given a - b - c:

  • Left-assoc: (a - b) - c — correct for subtraction.
  • Right-assoc: a - (b - c) — wrong.

The binding power table encodes this precisely: no per-operator branches in the Pratt loop.

Testing the expression parser in isolation

Before connecting it to the statement parser, write a small test driver:

std::string src = "1 + 2 * 3 - -4";
Lexer lex(src);
Parser p(lex.scanAll());
auto e = p.parseExpr(0);
// Pretty-print: "(- (+ 1 (* 2 3)) (- 4))"  — left-associative and
// unary applied before binary

If the parenthesisation matches your expectations, the binding-power table is correct.