How to Build a LR Parser for Math Expressions

Published: Wednesday, March 15, 2023

Greetings, friends! This is my eleventh post in the math expression parser series. In this tutorial, we'll learn how to use the Syntax tool to build a math expression parser! By the end of this tutorial, our parser will support decimal numbers, parentheses, mathematical operations, unary operations, and basic trigonometric functions. We'll also learn how to build abstract syntax trees (AST) for math expressions using a parser generator. Let's begin!

Starting our Math Expression Grammar

In the previous tutorial, we learned how to use the Syntax tool to generate parsers from basic grammars. Now that we have a better understanding of the Syntax tool, it's time to make a more advanced grammar that can be used to generate a math expression parser.

Let's create a new file called calc.g with the following contents:

text
Copied! ⭐️
%lex

%%

\s+             /* skip whitespace */
\d+             return 'NUMBER'

/lex

%%

e
  : e '+' e    { $$ = $1 + $3 }
  | NUMBER     { $$ = Number($1) }
  ;

Run the the following command to parse a test input of 1 + 2 to make sure our grammar is built correctly.

shell
Copied! ⭐️
syntax-cli -g calc.g -m lalr1 -p '1 + 2'
# Parsed value: 3

You should see a result of 3 which means our grammar is fine. Let's take a look on how e is derived.

text
Copied! ⭐️
1. e → e '+' e
2.   → 1 '+' e
3.   → 1 '+' 2

4. Accept result
5. Perform semantic action: $$ = $1 + $3 = 1 + 2 = 3
6. Return 3

Handling Multiplication

We got addition working in our grammar, so let's add multiplication.

text
Copied! ⭐️
%lex

%%

\s+             /* skip whitespace */
\d+             return 'NUMBER'

/lex

%%

e
  : e '+' e    { $$ = $1 + $3 }
  | e '*' e    { $$ = $1 * $3 }
  | NUMBER     { $$ = Number($1) }
  ;

Then, we'll try parsing 1 * 2 using the Syntax tool.

shell
Copied! ⭐️
syntax-cli -g calc.g -m lalr1 -p '1 * 2'
# Parsed value: 2

Looks like it still works! But, let's try using both addition and multiplication and see what happens.

shell
Copied! ⭐️
syntax-cli -g calc.g -m lalr1 -p '1 + 2 * 3'

We get our first shift-reduce error!

text
Copied! ⭐️
SyntaxError: Found "shift-reduce" conflict at state 5, terminal '*'.

Resolving Shift-Reduce Conflicts

It's a good idea to understand shift-reduce errors because we'll see them often when designing parsers 😉

Shift-reduce conflicts occur when a parser encounters an ambiguous grammar. It doesn't understand whether it should shift a symbol or reduce it down to a symbol. What does this mean? Let's take a look at the semantic part of our grammar again.

text
Copied! ⭐️
e
  : e '+' e    { $$ = $1 + $3 }
  | e '*' e    { $$ = $1 * $3 }
  | NUMBER     { $$ = Number($1) }
  ;

The expression, e, can derive into e + e, e * e, or a number. The problem occurs when we have two expressions inside one big expression. That is, the problem occurs when we try to perform one or more mathematical operations. All of these commands will cause a shift-reduce conflict.

shell
Copied! ⭐️
syntax-cli -g calc.g -m lalr1 -p '1 + 2 * 3'

syntax-cli -g calc.g -m lalr1 -p '1 + 2 + 3'

syntax-cli -g calc.g -m lalr1 -p '1 * 2 * 3'

syntax-cli -g calc.g -m lalr1 -p '1 * 2 + 3'

The reason why all of the above commands fail is due to the fact that the parser generator doesn't know what it should do first.

Let's look at 1 + 2 * 3 as an example. The generated parser will go through each token in our input and use a lookahead token to determine what it should do next. However, it cannot decide whether it should reduce the parsed 1 + 2 value to e, or if it should shift further, since 2 itself can be e (by the NUMBER rule), and the parser expects + after e.

Basically, the parser gets confused. It can't figure out whether to do 1 + 2 first or 2 * 3. We need to tell the parser about math associativity and precedence just as we did in Part 3 of this tutorial series.

Associativity involves the order of how we group math expressions within a larger math expression. 1 + 2 and 1 + 3 are small expressions inside the bigger expression, 1 + 2 + 3.

How should the parser evaluate 1 + 2 + 3? Should it do (1 + 2) + 3 or 1 + (2 + 3). For addition between three numbers, it's obvious to us that the order doesn't matter, but the parser doesn't know that.

Similarly, the parser doesn't understand 1 + 2 * 3. Should it do (1 + 2) * 3 or 1 + (2 * 3)? Operator associativity and precedence seems like common sense to us, but a computer doesn't automatically know that. Therefore, we need to insert "hints" into our grammar, so we can remove ambiguity. Update the calc.g file with the following contents:

text
Copied! ⭐️
%lex

%%

\s+             /* skip whitespace */
\d+             return 'NUMBER'

/lex

%left '+'
%left '*'

%%

e
  : e '+' e    { $$ = $1 + $3 }
  | e '*' e    { $$ = $1 * $3 }
  | NUMBER     { $$ = Number($1) }
  ;

Notice a new section in between the /lex marker and the %% marker. This section is called our "operators" section.

text
Copied! ⭐️
%left '+'
%left '*'

We are assigning a left associativity to both addition and multiplication. This tells the parser that whenever it sees an expression like 1 + 2 + 3, it should do (1 + 2) + 3. Similarly, whenever the parser sees 1 * 2 * 3, it should do (1 * 2) * 3.

IMPORTANT: the order of the associativity matters. Notice how %left '*' is written below %left '+'. That makes the multiplication operation have higher precedence than addition. The lower down the associativity list, the higher the precedence.

It's very common to deal with shift-reduce conflicts, related to math operator associativity and precedence, in parser design. Therefore, many parser generators, including the Syntax tool, provide a way to specify left and right associativity somewhere in the grammar.

Let's now run the Syntax tool using each of these commands again:

shell
Copied! ⭐️
syntax-cli -g calc.g -m lalr1 -p '1 + 2 * 3'
# Parsed value: 7

syntax-cli -g calc.g -m lalr1 -p '1 + 2 + 3'
# Parsed value: 6

syntax-cli -g calc.g -m lalr1 -p '1 * 2 * 3'
# Parsed value: 6

syntax-cli -g calc.g -m lalr1 -p '1 * 2 + 3'
# Parsed value: 5

They should all pass now! 🎉

Handling Parentheses

Let's now add in the ability parse parentheses.

text
Copied! ⭐️
%lex

%%

\s+             /* skip whitespace */
\d+             return 'NUMBER'

/lex

%left '+'
%left '*'

%%

e
  : e '+' e    { $$ = $1 + $3 }
  | e '*' e    { $$ = $1 * $3 }
  | '(' e ')'  { $$ = $2 }
  | NUMBER     { $$ = Number($1) }
  ;

When a left parenthesis ( is encountered, the parser will automatically recognize that it should derive e into ( e ). Inside the semantic action, the $2 refers to the expression e. Therefore, the parser will run everything inside the parentheses recursively and follow the order of operations we put in our "operators" section.

Remember, the order of operations should be as follows:

  1. Parentheses
  2. Exponents (evaluated from right to left)
  3. Multiplication/Division (evaluated from left to right)
  4. Addition/Subtraction (evaluated from left to right)

Let's parse the input, (1 + 2) * 3 to make sure our calculator parser works correctly.

shell
Copied! ⭐️
syntax-cli -g calc.g -m lalr1 -p '(1 + 2) * 3'
# Parsed value: 9

We should get a value of 9 now! See how easy that was? Much simpler than all the work we did when making an implementation of the shunting yard algorithm 😁

Handling Subtraction and Division

Adding subtraction and division to our grammar is super easy. We just need to add - and / in the "operators" section and then make production rules for subtraction and division along with a semantic action.

text
Copied! ⭐️
%lex

%%

\s+             /* skip whitespace */
\d+             return 'NUMBER'

/lex

%left '+' '-'
%left '*' '/'

%%

e
  : e '+' e    { $$ = $1 + $3 }
  | e '-' e    { $$ = $1 - $3 }
  | e '*' e    { $$ = $1 * $3 }
  | e '/' e    { $$ = $1 / $3 }
  | '(' e ')'  { $$ = $2 }
  | NUMBER     { $$ = Number($1) }
  ;

By placing '+' and '-' on the same line, we give them the same precedence. Similarly, '*' and '/' have the same precedence.

We can then run the following commands to test the subtraction and division operations:

shell
Copied! ⭐️
syntax-cli -g calc.g -m lalr1 -p '5 - 2 - 1'
# Parsed value: 2

syntax-cli -g calc.g -m lalr1 -p '10 / 5 / 2'
# Parsed value: 1

Handling Exponentiation

Let's now exponentiation to our math expression parser.

text
Copied! ⭐️
%lex

%%

\s+             /* skip whitespace */
\d+             return 'NUMBER'

/lex

%left '+' '-'
%left '*' '/'
%right '^'

%%

e
  : e '+' e    { $$ = $1 + $3 }
  | e '-' e    { $$ = $1 - $3 }
  | e '*' e    { $$ = $1 * $3 }
  | e '/' e    { $$ = $1 / $3 }
  | e '^' e    { $$ = $1 ** $3 }
  | '(' e ')'  { $$ = $2 }
  | NUMBER     { $$ = Number($1) }
  ;

Exponentiation has higher priority than multiplication, so we list %right '^' below the line containing %left '*'. The exponentiation operation is also right-associative. Therefore, 2 ^ 2 ^ 3 should be treated as 2 ^ (2 ^ 3).

We can confirm this with JavaScript.

js
Copied! ⭐️
const result = 2 ** 2 ** 3;

// 2 ^ 2 ^ 3 = 2 ^ (2 ^ 3) = 2 ^ 8 = 256
console.log(result); // 256

Notice how we used the ** operator in the semantic action for exponentiation.

text
Copied! ⭐️
e '^' e    { $$ = $1 ** $3 }

Remember! Since we're targeting JavaScript by default, the code inside the semantic action after the $$ will run as pure JavaScript code with the $1 and $3 values replaced by the parser, of course.

Let's run the Syntax tool to make sure exponentiation works correctly.

shell
Copied! ⭐️
syntax-cli -g calc.g -m lalr1 -p '2 ^ 2 ^ 3'
# Parsed value: 256

Handling Unary Operations

Next, it's time to handle negative numbers and unary operations. When making math expression parsers, handling unary operations is always awkward or weird. Luckily, most parser generators, including the Syntax tool, provide a helpful way to handle unary operations as well! Let's update the contents of our calc.g grammar file with the following contents:

text
Copied! ⭐️
%lex

%%

\s+             /* skip whitespace */
\d+             return 'NUMBER'

/lex

%left '+' '-'
%left '*' '/'
%right '^'

%%

e
  : e '+' e    { $$ = $1 + $3 }
  | e '-' e    { $$ = $1 - $3 }
  | e '*' e    { $$ = $1 * $3 }
  | e '/' e    { $$ = $1 / $3 }
  | e '^' e    { $$ = $1 ** $3 }
  | '(' e ')'  { $$ = $2 }
  | '-' e      { $$ = -$2 }
  | NUMBER     { $$ = Number($1) }
  ;

Now, try running the following command. Notice that I intentionally added a space before the -2 because our command line might complain about seeing a - directly after a quote '.

shell
Copied! ⭐️
syntax-cli -g calc.g -m lalr1 -p ' -2'
# Parsed value: -2

Looks like everything works! But wait! There's more!

Try running this command:

shell
Copied! ⭐️
syntax-cli -g calc.g -m lalr1 -p ' -2 ^ 2'
# Parsed value: -4

We get a value of -4 as we'd expect. According to a convention followed by mathematicians, -2 ^ 2 should be treated as -(2 ^ 2). Even WolframAlpha agrees with this treatment.

Using the %prec Modifier

Let's say for some reason we would like -2 ^ 2 to evaluate to 4 instead of -4. We'll probably make some mathematicians sad, but this is for knowledge! We can use the %prec modifier to adjust the precedence of the unary expression to handle such as scenario. Update the calc.g grammar file with the following contents (but get ready to change it back once this example is over 😅).

text
Copied! ⭐️
%lex

%%

\s+             /* skip whitespace */
\d+             return 'NUMBER'

/lex

%left '+' '-'
%left '*' '/'
%right '^'
%left UMINUS

%%

e
  : e '+' e    { $$ = $1 + $3 }
  | e '-' e    { $$ = $1 - $3 }
  | e '*' e    { $$ = $1 * $3 }
  | e '/' e    { $$ = $1 / $3 }
  | e '^' e    { $$ = $1 ** $3 }
  | '(' e ')'  { $$ = $2 }
  | '-' e      %prec UMINUS { $$ = -$2 }
  | NUMBER     { $$ = Number($1) }
  ;

When we run the Syntax tool again on -2 ^ 2, we'll end up with a result of 4 instead of -4, making some mathematicians flip their desk over. Don't forget to add a space before -2 if your terminal complains about the - appearing directly after '.

shell
Copied! ⭐️
syntax-cli -g calc.g -m lalr1 -p ' -2 ^ 2'
# Parsed value: 4

Alright, the grammar above may look confusing. I was definitely confused when I was learning how to construct parsers. Let me explain what's happening. We added a new statement in the "operators" section for dealing with unary operations: %left UMINUS.

The UMINUS is considered a virtual token. It's just an arbitrary word that we create. We could have called it UAWESOME instead. The important thing to note is that we're using the same UMINUS virtual token in the grammar rule for unary operations.

text
Copied! ⭐️
'-' e      %prec UMINUS { $$ = -$2 }

The %prec symbol is used to specify which precedence the parser should apply, using the UMINUS virtual token. It's like we're telling the parser, "Hey! Handle the unary operation first!"

By placing the %prec UMINUS symbol to the right of the production rule, '-' e, we tell the parser that it should be treated with a specific precedence. The parser will then look at our list of operators and see that UMINUS is at the bottom of the list. This means it has the highest precedence out of all our operators. Using this precedence, the parser evaluates -2 ^ 2 as (-2) ^ 2, which equals 4.

It's up to you to decide how you want to treat -2 ^ 2. I will follow the convention of most mathematicians and WolframAlpha. Therefore, I would like -2 ^ 2 to evaluate to -4.

I only mentioned the %prec symbol for demonstration purposes because there may be times where it's necessary to change the precedence of certain production rules in our grammar. Let's change our grammar back to normal.

text
Copied! ⭐️
%lex

%%

\s+             /* skip whitespace */
\d+             return 'NUMBER'

/lex

%left '+' '-'
%left '*' '/'
%right '^'

%%

e
  : e '+' e    { $$ = $1 + $3 }
  | e '-' e    { $$ = $1 - $3 }
  | e '*' e    { $$ = $1 * $3 }
  | e '/' e    { $$ = $1 / $3 }
  | e '^' e    { $$ = $1 ** $3 }
  | '(' e ')'  { $$ = $2 }
  | '-' e      { $$ = -$2 }
  | NUMBER     { $$ = Number($1) }
  ;

Handling Trigonometric Functions

Next, let's add the ability to parse custom functions. In Part 6 of this tutorial series, we learned how to add three trigonometric functions (sin, cos, and tan) to our math expression evaluator. We'll add the ability to use these three functions in our math expression parser. Let's update the contents of our calc.g grammar file with the following.

text
Copied! ⭐️
%lex

%%

\s+             /* skip whitespace */
\d+             return 'NUMBER'
\w+             return 'ID'

/lex

%{
  function trig(id, value) {
    switch (id) {
      case 'sin':
        return Math.sin(value);
      case 'cos':
        return Math.cos(value);
      case 'tan':
        return Math.tan(value);
      default:
        throw new Error(`Invalid operation: ${id}`);
    }
  }
%}

%left '+' '-'
%left '*' '/'
%right '^'

%%

e
  : e '+' e    { $$ = $1 + $3 }
  | e '-' e    { $$ = $1 - $3 }
  | e '*' e    { $$ = $1 * $3 }
  | e '/' e    { $$ = $1 / $3 }
  | e '^' e    { $$ = $1 ** $3 }
  | '(' e ')'  { $$ = $2 }
  | ID '(' e ')'  { $$ = trig($1, $3) }
  | '-' e      { $$ = -$2 }
  | NUMBER     { $$ = Number($1) }
  ;

We can test to check if our trigonometric functions work by using the following commands:

shell
Copied! ⭐️
syntax-cli -g calc.g -m lalr1 -p 'sin(0)'
# Parsed value: 0

syntax-cli -g calc.g -m lalr1 -p 'cos(0)'
# Parsed value: 1

syntax-cli -g calc.g -m lalr1 -p 'tan(0)'
# Parsed value: 0

Alright, we added something new to the grammar. Right below the /lex marker, we added a section of JavaScript code between the %{ marker and the %} marker. This section is called the "moduleInclude" section in the official Syntax tool documentation. Inside this section, we can insert regular JavaScript code. The Syntax tool will insert this code "as is" inside the generated parser module when we use the --output or -o option.

Inside the "bnf" section where our grammar rules live, we have a new production rule:

text
Copied! ⭐️
ID '(' e ')'  { $$ = trig($1, $3) }

The ID token is defined in the "lex" section at the top of the grammar file using the regular expression, \w+. We're using the ID token to capture the words, sin, cos, and tan. Then, we're calling the trig function defined in the "moduleInclude" section right below the "lex" section.

The value for $1 will be an identifier, which is hopefully either sin, cos, or tan. The value of $3 will be an expression passed into the parentheses. Therefore, we're allowed to pass in inputs such as sin(1 + 2) or tan(1 + 2 * 3).

Then, we pass $1 and $3 inside our trig function defined in the grammar.

js
Copied! ⭐️
function trig(id, value) {
  switch (id) {
    case 'sin':
      return Math.sin(value);
    case 'cos':
      return Math.cos(value);
    case 'tan':
      return Math.tan(value);
    default:
      throw new Error(`Invalid operation: ${id}`);
  }
}

Handling Decimal Numbers

The last thing we might want to do is add the ability to use decimal numbers instead of integers. This is an easy fix. We just need to replace the regular expression we use for the NUMBER token.

text
Copied! ⭐️
^(?:\d+(?:\.\d*)?|\.\d+)        return 'NUMBER'

Our completed grammar should now look like the following:

text
Copied! ⭐️
%lex

%%

\s+                             /* skip whitespace */
^(?:\d+(?:\.\d*)?|\.\d+)        return 'NUMBER'
\w+                             return 'ID'

/lex

%{
  function trig(id, value) {
    switch (id) {
      case 'sin':
        return Math.sin(value);
      case 'cos':
        return Math.cos(value);
      case 'tan':
        return Math.tan(value);
    }
  }
%}

%left '+' '-'
%left '*' '/'
%right '^'

%%

e
  : e '+' e    { $$ = $1 + $3 }
  | e '-' e    { $$ = $1 - $3 }
  | e '*' e    { $$ = $1 * $3 }
  | e '/' e    { $$ = $1 / $3 }
  | e '^' e    { $$ = $1 ** $3 }
  | '(' e ')'  { $$ = $2 }
  | ID '(' e ')'  { $$ = trig($1, $3) }
  | '-' e      { $$ = -$2 }
  | NUMBER     { $$ = Number($1) }
  ;

Let's make sure that decimal values work:

shell
Copied! ⭐️
syntax-cli -g calc.g -m lalr1 -p 'sin(3.14)'
# Parsed value: 0.0015926529164868282

Generating Parsers as JavaScript Modules

Using the command line is fine and all, but we'd like to generate a shiny new math expression parser module that we can use in JavaScript projects. We'll create a parser called calc-parser.js using the following command.

shell
Copied! ⭐️
syntax-cli -g calc.g -m lalr1 -o calc-parser.js

Then, we'll create a new file called calc-example.js that imports our new parser.

js
Copied! ⭐️
const parser = require('./calc-parser');

const result = parser.parse('0.5 + 2.5 * 3 - cos(0)');

console.log(result); // 7

We can then execute this script using Node.js:

shell
Copied! ⭐️
node calc-example
# Output: 7

Congratulations! You built your very own math expression parser using a grammar and parser generator! 🎉🎉🎉

Remember how long it took to build a math expression parser and evaluator using the shunting yard algorithm? Good times...

I think it took around four or five of my tutorials before we had a calculator that understood parentheses, decimal values, addition, subtraction, multiplication, division, exponentiation, and three trigonometric functions. Building a math expression parser using the shunting algorithm was rough compared to this, huh?

Once we understand how grammars are made and how parser generators work, building a math expression parser isn't so bad 🙂

Building Abstract Syntax Trees

We can modify the semantic actions in our grammar to produce abstract syntax trees (AST) instead of evaluating math expressions. As we learned in Part 8 of this tutorial series, ASTs provide a convenient intermediate representation (IR) of our math expression.

Instead of evaluating a math expression right away, we delay evaluation until later, so our users can implement their own evaluation logic. Maybe they want to log usage to a server or make balloons appear on the screen when a + or * is encountered. By giving our users access to an AST, it helps them hook into different spots of a math expression and maintain the correct order operations.

Let's create a new grammar file called calc-ast.g and add the following contents to produce AST nodes for each production rule.

text
Copied! ⭐️
%lex

%%

\s+                             /* skip whitespace */
^(?:\d+(?:\.\d*)?|\.\d+)        return 'NUMBER'
\w+                             return 'ID'

/lex

%{
  const functionList = ['sin', 'cos', 'tan'];

  function FunctionExpression(id, value) {
    if (functionList.includes(id)) {
      return {
        type: 'Function',
        name: id,
        value
      }
    }
  }

  function BinaryExpression(operator, left, right) {
    return {
      type: 'BinaryExpression',
      operator,
      left,
      right
    }
  }

  function UnaryExpression(value) {
    return {
      type: 'UnaryExpression',
      value
    }
  }

  function NumericLiteral(value) {
    return {
      type: 'Number',
      value: Number(value)
    }
  }
%}

%left '+' '-'
%left '*' '/'
%right '^'

%%

e
  : e '+' e    { $$ = BinaryExpression($2, $1, $3) }
  | e '-' e    { $$ = BinaryExpression($2, $1, $3) }
  | e '*' e    { $$ = BinaryExpression($2, $1, $3) }
  | e '/' e    { $$ = BinaryExpression($2, $1, $3) }
  | e '^' e    { $$ = BinaryExpression($2, $1, $3) }
  | '(' e ')'  { $$ = $2 }
  | ID '(' e ')'  { $$ = FunctionExpression($1, $3) }
  | '-' e      { $$ = UnaryExpression($2) }
  | NUMBER     { $$ = NumericLiteral($1) }
  ;

Notice how we created helper functions in the "moduleInclude" section for creating AST nodes. Every time e derives to (or reduces to) an expression, we return an AST node. By nature of the grammar, we are recursively evaluating semantic actions. The final result is a handy abstract syntax tree!

Let's run the following command to see the AST for the math expression, 1 + 2 * 3.

shell
Copied! ⭐️
syntax-cli -g calc-ast.g -m lalr1 -p '1 + 2 * 3'

We should see the following AST:

json
Copied! ⭐️
{
  "type": "BinaryExpression",
  "operator": "+",
  "left": {
    "type": "Number",
    "value": 1
  },
  "right": {
    "type": "BinaryExpression",
    "operator": "*",
    "left": {
      "type": "Number",
      "value": 2
    },
    "right": {
      "type": "Number",
      "value": 3
    }
  }
}

Next, we can build a JavaScript module using the following command.

shell
Copied! ⭐️
syntax-cli -g calc-ast.g -m lalr1 -o calc-ast-parser.js

Then, we can create a new JavaScript file called calc-ast-example.js and import our newly created parser.

js
Copied! ⭐️
const parser = require('./calc-ast-parser');

const result = parser.parse('1 + 2 * 3');

console.log(result);

We can execute this script using the following command:

shell
Copied! ⭐️
node calc-ast-example

Then, we should see the AST for 1 + 2 * 3.

js
Copied! ⭐️
{
  type: 'BinaryExpression',
  operator: '+',
  left: { type: 'Number', value: 1 },
  right: {
    type: 'BinaryExpression',
    operator: '*',
    left: { type: 'Number', value: 2 },
    right: { type: 'Number', value: 3 }
  }
}

Evaluating the AST

Since our AST is identical to the one we created in Part 8 of this tutorial series, we can reuse our AST evaluator from that tutorial.

Let's create a new file called calc-ast-evaluator.js and add the following contents.

js
Copied! ⭐️
const parser = require('./calc-ast-parser');

class NodeVisitor {
  visit(node) {
    switch (node.type) {
      case 'Number':
        return this.visitNumber(node);
      case 'BinaryExpression':
        return this.visitBinaryExpression(node);
      case 'UnaryExpression':
        return this.visitUnaryExpression(node);
      case 'Function':
        return this.visitFunction(node);
    }
  }

  visitNumber(node) {
    return node.value;
  }

  visitBinaryExpression(node) {
    switch (node.operator) {
      case '+':
        return this.visit(node.left) + this.visit(node.right);
      case '-':
        return this.visit(node.left) - this.visit(node.right);
      case '*':
        return this.visit(node.left) * this.visit(node.right);
      case '/':
        return this.visit(node.left) / this.visit(node.right);
      case '^':
        return this.visit(node.left) ** this.visit(node.right);
      default:
        throw new Error(`Invalid operation: ${node.operator}`);
    }
  }

  visitUnaryExpression(node) {
    return -this.visit(node.value);
  }

  visitFunction(node) {
    switch (node.name) {
      case 'sin':
        return Math.sin(this.visit(node.value));
      case 'cos':
        return Math.cos(this.visit(node.value));
      case 'tan':
        return Math.tan(this.visit(node.value));
      default:
        throw new Error(`Invalid function: ${node.name}`);
    }
  }
}

const input = '1 + 2 * 3';
const ast = parser.parse(input);
const visitor = new NodeVisitor();
const result = visitor.visit(ast);
console.log(result); // 7

After running node calc-ast-evaluator, we should get the result of 7. Obviously, this approach required a bit more work to evaluate math expressions, but it lets the user, who imports the library, decide how they want to use the AST. Inside each "visitor" function (i.e. visitNumber, visitBinaryExpression, etc.), we could have done something else besides evaluating the math expression. It gives more freedom to the user on how they want to use the AST.

Conclusion

In this tutorial, we learned how to build a grammar that can be fed into the Syntax parser generator. We used the Syntax tool to generate a math expression parser that can be imported into our JavaScript projects. This math expression parser behaves the same as the math expression parser we completed in Part 7 of my tutorial series where we used an implementation of the shunting yard algorithm.

We also learned how to use semantic actions in a grammar file to produce abstract syntax tree nodes for building an entire AST for math expressions. This AST generator was similar to the one we created in Part 8 of my tutorial series. We also learned how to evaluate the AST using the visitor pattern.

If you followed all my previous tutorials, I'm sorry for putting you through all that suffering when parser generators make it super easy to build a math expression parser 😅

Doing things the hard way first makes us appreciate the easy way much more! 😁

In the next tutorial, we'll learn how to build a LL parser and see why it's more challenging than building an LR parser. Until then, happy coding! 🙂