Learning Functional Programming #2: Creating the first Parser

Matthijs groenPosted by Matthijs Groen on 26-2-2021

This post is part 2 of the series “Learning Functional Programming using JavaScript”. Already read part 2? Part 3 of this Blog Post series is out now!

Earlier parts: Part 1.

So in the folder where I created my challenge.js file, I added:

1
2
3
git init .
yarn init
yarn add eslint --dev

eslint is a linting tool for ECMAScript, where you can define your own rules of what you consider to be “good” code.

Now that I added eslint, I could add custom rules, by updating my just generated package.json file:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
{
  "name": "json-parse-fp",
  "version": "1.0.0",
  "main": "index.js",
  "author": "Matthijs Groen <matthijs.groen@kabisa.nl>",
  "license": "MIT",
  "devDependencies": {
    "eslint": "^5.15.1"
  },
  "eslintConfig": {
    "parserOptions": {
      "ecmaVersion": 2018
    },
    "env": {},
    "extends": "eslint:recommended"
  }
}

In eslint, you normally specify an env in which the code will be run, like a browser, node or worker. This will automatically allow a lot of syntax specific to such an environment. By not specifying an env, I was restricting a lot of standard JS. But in this case even too much. Type objects like String, Array and Number were also disabled, and console.log was also not allowed.

(normally having no console.log is a good thing, it prevents debug or other info from remaining in your code when it is no longer needed).

But since I was not exporting anything from my file, I needed a way to see some output of my code. So I enabled console.log manually.

Additions to the eslint config:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
{
  "eslintConfig": {
    "globals": {
      "Symbol": "readonly",
      "Array": "readonly",
      "String": "readonly",
      "Number": "readonly",
      "console": "readonly"
    },
    "rules": {
      "no-console": "off"
    }
  }
}

The editor I’m using applies these rules directly, so I would have direct feedback whether I was breaking any of my own rules.

I even added some more rules to prevent cheating:

1
2
3
4
5
6
7
8
9
10
{
  "eslintConfig": {
    "rules": {
      "no-use-before-define": "error",
      "no-eval": "error",
      "no-implied-eval": "error",
      "no-restricted-globals": ["error", "JSON"]
    }
  }
}

So now when I would try to lint:

1
2
3
const data = '{ "foo": 42 }';
JSON.parse(data);
eval("JSON.parse(data)");

I would get the output:

1
2
3
4
  2:1  error  Unexpected use of 'JSON'  no-restricted-globals
  3:1  error  eval can be harmful       no-eval

✖ 2 problems (2 errors, 0 warnings)

Great! Let’s start!

Parsing the first character

Just like the Parser Combinator post, I started by creating a parser that could only parse a specific single character from an input (the letter a). On success, it should return the parsed result and the rest of the input. On failure, it should return an indication that it failed containing the message what it tried to achieve, and why it failed:

1
2
3
4
5
6
7
8
9
10
const FAILED = Symbol("Failed");

const aParser = ([head, ...tail]) =>
  head === "a"
    ? [head, tail]
    : [FAILED, "Error parsing 'a':", `Unexpected '${head}'`];

console.log(aParser("abc")); // [ 'a', [ 'b', 'c' ] ]
console.log(aParser("aabc")); // [ 'a', [ 'a', 'b', 'c' ] ]
console.log(aParser("bcd")); // [ Symbol(Failed), "Error parsing 'a':", "Unexpected 'b'" ]

Using the spread operator ([head, ...tail]) I would separate the first character from the rest, and compare it to the letter “a”. Notice that the tail is automatically transformed into an array of characters.

To change that we could parse more than just the “a”, it would be smart to turn the character we parse into a function argument:

1
2
3
4
5
6
7
8
const parser = (character, [head, ...tail]) =>
  head === character
    ? [head, tail]
    : [FAILED, `Error parsing '${character}':`, `Unexpected '${head}'`];

console.log(parser("a", "abc")); // [ 'a', [ 'b', 'c' ] ]
console.log(parser("a", "aabc")); // [ 'a', [ 'a', 'b', 'c' ] ]
console.log(parser("a", "bcd")); // [ Symbol(Failed), "Error parsing 'a':", "Unexpected 'b'" ]

The output of each console.log would be the same. But is it an improvement?

Yes and no.

Yes, we made the character to parse dynamic. And No, because we changed the function signature.

This last one would seem trivial, but it is not. The reason is that we want to combine the parser (hence: Parser Combinators). To combine parsers, they should all follow the same signature. Data string in, and Result and Remaining out, or an Error with details.

Currying

To get around this issue, we can use a concept, called Currying. As Wikipedia states it:

In mathematics and computer science, currying is the technique of translating the evaluation of a function that takes multiple arguments into evaluating a sequence of functions, each with a single argument.

This would actually mean that our function would not be “data” in, “data” out like I was used to. It would be “data” in, “function” out. Like a function factory.

A mathematical example:

1
2
3
4
5
6
7
8
const add = (a, b) => a + b;
add(1, 3); // 4

const addCurried = (a) => (b) => a + b;
const addOne = addCurried(1); // new function
addOne(3); // 4
addOne(5); // 6
addCurried(7)(4); // 11

This is a really powerful concept. Not only can we create a function that would return a letter parser in the parser example, this also means you can capture some state inside these functions!

Time to update the parser:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
const characterParser = (character) => ([head, ...tail]) =>
  head === character
    ? [head, tail]
    : [FAILED, `Error parsing '${character}':`, `Unexpected '${head}'`];

const aParser = characterParser("a");
const bParser = characterParser("b");

console.log(aParser("abc"));
console.log(aParser("abc")); // [ 'a', [ 'b', 'c' ] ]
console.log(aParser("aabc")); // [ 'a', [ 'a', 'b', 'c' ] ]
console.log(aParser("bcd")); // [ Symbol(Failed), "Error parsing 'a':", "Unexpected 'b'" ]
console.log(bParser("bcd")); // [ 'b', [ 'c', 'd' ] ]
console.log(aParser("")); // [ Symbol(Failed), "Error parsing 'a':", "Unexpected 'undefined'" ]

On desktop browsers you can follow along by opening the console and pasting the code in there.

Hmm that last one is not that nice. It is actually stating it reached the end of the file, so we change it into an EOF error:

1
2
3
4
5
6
7
8
9
10
const characterParser = (character) => ([head, ...tail]) =>
  head
    ? head === character
      ? [head, tail]
      : [FAILED, `Error parsing '${character}':`, `Unexpected '${head}'`]
    : [FAILED, `Error parsing: '${character}':`, "Unexpected EOF"];

const aParser = characterParser("a");

console.log(aParser("")); // [ Symbol(Failed), "Error parsing 'a':", "Unexpected EOF" ]

The error is improved, but now the construction of an error is duplicated. Let’s refactor that one:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
const parseError = (target) => (error) => [
  FAILED,
  `Error parsing: '${target}'`,
  error,
];

const characterParser = (character) => ([head, ...tail]) =>
  head
    ? head === character
      ? [head, tail]
      : parseError(character)(`Unexpected '${head}'`)
    : parseError(character)("Unexpected EOF");

const aParser = characterParser("a");

console.log(aParser("")); // [ Symbol(Failed), "Error parsing 'a':", "Unexpected EOF" ]

So now I can create a parser for any one character I would like. The next step would be creating a parser that would be a combination of multiple character parsers.

But, feeling more confident, it was time to enforce more rules:

What is a statement, what is an expression

You can see a statement as a line of code, in JavaScript terminated by a semi-colon (;).

1
2
3
const aVar = 2;
const myFunc = (number) => number * 6;
const result = aVar + 6 * myFunc(3);

This code has 3 statements: The definition of aVar, myFunc and result. The myFunc function, has 1 statement: number * 6.

An expression is anything that can return a value: 2, number, 6, aVar, myFunc(3). So in a single statement, you can use multiple expressions, and combine them even to produce new values. (using addition + or multiply * for example).

So the last line is actually 1 statement, containing 4 expressions.

Since we could do quite some work in a single statement, I wanted to limit the amount of statements for a lambda to 1. With having only 1 statement in a lambda, you can omit the function body and an explicit return statement.

1
2
3
4
5
6
const multiStatementLambda = (a) => {
  const b = 5;
  return a + b;
};

const singleStatementLamda = (a) => a + 5;

So now our eslint rules in the package.json look like this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
{
  "eslintConfig": {
    "parserOptions": {
      "ecmaVersion": 2018
    },
    "extends": "eslint:recommended",
    "env": {},
    "globals": {
      "Symbol": "readonly",
      "Array": "readonly",
      "String": "readonly",
      "Number": "readonly",
      "console": "readonly"
    },
    "rules": {
      "no-console": "off",
      "no-use-before-define": "error",
      "no-eval": "error",
      "no-implied-eval": "error",
      "no-restricted-globals": ["error", "JSON"],
      "max-statements": ["error", 1, { "ignoreTopLevelFunctions": false }],
      "complexity": ["error", { "max": 3 }],
      "arrow-body-style": ["error", "as-needed"],
      "no-restricted-syntax": [
        "error",
        {
          "selector": "FunctionExpression",
          "message": "Please use Lambda notation () =>"
        },
        {
          "selector": "IfStatement",
          "message": "Try using ternary operator: true ? 1 : 2"
        },
        {
          "selector": "VariableDeclaration[kind=let],VariableDeclaration[kind=var]",
          "message": "Only use constants"
        }
      ]
    }
  }
}

Conclusion

The code so far:

What I learned so far:

What did I like:

What did I not like:

Next time:

Part 3

Part 3 of this Blog Post series is out now!

Matthijs groen

Matthijs Groen

Full Stack Web Developer • Hobby Indie Game Developer Github: matthijsgroen • Twitter: @matthijsgroen

Bij Kabisa staat privacy hoog in het vaandel. Wij vinden het belangrijk dat er zorgvuldig wordt omgegaan met de data die onze bezoekers achterlaten. Zo zult u op onze website geen tracking-cookies vinden van third-parties zoals Facebook, Hotjar of Hubspot. Er worden alleen cookies geplaatst van Google en Vimeo. Deze worden gebruikt voor analyses, om zo de gebruikerservaring van onze websitebezoekers te kunnen verbeteren. Tevens zorgen deze cookies ervoor dat er relevante advertenties worden getoond. Lees meer over het gebruik van cookies in ons privacy statement.