Building a parser combinator: the `char` parser.

0xc0der

Posted on June 25, 2024

In the previous post, we introduced an implementation of the Parser class. This post is about some basic parsers.

If we break the parsing process down into its simplest components, a pattern emerges: these components represent the simplest operations a parser can perform, and they can be combined to form larger and more complicated patterns.

First, we need the most basic and most important parser of all: a parser that matches one character.

the char parser

It matches one character in the input string.

We need to define a parser using our Parser class from before.

const char = char =>
    new Parser(state => {
        // logic goes here
    });

Then, we need to match the current character against the given one. Here I'll use a RegExp to do the matching.

const char = char =>
    new Parser(state => {
        const match = new RegExp(`^${char}$`).test(state.charAt(state.index));
    });

After that, the parser returns a new state with updated position and status.

const char = char =>
    new Parser(state => {
        const match = new RegExp(`^${char}$`).test(state.charAt(state.index));

        return state
            .withStatus(1 << (!match + !match))
            .withIndex(state.index + match);
    });
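The return statement above leans on JavaScript's boolean-to-number coercion, which is easy to misread. As a small sketch, the two expressions can be evaluated on their own for both outcomes. The helper names `statusFor` and `advanceFor` are made up for this illustration, and what the resulting status values mean depends on the state class, which is not shown in this post.

```javascript
// Hypothetical helpers that isolate the two expressions from the parser.
const statusFor = match => 1 << (!match + !match);
const advanceFor = (index, match) => index + match;

console.log(statusFor(true));      // 1  (!true + !true is 0, and 1 << 0 is 1)
console.log(statusFor(false));     // 4  (!false + !false is 2, and 1 << 2 is 4)
console.log(advanceFor(3, true));  // 4  (true coerces to 1, so the index advances)
console.log(advanceFor(3, false)); // 3  (false coerces to 0, so the index stays)
```

So a successful match yields status 1 and moves the index forward by one, while a failed match yields status 4 and leaves the index where it was.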

char takes a "regex" that represents one character as input, matches the current character against it, then returns a new state based on the result.

In the coming posts, I'll discuss what exactly the state is, and implement more complex parsers.

For the full code, take a look at:
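To try the parser in isolation, here is a minimal runnable sketch. The `State` and `Parser` classes below are hypothetical stand-ins written only for this example; they assume the state exposes `charAt`, `index`, `withStatus` and `withIndex` as used above, and they are not the actual classes from the previous post or from the library. The `run` method and the `pattern` parameter name are likewise assumptions.

```javascript
// Hypothetical stand-in for the state: immutable, holds the input string,
// the current index, and a numeric status.
class State {
    constructor(input, index = 0, status = 0) {
        this.input = input;
        this.index = index;
        this.status = status;
    }
    charAt(i) { return this.input.charAt(i); }
    withStatus(status) { return new State(this.input, this.index, status); }
    withIndex(index) { return new State(this.input, index, this.status); }
}

// Hypothetical stand-in for the Parser class: wraps a state => state function.
class Parser {
    constructor(transform) { this.transform = transform; }
    run(input) { return this.transform(new State(input)); }
}

// The char parser from this post, with the parameter renamed to `pattern`
// so it does not shadow the function name.
const char = pattern =>
    new Parser(state => {
        const match = new RegExp(`^${pattern}$`).test(state.charAt(state.index));

        return state
            .withStatus(1 << (!match + !match))
            .withIndex(state.index + match);
    });

const digit = char('[0-9]');
const ok = digit.run('7up');
const fail = digit.run('up7');

console.log(ok.index, ok.status);     // 1 1
console.log(fail.index, fail.status); // 0 4
```

Because the argument is interpolated into a regular expression, metacharacters keep their regex meaning: `char('.')` matches any single character, so a literal dot would need to be written as `char('\\.')`.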

0xc0Der / pari

More than a simple parser combinator.

install with npm.

npm i pari

usage and basic parsers

you can read the source in src/. it's self-documenting and easy to read.

here is a simple overview.

import {
  char,
  firstOf,
  sequence,
  zeroOrOne,
  oneOrMore,
  zeroOrMore
} from 'pari';

// the `char` parser matches one char.
// it takes a `regex` that matches exactly one char.

const digit = char('[0-9]');

// `firstOf` parser returns the first match in a list of parsers.

const lowerCase = char('[a-z]');
const digitOrLwcase = firstOf([digit, lowerCase]);

// `sequence` parser matches a list of parsers in sequence.

const hex = char('[0-9a-fA-F]');
const byteHex = sequence([char('0'), char('x'), hex, hex]);

thanks for reading 😄.
