This post gives a brief introduction to writing a simple parser in Elm.
We will be parsing Bible references: a Bible reference is a shorthand that lets one quickly look up a specific verse or range of verses in the text of the Bible.
Sounds simple...
What's a reference look like?
A Bible reference can be broken down into a start location and an end location. Each location consists of a book, a chapter and a verse.
So a reference might look like Genesis 1:1 - Exodus 2:1, which tells us to start at the first verse of the first chapter of Genesis and end at the first verse of the second chapter of Exodus.
But the reference might also look like any of these:
Genesis 1 - A whole chapter
Genesis 1:1 - A single verse
Genesis 1:1-20
Genesis 1:20-2:24
Genesis 1-5 - Multiple whole chapters
Genesis 1 - Exodus 5
Genesis 1:1 - Exodus 5:20
Genesis 1:1 - Exodus 5
Genesis 1 - Exodus 5:20
Additionally, some books of the Bible only have a single chapter (e.g. Jude) and, by convention, the chapter number is dropped from the reference. So Jude 2 is the second verse of the first (and only) chapter of Jude, not all of Jude chapter 2.
We'll aim to handle all of these cases when we write our parser.
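To make the target of the parser concrete, here is a minimal sketch of one way a reference could be modelled. The `Book`, `Location` and `Reference` definitions below are assumptions for illustration (the book list is heavily truncated) and are not necessarily the types used in the real package.

```elm
module ReferenceShape exposing (..)

-- Hypothetical, heavily truncated book type: a real one would list every book.


type Book
    = Genesis
    | Exodus
    | Jude



-- One end of a reference: a book, a chapter and a verse.


type alias Location =
    { book : Book
    , chapter : Int
    , verse : Int
    }



-- A reference is a start location plus an end location.


type alias Reference =
    { start : Location
    , end : Location
    }



-- "Genesis 1:1 - Exodus 2:1"


genesisToExodus : Reference
genesisToExodus =
    { start = Location Genesis 1 1
    , end = Location Exodus 2 1
    }



-- "Jude 2": Jude has a single chapter, so the 2 is a verse, not a chapter.


judeTwo : Reference
judeTwo =
    { start = Location Jude 1 2
    , end = Location Jude 1 2
    }
```

Whatever shorthand the user types, the goal is to expand it into a fully specified start and end like the two values above.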
How do we do parsing in Elm?
elm/parser is a super nice parsing library written by the creator and maintainer of Elm. I won't go into details on it here - there is a nice tutorial and conference talk if you want to dig deeper.
We will parse the Bible reference in two steps:
We parse the string into a list of statements
Then we validate the list of statements to check it is a valid reference
Getting a list of statements
A Bible reference can contain spaces, colons, hyphens, book names and numbers, so we define:
type Statement
    = BookName Book
    | Num Int
    | Dash
    | Colon
and a parser to turn a string into a list of statements:
import Parser as P exposing ((|.), (|=))


{-| A `List Statement` parser. We use `P.loop` to consume the whole string
-}
parser : P.Parser (List Statement)
parser =
    P.loop [] statementsHelp


statementsHelp : List Statement -> P.Parser (P.Step (List Statement) (List Statement))
statementsHelp revStmts =
    P.oneOf
        [ P.succeed (\stmt -> P.Loop (stmt :: revStmts))
            |. P.spaces
            |= statement
            |. P.spaces
        , P.succeed ()
            |> P.map (\_ -> P.Done (List.reverse revStmts))
        ]


{-| A `Statement` parser
-}
statement : P.Parser Statement
statement =
    P.oneOf
        [ P.map BookName (P.oneOf bookTokensList)
        , P.map (\_ -> Dash) (P.symbol "-")
        , P.map (\_ -> Colon) (P.symbol ":")
        , P.map Num P.int
        ]
With this parser we can now turn a string into a List Statement by running it with P.run.
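As a rough sketch of what running the parser looks like, the module below wires the parser up to `P.run` from elm/parser. The three-book `Book` type and this `bookTokensList` are hypothetical stand-ins (the real list covers every book and its abbreviations); the parser itself is repeated only so the snippet stands alone.

```elm
module ReferenceTokens exposing (..)

import Parser as P exposing ((|.), (|=))


-- Hypothetical, truncated book type for this sketch.
type Book
    = Genesis
    | John
    | Jude


type Statement
    = BookName Book
    | Num Int
    | Dash
    | Colon


-- Hypothetical stand-in for the real `bookTokensList`: one parser per book name.
bookTokensList : List (P.Parser Book)
bookTokensList =
    [ P.map (\_ -> Genesis) (P.keyword "Genesis")
    , P.map (\_ -> John) (P.keyword "John")
    , P.map (\_ -> Jude) (P.keyword "Jude")
    ]


parser : P.Parser (List Statement)
parser =
    P.loop [] statementsHelp


statementsHelp : List Statement -> P.Parser (P.Step (List Statement) (List Statement))
statementsHelp revStmts =
    P.oneOf
        [ P.succeed (\stmt -> P.Loop (stmt :: revStmts))
            |. P.spaces
            |= statement
            |. P.spaces
        , P.succeed ()
            |> P.map (\_ -> P.Done (List.reverse revStmts))
        ]


statement : P.Parser Statement
statement =
    P.oneOf
        [ P.map BookName (P.oneOf bookTokensList)
        , P.map (\_ -> Dash) (P.symbol "-")
        , P.map (\_ -> Colon) (P.symbol ":")
        , P.map Num P.int
        ]


-- Ok [ BookName Genesis, Num 1, Colon, Num 1 ]
verseExample : Result (List P.DeadEnd) (List Statement)
verseExample =
    P.run parser "Genesis 1:1"


-- Ok [ BookName John, Num 3, Colon, Num 16, Dash, Num 18 ]
rangeExample : Result (List P.DeadEnd) (List Statement)
rangeExample =
    P.run parser "John 3:16-18"
```

Note that the loop stops at the first thing it cannot tokenize rather than failing, so this step alone says nothing about whether the statements form a sensible reference.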
If parsing succeeds, we will have a list of statements, like [BookName Genesis, Colon, Num 1] or [BookName John, Colon, Num 2, Dash, Num 2], etc. But there is nothing to guarantee that we have a valid collection of statements. For example, we could have [Colon, Colon, Colon], which is obviously not valid, or [BookName Genesis, Num 52], which appears to be valid, but Genesis only has 50 chapters, so it is invalid.
So we need a function processStatements : List Statement -> Result String Reference that will validate our list of statements. This function is rather large, because it has to account for all the possible formats and handle single-chapter books, but it is essentially one big case expression:
processStatementsHelp : List Statement -> Result String Reference
processStatementsHelp stmts =
    case stmts of
        -- Gen
        [ BookName bk ] ->
            reference bk 1 1 bk (numChapters bk) (numVerses bk (numChapters bk))

        -- Gen 1
        [ BookName bk, Num ch ] ->
            if numChapters bk == 1 then
                -- single-chapter book: the number is a verse, not a chapter
                reference bk 1 ch bk 1 ch

            else
                -- a whole chapter runs from verse 1 to the last verse of that chapter
                reference bk ch 1 bk ch (numVerses bk ch)

        -- truncated for brevity (full function can be seen: https://github.com/monty5811/elm-bible/blob/2.0.0/src/Internal/Parser.elm#L38-L243)
        -- Genesis - Revelation
        [ BookName startBk, Dash, BookName endBk ] ->
            reference startBk 1 1 endBk (numChapters endBk) (numVerses endBk (numChapters endBk))

        [] ->
            Err "No reference found"

        _ ->
            Err <| "No valid reference found"
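To see the single-chapter convention in isolation, here is a small self-contained sketch of just the "book plus one number" case. Everything in it is an assumption for illustration: the two-book `Book` type, the nested `Location`/`Reference` records, the name `processBookAndNumber`, and a `numVerses` table that only knows the chapters used here.

```elm
module SingleChapter exposing (..)

-- Hypothetical two-book world, enough to show both branches.


type Book
    = Genesis
    | Jude


type Statement
    = BookName Book
    | Num Int
    | Dash
    | Colon


type alias Location =
    { book : Book, chapter : Int, verse : Int }


type alias Reference =
    { start : Location, end : Location }


numChapters : Book -> Int
numChapters bk =
    case bk of
        Genesis ->
            50

        Jude ->
            1


-- Only the chapters used in this sketch are filled in; anything else is 0.
numVerses : Book -> Int -> Int
numVerses bk ch =
    case ( bk, ch ) of
        ( Genesis, 1 ) ->
            31

        ( Genesis, 2 ) ->
            25

        ( Jude, 1 ) ->
            25

        _ ->
            0


processBookAndNumber : List Statement -> Result String Reference
processBookAndNumber stmts =
    case stmts of
        [ BookName bk, Num n ] ->
            if numChapters bk == 1 then
                -- "Jude 2": n is a verse inside the only chapter
                Ok (Reference (Location bk 1 n) (Location bk 1 n))

            else
                -- "Genesis 2": n is a chapter, so take all of its verses
                Ok (Reference (Location bk n 1) (Location bk n (numVerses bk n)))

        _ ->
            Err "No valid reference found"
```

With these definitions, `[ BookName Jude, Num 2 ]` becomes a single verse (Jude 1:2), while `[ BookName Genesis, Num 2 ]` becomes the whole chapter (Genesis 2:1 to 2:25).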
Now we have a Reference that contains a start book, start chapter, start verse, end book, end chapter and end verse. But we haven't checked that all of these are in order (e.g. the reference cannot end before it starts), so we use one last function to validate the reference:
validateRef : Reference -> Result String Reference
validateRef ref =
    validateBookOrder ref
        |> Result.andThen validateChapterOrder
        |> Result.andThen validateVerseOrder
        |> Result.andThen validateChapterBounds
        |> Result.andThen validateVerseBounds

-- see each validate function here: https://github.com/monty5811/elm-bible/blob/2.0.0/src/Internal/Parser.elm#L363
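The chain works because each validator has the type Reference -> Result String Reference, so `Result.andThen` only runs the next check if the previous one returned `Ok`. The sketch below shows the idea with two simplified validators; the record shape, the `bookIndex` helper and the error messages are assumptions made for this example, not the package's actual code.

```elm
module ValidateSketch exposing (..)

-- Hypothetical two-book world and record shape for illustration.


type Book
    = Genesis
    | Exodus


type alias Location =
    { book : Book, chapter : Int, verse : Int }


type alias Reference =
    { start : Location, end : Location }


-- Position of a book in canonical order (hypothetical helper).
bookIndex : Book -> Int
bookIndex bk =
    case bk of
        Genesis ->
            1

        Exodus ->
            2


validateBookOrder : Reference -> Result String Reference
validateBookOrder ref =
    if bookIndex ref.start.book > bookIndex ref.end.book then
        Err "End book is before start book"

    else
        Ok ref


validateChapterOrder : Reference -> Result String Reference
validateChapterOrder ref =
    if ref.start.book == ref.end.book && ref.start.chapter > ref.end.chapter then
        Err "End chapter is before start chapter"

    else
        Ok ref


-- The first Err short-circuits the rest of the chain.
validateRef : Reference -> Result String Reference
validateRef ref =
    validateBookOrder ref
        |> Result.andThen validateChapterOrder
```

For instance, a reference from Exodus 1:1 to Genesis 1:1 fails at the first step, and Genesis 5:1 to Genesis 2:1 passes the book check but fails the chapter check.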
Finally! We have a validated Bible reference!
Note: I think it should be possible to move all of this validation inside the parser and do everything in one step, but I think keeping parsing and validation as two separate steps is the cleaner approach.
Conclusion
This post has shown you how to create a parser in Elm that validates Bible references. Hopefully this will help you get started building a parser of your own.
If you don't care about building a parser and just want an Elm package to do this for you, then check out monty5811/elm-bible, which provides a parser, nice formatting and a compact encoder/decoder.