Parsing JSON in ReScript Part I: Prerequisites and Requirements

webbureaucrat

Posted on November 17, 2020

There are few things more satisfying than a slick, readable, and safe JSON parser. It's one of the joys of functional programming. Using a good JSON parsing pipeline can feel like magic. This series seeks to lift the veil and empower readers (and, importantly, my future self) to build their own customizable and extensible parsing libraries. This article, the first of several, will be a skimmable introduction to the subject as I see it.

Prerequisites

Unfortunately, I'm probably not yet a skilled enough writer to write this article for a very junior audience. In order to follow this series, you should be fairly familiar with:

  • ReScript syntax
  • functions as parameters
  • Result monads
  • the ReScript Js.Json library

If you aren't familiar with all of these, feel free to read on, and if you get stuck on something, as always, feel free to open an issue or @ me, and I'll take another swing at it.

Examining the need for a custom parsing solution in ReScript

As always, it's reasonable to ask why I'm reinventing the wheel here. I have a few reasons for wanting a custom solution for my use case which I will enumerate here, but while I'm at it, I'd like to just say: I really love parsers. They're fun, and you may find you like them, too.

Why not just use the built-in Js.Json parsing methods?

Let me say, first of all, that I'm going to use Js.Json, but if you try to build a nontrivial parser out of the default Js.Json module, you'll end up with quite a bit of nesting, and that quickly gets difficult to manage. I would want an additional wrapper around Js.Json if my objects had more than just a couple of properties.
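To make the nesting concrete, here is a minimal sketch of decoding just two numeric properties with nothing but Js.Json and Js.Dict. The point record and the parsePoint name are hypothetical, made up for this illustration.

```rescript
// Hypothetical record, for illustration only.
type point = {x: float, y: float}

// Decoding two fields with only Js.Json and Js.Dict:
// every property adds two more levels of nesting.
let parsePoint = (json: Js.Json.t): option<point> =>
  switch Js.Json.decodeObject(json) {
  | None => None
  | Some(dict) =>
    switch Js.Dict.get(dict, "x") {
    | None => None
    | Some(xJson) =>
      switch Js.Json.decodeNumber(xJson) {
      | None => None
      | Some(x) =>
        switch Js.Dict.get(dict, "y") {
        | None => None
        | Some(yJson) =>
          switch Js.Json.decodeNumber(yJson) {
          | None => None
          | Some(y) => Some({x: x, y: y})
          }
        }
      }
    }
  }
```

Two properties already cost five levels of switch, and a failure gives no hint about which property was the problem.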

Why not use an existing solution?

A casual search of npm shows there are plenty of handy JSON parsing helpers in ReScript (formerly BuckleScript/ReasonML). They're all good! If they suit your use case, you should use one of them. However, there may be times when none of them suits your use case. I ran into one of those recently, for three reasons:

  1. The API I'm calling uses dates in ISO-8601, and I needed them in both ISO-8601 and in POSIX time. Libraries tend to pick one or the other.
  2. The API I'm calling uses numeric strings instead of numbers.
  3. Most libraries parse data into option monads, but I'd like to have some fairly granular logging so I can quickly tell what failed if my parser isn't configured right or if the API ships a breaking change, so I'd like to use a Result type with a string error message.

The first two of these problems could be resolved if I separated my concerns more. I could have a separate record type that contains all the fields from the API as strings, the way the API presents them, and then define an additional translation layer to go from API models to my data models.

While I understand the argument for doing something like this, I don't think this is always the best route. Parsing date strings and numeric strings into dates and numbers is logic that belongs in the parsing layer, not in some additional, separate business layer.
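As a rough sketch of what that parsing-layer logic might look like, the two decoders below turn a numeric string into a float and an ISO-8601 string into POSIX time, each returning a Result with a string error. The names numericString and posixOfIso are hypothetical, and I'm assuming POSIX time in milliseconds, which is what Js.Date.getTime returns.

```rescript
// Hypothetical decoder: a JSON string such as "12.5" becomes a float.
// Belt.Float.fromString relies on JavaScript's float parsing, so it is
// lenient about trailing characters.
let numericString = (json: Js.Json.t): Belt.Result.t<float, string> =>
  switch Js.Json.decodeString(json) {
  | None => Belt.Result.Error("expected a string")
  | Some(s) =>
    switch Belt.Float.fromString(s) {
    | Some(n) => Belt.Result.Ok(n)
    | None => Belt.Result.Error("not a numeric string: " ++ s)
    }
  }

// Hypothetical decoder: an ISO-8601 string becomes POSIX time.
// Assumption: milliseconds since the epoch, as Js.Date.getTime reports.
let posixOfIso = (json: Js.Json.t): Belt.Result.t<float, string> =>
  switch Js.Json.decodeString(json) {
  | None => Belt.Result.Error("expected a string")
  | Some(s) =>
    let ms = Js.Date.fromString(s)->Js.Date.getTime
    // An unparseable date yields NaN rather than an exception.
    if Js.Float.isNaN(ms) {
      Belt.Result.Error("not a parseable date: " ++ s)
    } else {
      Belt.Result.Ok(ms)
    }
  }
```

Because both functions have the same shape (Js.Json.t in, Result out), the same date property can be decoded twice, once as a string and once as POSIX time.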

Defining the requirements of our parsing library

The main point of any parsing library is to use functions to flatten the nested logic required to cover the success and failure cases of decoding each property. Secondly, as I've said above, I want this library to return a Result type with a nice error message I can decide to log if the parse fails.

I also want it to conform to the ReScript convention of a pipe-first structure, and unlike many pipelines, I'd like to start with some defaults and build the pipeline incrementally.
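Here is a minimal sketch of those requirements using only Belt.Result: start from a record of defaults, then fill in one property per pipe-first step, with any failure carrying a string message. The floatField helper and the point record are hypothetical and are not part of the library this series builds.

```rescript
// Hypothetical record, for illustration only.
type point = {x: float, y: float}

// Hypothetical helper: read a float property, or explain why that failed.
let floatField = (dict: Js.Dict.t<Js.Json.t>, key: string): Belt.Result.t<float, string> =>
  switch Js.Dict.get(dict, key) {
  | None => Belt.Result.Error("missing property: " ++ key)
  | Some(json) =>
    switch Js.Json.decodeNumber(json) {
    | Some(n) => Belt.Result.Ok(n)
    | None => Belt.Result.Error("not a number: " ++ key)
    }
  }

// Start from defaults and update one field per step. flatMap skips the
// remaining steps as soon as one of them returns an Error.
let parsePoint = (dict: Js.Dict.t<Js.Json.t>): Belt.Result.t<point, string> =>
  Belt.Result.Ok({x: 0., y: 0.})
  ->Belt.Result.flatMap(p => floatField(dict, "x")->Belt.Result.map(x => {...p, x: x}))
  ->Belt.Result.flatMap(p => floatField(dict, "y")->Belt.Result.map(y => {...p, y: y}))
```

The pipeline stays flat no matter how many properties are added, and the first failing step determines the error message.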

For reference, here's an example of how I'm using my own library. This is just an abbreviated version of Parsing.res which defines parsers for a few models.

open Belt.Float; // for * multiplication and / division

// Defaults the pipeline starts from; each Decode.req call overwrites one field.
let initializeRollingCaseRate: Models.rawRollingCaseRate = {
    dateStr: "",
    posix: 0.,
    caseRate: 0.
};

// Parses one JSON object into a rawRollingCaseRate, or an error message.
let parseRollingCaseRate = (json: Js.Json.t): Belt.Result.t<Models.rawRollingCaseRate, string> =>
    switch Js.Json.classify(json) {
    | Js.Json.JSONObject(dict) =>
        Belt.Result.Ok(initializeRollingCaseRate)
        -> Decode.req("date", Decode.str, dict, (obj, dateStr) => {
            ...obj, dateStr: dateStr |> Js.String.substring(~from=0, ~to_=10)}) // strip time.
        -> Decode.req("date", Decode.posix, dict, (obj, posix) => {
            ...obj, posix: posix })
        -> Decode.req("cases_rate_total", Decode.numeric, dict, (obj, caseRate) => {
            ...obj, caseRate: caseRate })
    | _ => Belt.Result.Error("Parse error: not an object. ")
    };

// Reducer: keeps successfully parsed items and logs (then drops) failures.
let flatLog = (accumulator: array<'t>, item: Belt.Result.t<'t, 'error>) => {
    switch item {
    | Belt.Result.Ok(ok) => accumulator |> Js.Array.concat( [ok] )
    | Belt.Result.Error(error) => { Js.log(error); accumulator }
    }
};

// Parses the root JSON array; items that fail to parse are logged and skipped.
let parseRawRollingCaseRates = (json: Js.Json.t): Belt.Result.t<array<Models.rawRollingCaseRate>, string> =>
    switch Js.Json.classify(json) {
    | Js.Json.JSONArray(jsons) =>
        Belt.Result.Ok(jsons |> Js.Array.map(parseRollingCaseRate) |> Js.Array.reduce(flatLog, []) )
    | _ => Belt.Result.Error("Parse issue: root not array. ")
    };

There's a lot going on here (perhaps too much), so let me break it down. The req function takes a Result of a record model. If that Result is Ok, it tries to read (1) the property named by the given string, using (2) the given function that decodes a JSON value into a properly parsed member, from (3) the given dictionary, and then uses the parsed value to update the record with (4) the given function. It returns another Result, which can be piped into another Decode.req, and the game begins again. The upshot is one call to Decode.req for each property of the Models.rawRollingCaseRate I'm decoding in parseRollingCaseRate.
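For readers who want something concrete before the next post, here is one guess at what a function with that shape could look like. It is only a sketch inferred from how Decode.req is called in the example, not the actual implementation. In particular, I'm assuming that decoders such as Decode.str return a Result, and the error messages and the person record are made up.

```rescript
// A guess at the shape of req, inferred from its call sites.
let req = (
  result: Belt.Result.t<'model, string>,
  key: string,
  decode: Js.Json.t => Belt.Result.t<'a, string>, // assumption: decoders return a Result
  dict: Js.Dict.t<Js.Json.t>,
  update: ('model, 'a) => 'model,
): Belt.Result.t<'model, string> =>
  switch result {
  | Belt.Result.Error(_) => result // an earlier step failed; pass the error along
  | Belt.Result.Ok(model) =>
    switch Js.Dict.get(dict, key) {
    | None => Belt.Result.Error("Parse error: missing required property " ++ key)
    | Some(json) =>
      switch decode(json) {
      | Belt.Result.Ok(value) => Belt.Result.Ok(update(model, value))
      | Belt.Result.Error(message) =>
        Belt.Result.Error("Parse error in " ++ key ++ ": " ++ message)
      }
    }
  }

// Hypothetical string decoder in the assumed shape.
let str = (json: Js.Json.t): Belt.Result.t<string, string> =>
  switch Js.Json.decodeString(json) {
  | Some(s) => Belt.Result.Ok(s)
  | None => Belt.Result.Error("expected a string")
  }

// Hypothetical record and parser showing the pipeline in use.
type person = {first: string, last: string}

let parsePerson = (dict: Js.Dict.t<Js.Json.t>): Belt.Result.t<person, string> =>
  Belt.Result.Ok({first: "", last: ""})
  ->req("first", str, dict, (obj, first) => {...obj, first: first})
  ->req("last", str, dict, (obj, last) => {...obj, last: last})
```

Putting the Result first in the parameter list is what lets each call slot into a pipe-first chain.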

In Conclusion

I hope this has been a useful introduction to my thinking on parsers to contextualize the code I will introduce in the coming posts. The next post will introduce some underlying utilities that will form the building blocks of our parsing library.
