The road to the return of the investment
Vincenzo Chianese
Posted on May 27, 2020
This series is about sharing some of the challenges and lessons I learned during the development of Prism and how some functional concepts taken from Haskell lead to a better product.
Note: As of January 2021, I no longer work at Stoplight and I have no control over the current status of the code. There is a fork on my GitHub account that represents the state of the project when I left the company.
In the previous post I introduced fp-ts in Prism with the logging as the primary use case. In this post we'll take a look at how the usage of fp-ts slowly spread in the whole codebase, how we misunderstood some concepts, how some of the coworkers took the adoption of fp-ts
and how it helped us refactor problematic parts.
First Expansion: Router
Time passed after the merging of the PR introducing fp-ts
in Prism; in the meantime teams in Stoplight were reshuffled a little bit. As a result, I got a new teammate on the project. Curiously, he was previously working on the initial new Prism design; then he was reallocated somewhere else when I took Prism and now he was coming back.
Essentially, I had a fresh member to onboard on the new direction I wanted to give to the code base. I quickly realized this was an incredible occasion to show the company that picking up functional concepts is not an impossible mission and I wanted to play my cards in the most efficient way.
As the first step for the onboarding, I decided to let my new comrade review a PR I would write that would migrate a component to a functional approach.
From there, I would then observe his reactions and of course answer his questions.
This time identifying the next possible candidate component to refactor was easy. As I explored in the part 1, Prism has the following components:
- Router
- Input Validator
- Negotiator
- Output Assembler
- Output Validator
The negotiator was partially done already in the first PR introducing fp-ts in Prism, and I was well aware that the validation (both input and output) would require a major refactor since they were all state-class based and objectively complicated more on this later.
I decided to go with the router. Being the first part in the whole flow it would have almost no dependencies from the previous steps, meaning that there would not be plumbing code and/or weird wrappers to match inputs and outputs. Further, its logic was not complicated and the refactor was exclusively to bring it into the functional world, with no changes on its behaviours; this way my comrade would only review effective fp-ts
related changes.
Expand fs-ts in Prism's router #402
The following PR extends the usage of fp-ts to the routing package as well by basically making sure it does not throw exceptions anymore, but rather use the Either
object to express an error object.
With this — the router and the mocker finally compose because the type match (they both return an Either<Error, T>
.
Extend the Either
usage to the router was indeed the easy part: https://github.com/stoplightio/prism/pull/402/files#diff-f9a10b37616fb5669ecd5218fc8535c9L16
The problem started when I started to integrate and try to compose the new function in the mega-file-to-split:
-
The whole flow is sync apart from the edge case when we need to employ the forwarder, and this requires an additional abstraction layer https://github.com/stoplightio/prism/pull/402/files#diff-47c5dc2d65fd624c869f5f08d0cfb56aR45
-
What's really preventing from having a clean and functional flow is the validation process that's basically creating an empty array, giving it to the mocker and expecting to receive a filled array. This forces me to keep some stuff here and some stuff there; if the mocker could just return the validations, that'd improve the code a lot.
-
In order to keep the API compatible with what we have, I have to do some wrapping I'd like to avoid https://github.com/stoplightio/prism/pull/402/files#diff-47c5dc2d65fd624c869f5f08d0cfb56aR98
That said, the funny thing is that, although this Pull Request is meant to be an improvement, you can argue that the code is effectively uglier than it is. (Well I do not think it is, but you mileage may vary)
The good news though is that — I'm not sure if you remember, we were discussing on how to refactor this part and nobody (me included) really came up with good ideas.
By trying to extend the functional parts to the router — I now know exactly what needs to be done and how to move forward. This is freaking awesome, to be honest.
The conversation was not that long and chatty as the first one. I also remember there was almost no conversation at all in our internal Slack channel.
It is hard to tell why exactly. It could either be because the team assimilated the concepts or maybe they "resigned" on the fact that this was happening and so arguing would not have changed a lot.
I find the first one very improbable and I would say the truth is in between but clearly leaning on the latter hypothesis. The regret I have today is not asking this explicitly instead of taking advantage of the situation to merge the PR in right away.
My teammate observed:
the code looks complicated because it is long and deeply nested
It is interesting because the code has been long and deeply nested since forever. fp-ts
made that thing visible to a point it could not be ignored anymore. We’ll see an example of a successful refactor later.
The feedback I was receiving in our internal channels was that generally the code would look dirtier than it was previously. This was mostly happening at the "edges" between the regular code and the functional one because of required bridging to maintain compatibility with the current Prism functionality.
For instance, there was a function of Prism that could have thrown an exception in case something went wrong.
function foo() {
// a lot of code
if (!condition)
throw new Error('This is not ok');
// a lot of code again
if (!anotherCondition)
throw new Error('This is not ok');
}
When such a part got refactored to use Either<Error, T>
exceptions wouldn't be thrown at all. On the other hand, whoever would have called this function might have relied on the thrown exception. As long as all the callers wouldn't have been refactored, foo
would always have to ultimately throw. This is what I called "bridging".
For the foo
function, the bridging would probably look like this
import * as E from 'fp-ts/lib/Either';
import { pipe } from 'fp-ts/lib/pipeable';
Import { identity } from ‘lodash’;
function foo() {
pipe(
operation1(arg1, arg2),
E.chain(result => operation2(result.outstandingBalance)),
E.chain(operation3),
+ E.fold(error => { throw error }, identity)
);
}
There were cases with some monads where the bridging code would look even uglier. On the positive side, this would clearly communicate to the developer that this function was still impure exclusively because of something relying on the impure behaviour. This facilitated the search for refactoring opportunities significantly.
Return Of Investment: Validation
Finally after some time we got the downpayment of a series of returns of investment given by the employment of fp-ts
in Prism.
I have already stated that validation in Prism is hard, and the way it was initially implemented in Prism made it even harder. We complained and tried to do something about it (with no results) multiple times:
https://github.com/stoplightio/prism/pull/351#discussion_r293254707
You can see that ultimately the whole team would agree that passing on the opportunity would be the best idea for the time being, since it would be too much time consuming.
The real deal was that nobody knew where to start. That piece of code was terrible, but fp-ts
gave me the key to move on and finally refactor that part of the validation.
One of the good things when using category theory constructs is that things tend to naturally compose. It's like having two pieces of code with a magnet at the extremities: they naturally want to bond. Such property suggests to you that, when things do not compose, something is probably not going well.
Let’s take yet another look to some of the Prism components:
- Router (fp-ts-ized)
- Input Validator
- Negotiator (fp-ts-ized)
We were fundamentally in the situation where two pieces that wanted to compose (the router and the negotiator) couldn't because the Validator had not the right interface. The lack of composability became the driving factor that I used to refactor the input validation.
What happened was fascinating: I was about to ask for suggestions in the Slack channel where I used (and still use) to hangout and talk about functional stuff. While writing the message, I wrote the solution without getting any input from outside:
The last sentence I wrote is kind of memorable
I think fp-ts made me understand why the code is so horrible and how I should move forward.
This, indeed, ultimately happened some time ago:
https://github.com/stoplightio/prism/pull/862
When It Went Wrong: Security Checks
This is not a story where we did everything right. It would mean that's either invented or it omits details. Although I am inclined to say we did most of the things right, we clearly made some mistakes along the journey.
One of these was the porting of the security checks. It's essentially the part that checks if the call can go through the negotiator for mocking or being rejected with a 401
.
This part didn't go very well. Although after the conversion the feature was still working correctly from a functionality standpoint, the resulting code was really hard to reason about, resulting in two additional refactors to bring it back to a reasonable state.
Looking at this now I think there were some factors that brought the things out of control:
- Although familiar with functional concepts, the developer working on the security checks didn't grasp enough of it to be able to complete the feature alone. This resulted in frustration on his side and ultimately brought him to switch in the “get it done, no matter what” mode.
- I also had some blanks. For instance, I thought that passing a
Left<T>
as a function argument would be totally legit — it turns out that 99% it is not. - In order to get the feature done and give some relief to the comrade, I lowered my code review standards and merged it in anyway.
This episode costed me a lot of time to clean it up:
https://github.com/stoplightio/prism/pull/804
https://github.com/stoplightio/prism/pull/830
The second PR, although the changes are minimal, took me ages to put together. I still have a vivid remembering of me working on it. The logic behind that was so complicated that I would lose the context quickly and had to restart from scratch. Multiple times.
What are the lessons learned?
- It is inevitable that things will go wrong during any journey. Keep that in consideration and allocate some time to clean up stuff.
- Short term solutions will bring long term conflicts. Short term conflicts will bring long term solutions. I decided to give my coworker a relief by merging something that wasn’t really ok. I had to pay that back with a very high interest rate.
In the next article, we'll respond to some of the FAQ that I have received while talking about this and showing Prism around.
Posted on May 27, 2020
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.