Break functions apart

For a while now I have been exploring the Clojure programming language. I ended up doing this almost exclusively driven by Rich Hickey's presentations.

Design, composition and performance is among my favorite ones, and there are two specific things that burned in my mind since I saw them:

Design means break things apart in such a way that they can be put back together.

…and

Whenever there is a problem, it's likely because we haven't broken things apart enough.

Since I saw that video, I've kept these in mind every time I would be looking at some piece of code — and it turns out n.2 applies so many times — way more than I was expecting.

This is a small walkthrough of a refactoring opportunity I came across recently while working on a fairly large codebase.

Suppose we have a node type and some functions that grabs a token from somewhere and — in case a specific path is defined, it will need to grab the parent node before continuing the processing:

type AppNode = {
  id: string;
  path: string;
};

declare function extractDataFromNode(n: AppNode): string;

async function getNodeData(
  startNode: AppNode,
  getToken: () => Promise<string>,
  getParentNode: (data: AppNode, token: string) => Promise<AppNode>
) {
  const token = await getToken();

  if (startNode.path === '/p') {
    const parent = await getParentNode(startNode, token);
    return extractDataFromNode(parent);
  }

  return extractDataFromNode(startNode);
}

This function will probably do its job, but frankly speaking, this is a mess in my eyes:

The function name claims to return the node data; on the other hand is also retrieving an authentication token from the system in case is necessary and will happily fallback to the parent node if required.
The function requires two additional function that are used to get the data in case the computation fails for some reason.

When asking/arguing why about the n.2 — the response I'd usually get is that such "pattern" will improve the testability, because I can pass mocks function when testing the code, something along these lines:

describe("returns parent data when path starts with /", () => {
  const tokenMock = jest.fn().mockResolvedValue("token");
  const parentNodeMock = jest.fn().mockResolvedValue({ id: "9", path: "main" });

  const data = getNodeData(
    { id: "0", path: "/s/10" },
    tokenMock,
    parentNodeMock
  );

  return expect(data).resolves.toEqual({}); // assertions
});

I find this not a valid argument; to me this is more a tape-patchy way to fix something that has been put together incorrectly.

Whenever there is a problem, it's likely because we haven't broken things apart enough.

First — let's move the token logic away from this function, and pass it as a regular argument.

async function getNodeData(
  startNode: AppNode,
  token: string,
  getParentNode: (data: AppNode, token: string) => Promise<AppNode>
) {
  if (startNode.path === "/p") {
    const parent = await getParentNode(startNode, token);
    return extractDataFromNode(parent);
  }

  return extractDataFromNode(startNode);
}

By changing the function signature to ask for the token directly instead of taking it on our own, we have effectively moved a non pure operation out of this function. That is a good thing.

At this stage, we might noticed that the token argument is used exclusively to feed the getParentNode function — that we're passing as an argument.

We can do something about it:

- getParentNode: (data: node, token: string) => Promise<node>
+ getParentNode: (token: string) => (data: AppNode) => Promise<AppNode>

Now we can externally bind the token, and get rid of such dependency:

async function getNodeData(
  startNode: node,
  getParentNode: (data: AppNode) => Promise<node>
) {
  if (startNode.path === "/p") {
    const parent = await getParentNode(startNode);
    return extractDataFromNode(parent);
  }

  return extractDataFromNode(startNode);
}

…and then call the token-free function.

const tokenizedGetParentNode = getParentNode("tokenValue");

getNodeData(
  { id: "test", path: "/p/10" },
  tokenizedGetParentNode
);

The code is now inarguably better; still, we're passing the getParentNode — which is not good.

We could reiterate with the same solution we applied above to the getParentNode function, so that we'd be passing the parent node ahead of the time; but it might be a waste of resources: the parent node is only required on certain conditions.

Sometimes sacrificing performances/efficiency for readability is a good call. However for this use case we're going to assume that fetching a parent node is a very expansive operation that we really want to avoid.

Let's break this function apart and reduce its scope to only work with the current node. If the parent node is required, we're going to make this fail and the caller will have to sort this out:

async function getNodeData(startNode: AppNode) {
  if (startNode.path === '/p') {
    return false;
  }

  return extractDataFromNode(startNode);
}

and then the caller can do its job to understand if the parent node call is required or not.

We have now successfully broken the things apart, with all the pieces doing ONE thing. It's now time to put them back together.

declare function getParentNode: (data: AppNode) => RTE.ReaderTaskEither<string, Error, AppNode>;
declare function extractDataFromNode(n: AppNode): string;

function getNodeData(startNode: AppNode) {
  if (startNode.path === "/p") {
    return E.left(new Error("parent node required"));
  }

  return E.right(extractDataFromNode(startNode));
}

const getParentNodeData = (startNode: AppNode) =>
  pipe(getParentNode(startNode), RTE.map(extractDataFromNode))(token);

const node: AppNode = { id: "10", path: "/path" };

return pipe(
  TE.fromEither(getNodeData(node)),
  TE.orElse(() => getParentNodeData(node))
);

Now go, and break your functions apart.

Then break your systems apart.

Then put them back together.

Blog

Break functions apart

Vincenzo Chianese

Join Our Newsletter. No Spam, Only the good stuff.

Related