Abstracting remote pagination through TypeScript generators

kbirgerdev

Kirill Birger

Posted on February 18, 2023

Background

One of my passions is writing code that is clean and maintainable. Several things go into writing code that is easy to understand and update, and a major one that I coach people on is managing complexity.

As engineers, we face complex tasks on a daily basis, and it is easy to embrace the chaos. However, I think we've all been in a situation where we are asked to add a feature to some code only to find functions that span several screens, with a dozen variables, and types like PersonObject | boolean.

One thing that can help us with such code is to encapsulate complexity using any of a number of patterns that we have been exposed to. However, some things are easier to abstract than others.

Overview

In this post, I'd like to propose to you a nice technique for dealing with a paginated API when you have to iterate through an indeterminate number of results.

Pagination is a great solution for retrieving variable amounts of data from an API. Dealing with pagination logic is not the most challenging task many readers will face, but it is an unnecessary distraction in most cases.

Therefore, we will discuss a technique which lets you conceal that logic from downstream code using a JavaScript/TypeScript concept known as generators. The code in this post is in TypeScript, but the idea is equally applicable to JavaScript projects.

Generators

This topic is widely covered on the web and in other blog posts, so I will be brief about the overall concept and focus more on a specific set of use cases. If you want a deep dive into Generators, I recommend reading the following post first:

dev.to: Generators in Typescript
This is a great post explaining the details

Additional resources are found at the end.

Brief overview

We are all likely familiar with code resembling the following:

const people: Person[] = fetchPeople();

for(const person of people) {
   printPerson(person);
}

Because people is an array, it is an Iterable, and we can use the for-of syntax to iterate over it. However, arrays are not the only iterable type in JS/TS.

You can implement a generator function, which executes normally until it reaches a yield expression. At that point, execution pauses and the function hands (yields) a value back to the caller. When the caller asks for the next value, execution resumes where it left off. Consider the following:

function* fetchPeople() {
  yield 'Alice';
  yield 'Bob';
}

for (const person of fetchPeople()) {
  console.log(person);
}

// Prints:
// Alice
// Bob

The * following the function keyword marks this as a generator function: calling it returns a generator object instead of running the body right away. TypeScript models that return value with the Generator type, for example Generator<string, void, unknown> for a generator that yields strings.
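To make the pause-and-resume behaviour more tangible, here is a minimal sketch that drives a generator by hand with next() instead of for-of. The names twoNames and nameIterator are made up for this illustration.

```typescript
// Hypothetical generator for illustration, similar in spirit to fetchPeople above.
function* twoNames(): Generator<string, void, unknown> {
  yield 'Alice';
  yield 'Bob';
}

// Calling the function runs none of its body yet; it only creates the generator object.
const nameIterator = twoNames();

// Each call to next() resumes the body until it reaches the next yield.
const first = nameIterator.next();  // { value: 'Alice', done: false }
const second = nameIterator.next(); // { value: 'Bob', done: false }
// Once the body finishes, done becomes true and value is undefined.
const last = nameIterator.next();   // { value: undefined, done: true }

console.log(first, second, last);
```

A for-of loop performs these next() calls for you and stops as soon as done is true.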

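TypeScript also supports async generators (AsyncGenerator), which are declared with async function* and consumed with for await:

async function* fetchPeopleAsync() {
  // inside an async generator, a yielded promise is awaited before its value is handed out
  yield Promise.resolve('Alice');
  yield Promise.resolve('Bob');
}

for await (const person of fetchPeopleAsync()) {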
  console.log(person);
}

// Prints:
// Alice
// Bob

I will provide links at the bottom of this post that go into this topic in more depth.
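As a slightly more concrete sketch of asynchronous iteration, the example below awaits between yields, the way a generator backed by a remote call would. The delay helper and the names fetchNamesSlowly and collectNames are assumptions made for this illustration, and the 10 ms timeout merely stands in for network latency.

```typescript
// Hypothetical helper: resolves after `ms` milliseconds, standing in for network latency.
function delay(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// An async generator may await between yields.
async function* fetchNamesSlowly(): AsyncGenerator<string, void, unknown> {
  for (const personName of ['Alice', 'Bob']) {
    await delay(10); // pretend each name comes from a remote call
    yield personName;
  }
}

// Collects every yielded name into an array.
async function collectNames(): Promise<string[]> {
  const names: string[] = [];
  // for await waits for each value to arrive before running the loop body.
  for await (const personName of fetchNamesSlowly()) {
    names.push(personName);
  }
  return names;
}

collectNames().then((names) => console.log(names)); // [ 'Alice', 'Bob' ]
```

The consuming loop reads like ordinary iteration, even though every value arrives asynchronously.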

Conventional pagination code

This is the definition of a Widget:

export interface Widget {
  id: number;
  name: string;
  price: number;
}

Let's say we have a repository class for retrieving Widgets from a remote data source. There are tens of thousands of widgets, so it is impractical to return all of them at once. You have been assigned a task to retrieve and print every widget whose name contains the word "cat" (this is the Internet, after all, and we love cats).

You search the API documentation, and much to your chagrin, there is no filtering capability on the remote API. :sadkitty:
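If you want to experiment with the code in this post without a real API, a small in-memory stand-in for the repository may help. The sketch below is an assumption, not the real data source: the class name InMemoryWidgetRepository is made up, and the page shape (items plus total) is inferred from how the examples in this post read page.items and page.total.

```typescript
interface Widget {
  id: number;
  name: string;
  price: number;
}

// Assumed shape of one page: the items on it plus the total count in the source.
interface WidgetPage {
  items: Widget[];
  total: number;
}

// Hypothetical in-memory stand-in for the remote repository.
class InMemoryWidgetRepository {
  private readonly widgets: Widget[];

  constructor(widgets: Widget[]) {
    this.widgets = widgets;
  }

  // Returns up to `take` widgets, starting at offset `index`.
  async getSome(index: number, take: number): Promise<WidgetPage> {
    return {
      items: this.widgets.slice(index, index + take),
      total: this.widgets.length,
    };
  }
}

const demoRepository = new InMemoryWidgetRepository([
  { id: 1, name: 'cat tree', price: 40 },
  { id: 2, name: 'dog bed', price: 30 },
  { id: 3, name: 'cat toy', price: 5 },
  { id: 4, name: 'fish tank', price: 60 },
]);

// The last page is shorter than the requested size: 1 item out of a total of 4.
demoRepository.getSome(3, 3).then((page) => console.log(page.items.length, page.total)); // 1 4
```

Note that there is still no filtering here, so any "cat" search has to happen on the client.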

You may end up writing code resembling this:

/**
 * Retrieves an array of `Widget`s from a remote source
 * @param nameFilter substring that a widget's name must contain
 */
async findWidgets(nameFilter: string) {
  // we fetch widgets 3 at a time
  const take = 3;

  let widgets: Widget[] = [];

  // offset of the next unread widget in the remote source
  let index = 0;
  let total: number | null = null;
  // keep going until we have read every widget the remote source reports
  while (total == null || index < total) {
    // take one "page" of widgets
    const page = await this.repository.getSome(index, take);
    // track our total, so we do not make extra requests
    total = page.total;
    // an empty page means there is nothing left to read, so stop instead of looping forever
    if (page.items.length === 0) {
      break;
    }

    // filter the page
    const filteredPage = page.items.filter(w => w.name.includes(nameFilter));
    index += page.items.length;
    // most succinct, though not the most efficient way to add several items to an array
    widgets = [...widgets, ...filteredPage];
  }

  // return an ARRAY of widgets
  return widgets;
}

That's a lot of code. We have to manage the state of iterating over our data source, we have to make sure our conditions are correct for doing so, and we have to use a while loop. If we needed to do something like limit the number of widgets we return, or paginate back to the caller of findWidgets, it gets even more complicated.

We also collect everything into an array and return it, which can be memory intensive.

Adding a generator

As opposed to arrays, which are stored eagerly in memory in their entirety, generators give the caller of your function or method the ability to control the flow of data. In the above example, suppose the caller of findWidgets looks at the results, evaluates them in some way, and decides that only the first 3 are needed. We have then wasted the time spent retrieving all of the remaining widgets, as well as the memory required to store them in the array.

What if we used a generator for exposing a different API for retrieving widgets?

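/**
 * Returns an `AsyncGenerator` for widgets
 * @param bufferSize number of widgets to fetch at a time from remote
 */
async function* getAll(bufferSize: number = 3) {
  // offset of the next unread widget in the remote source
  let index = 0;
  let total: number | null = null;
  // `repository` is assumed to be in scope here, e.g. a module-level instance
  while (total == null || index < total) {
    // take one "page" of widgets
    const page = await repository.getSome(index, bufferSize);
    // track our total, so we do not make extra requests
    total = page.total;
    // an empty page means there is nothing left to read, so stop instead of looping forever
    if (page.items.length === 0) {
      break;
    }
    // hand the widgets on this page to the caller one at a time
    for (const widget of page.items) {
      yield widget;
    }

    index += page.items.length;
  }
}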

This is not our filter function yet, although it resembles it very much. Here we are separating our fetching logic out from our filter. There are a few important things to note:

  • We have removed our return statement
  • We have removed our array, so we no longer store a big buffer in memory
  • We now fetch a page of Widgets and yield them one at a time. This is known as buffering.

We can now write any number of functions that operate on some subset of our widgets iteratively and never repeat paging logic or have to store extra data in memory.

async function findWidgets(nameFilter: string) {
  const matchedWidgets: Widget[] = [];
  for await (const widget of getAll(3)) {
    if(widget.name.includes(nameFilter)) {
      matchedWidgets.push(widget);
    }
  }

  return matchedWidgets;
}

This version of the function pulls widgets from the generator on demand, and the generator fetches a new page only when the current one is exhausted. It gives us a nice iterable interface for scanning widgets. It also shows that anyone who uses our getAll function can fetch widgets at their own rate, and only as many as they want, without having to pass additional parameters to getAll.

Unfortunately, in this version we are still collecting the widgets into an array, which can get large in memory. But using the techniques discussed so far, we can improve this further.

async function* findWidgets(nameFilter: string) {
  for await (const widget of getAll(3)) {
    if(widget.name.includes(nameFilter)) {
      yield widget;
    }
  }
}

Now, we can even compose and chain these functions the same as we do with arrays.

async function* findWidgetsByName(nameFilter: string, widgets: AsyncGenerator<Widget, void, unknown>) {
   for await (const widget of widgets) {
     if(widget.name.includes(nameFilter)) {
       yield widget;
     }
   }
}

async function* findWidgetsByPrice(maxPrice: number, widgets: AsyncGenerator<Widget, void, unknown>) {
   for await (const widget of widgets) {
     if(widget.price <= maxPrice) {
       yield widget;
     }
   }
}

async function catWidgetApp() {
  // For feline lovers on a budget
  const cheapCatWidgets = 
    findWidgetsByPrice(
      25.0,
      findWidgetsByName(
        'cat',
        getAll(3)
      )
    );

  for await (const widget of cheapCatWidgets) {
    // imagine this function uses a stdin library like inquirer to present an interactive prompt to the user asking if they want to buy the widget
    const { shouldBuy } = await promptUser(widget);

    // if the user elects to buy this widget, we do so, and exit. Otherwise, we continue to iterate, possibly fetching more widgets
    if(shouldBuy) {
      await buyWidget(widget);
      return;
    }
  }
}
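To check that stopping early really avoids extra requests, here is a rough sketch that counts page fetches. Everything in it is an assumption made for illustration: createCountingSource is a made-up in-memory source of 9 widgets, getAllFrom mirrors the getAll idea but takes its source as a parameter, and takeFirst is a hypothetical helper that stops pulling after a fixed number of items.

```typescript
interface Widget { id: number; name: string; price: number; }
interface WidgetPage { items: Widget[]; total: number; }
interface PagedSource { getSome(index: number, take: number): Promise<WidgetPage>; }

// Hypothetical paged source over 9 in-memory widgets that counts the pages it serves.
function createCountingSource() {
  const widgets: Widget[] = Array.from({ length: 9 }, (_, i) => ({
    id: i + 1,
    name: `widget ${i + 1}`,
    price: 10 * (i + 1),
  }));
  let pagesFetched = 0;
  return {
    async getSome(index: number, take: number): Promise<WidgetPage> {
      pagesFetched += 1;
      return { items: widgets.slice(index, index + take), total: widgets.length };
    },
    pagesFetched: () => pagesFetched,
  };
}

// Same paging loop as getAll, written against an explicit source parameter.
async function* getAllFrom(source: PagedSource, pageSize = 3): AsyncGenerator<Widget, void, unknown> {
  let index = 0;
  let total: number | null = null;
  while (total === null || index < total) {
    const page = await source.getSome(index, pageSize);
    total = page.total;
    if (page.items.length === 0) break; // nothing left to read
    for (const widget of page.items) {
      yield widget;
    }
    index += page.items.length;
  }
}

// Yields at most `limit` items, then stops pulling from the underlying iterable.
async function* takeFirst<T>(limit: number, items: AsyncIterable<T>): AsyncGenerator<T, void, unknown> {
  if (limit <= 0) return;
  let taken = 0;
  for await (const item of items) {
    yield item;
    taken += 1;
    if (taken >= limit) return; // leaving the loop also closes the upstream generator
  }
}

async function demoEarlyExit() {
  const source = createCountingSource();
  const ids: number[] = [];
  for await (const widget of takeFirst(2, getAllFrom(source, 3))) {
    ids.push(widget.id);
  }
  return { ids, pagesFetched: source.pagesFetched() };
}

demoEarlyExit().then((result) => console.log(result)); // { ids: [ 1, 2 ], pagesFetched: 1 }
```

Only one of the three pages is requested, because the consumer stops asking before the first page runs out.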

Conclusion

This example shows the elegance and simplicity of using generators for cases where we would like to abstract away the complexity of extracting data from a data source from the code consuming that data. In the above function, we compose three async generators to produce a final one that we iterate over. I have also illustrated our ability to control the rate of flow arbitrarily from calling code. That is, we do not fetch all of our widgets at once. We may fetch the first set of 3 widgets, and then pause for several minutes, while the user makes a decision. If the user gets through the first 3 widgets and does not want to buy them, only then do we fetch anything else. Best of all, this is completely transparent to the calling code.

Did you enjoy this article? Do you have any other ideas for the usage of Generators? Are there any other subjects you'd like to see me write about? Please comment below!

Further resources on generators

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/AsyncGenerator

https://www.typescriptlang.org/docs/handbook/release-notes/typescript-3-6.html

https://basarat.gitbook.io/typescript/future-javascript/generators

https://www.typescriptlang.org/docs/handbook/release-notes/typescript-2-3.html
