Learning from My Own Tests

Uday Rana

Posted on November 11, 2024

This week I added tests to my project codeshift.

uday-rana / codeshift

A command-line tool that translates source code files into a chosen programming language.

codeshift

Codeshift is a command-line tool to translate and transform source code files between programming languages.

[Demo: translating an Express.js server to Rust]

Features

  • Select output language to convert source code into
  • Support for multiple input files
  • Output results to a file or stream directly to stdout
  • Customize model and provider selection for optimal performance
  • Supports leading AI providers

Requirements

  • Node.js (20.17.0 or later)
  • An API key from any of the following providers:
    • OpenAI
    • OpenRouter
    • Groq
    • any other AI provider compatible with OpenAI's chat completions API endpoint

Installation

  • Clone the repository with Git:

    git clone https://github.com/uday-rana/codeshift.git

    • Alternatively, download the repository as a .zip from the GitHub page and extract it
  • In the repository's root directory (where package.json is located), run npm install:

    cd codeshift/
    npm install

  • To run the program without prefixing node, run npm install -g . or npm link within the project directory:

    npm install -g .

To do this, I chose Jest because it's the most popular testing framework for JavaScript and a mature technology, which means there's plenty of great documentation, lots of examples, and a large ecosystem around it.

An alternative I considered was Vitest, because the last time I used Jest I had to figure out how to deal with TypeScript and ES modules. My friend Vinh wanted to set up Jest on his project too, and since his project uses ES module syntax, he ran into trouble: Jest's support for ES modules is still experimental. But my project uses CommonJS syntax, and Jest is still much more widely used, so I decided to stick with it.

Setting up Jest

  1. I installed Jest with npm: npm i -D jest

  2. I configured Jest globals in my ESLint flat config (eslint.config.js):

    ...
    import globals from "globals";
    
    export default [
      ...
      { languageOptions: { globals: { ...globals.node, ...globals.jest } } },
      ...
    ];
    
  3. I installed @types/jest for VS Code IntelliSense: npm i -D @types/jest

  4. I created a file called jest.config.js and set the verbose option to true. This makes it so that Jest reports on each individual test during the run.

    /** @type {import('jest').Config} */
    const config = {
      verbose: true,
    };
    
    module.exports = config;
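
With that in place, a throwaway test confirms everything is wired up. Running npx jest should print each test name, thanks to the verbose option. (The file below is just a sanity check, not part of codeshift.)

    // sum.test.js - a trivial test to confirm the Jest setup works
    const sum = (a, b) => a + b;

    test("adds two numbers", () => {
      expect(sum(2, 3)).toBe(5);
    });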
    

Testing LLM functionality

My program uses the OpenAI client to interface with various LLM providers, including OpenRouter, Groq, and OpenAI itself. To test this functionality, we were encouraged to use an HTTP mocking library like Nock, but it made more sense to me to mock the OpenAI client in Jest using jest.mock(). To do this, I had to move the initialization for the OpenAI client into a separate file so that the instance could be imported and mocked in tests, as sketched below.
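
The refactor looks roughly like this. The file layout and names here are illustrative, not the exact ones in codeshift:

    // ai/client.js - the client lives in its own module so tests can replace it
    const OpenAI = require("openai");

    module.exports = new OpenAI({
      baseURL: process.env.BASE_URL,
      apiKey: process.env.API_KEY,
    });

    // someModule.test.js - swap the real client for a mock via a module factory
    jest.mock("./ai/client", () => ({
      chat: { completions: { create: jest.fn() } },
    }));
    const client = require("./ai/client");

    test("sends the prompt to the chat completions endpoint", async () => {
      client.chat.completions.create.mockResolvedValue({
        choices: [{ message: { content: "translated code" } }],
      });
      // ...call the code under test and assert on what it did with the response
    });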

Learning from tests

While writing my tests, I ended up learning a few things about how my own code worked. For example, I was passing multi-line template literals in my prompt to the LLM, and when testing the prompt-building function, I learned that all of the indentation and newlines were being passed to the LLM. After a bit of research I learned that the newlines can be escaped with a \ like in a Unix shell, but as for the indentation, there's not much choice except to remove all indentation from the literal.
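
For example (the prompt text here is illustrative):

    const inputLanguage = "JavaScript";
    const outputLanguage = "Rust";

    // A trailing backslash escapes the newline, and the continuation line
    // must sit at the left margin, or its indentation ends up in the string
    const prompt = `Translate the following ${inputLanguage} code into \
    ${outputLanguage}. Reply with only the translated code.`;

    console.log(prompt); // a single line, no stray newlines or indentation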


I was using node:fs.stat() to check whether a config file existed before parsing it with node:fs.readFile(), and it turned out this was redundant, because both throw the same error if the file doesn't exist.
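
A single readFile() call in a try/catch covers both cases. This is a sketch, and the JSON parsing is just for illustration:

    const fs = require("node:fs/promises");

    async function loadConfig(configFilePath) {
      try {
        // readFile() rejects with ENOENT for a missing file, exactly like
        // stat() would, so a separate existence check adds nothing
        const contents = await fs.readFile(configFilePath, "utf8");
        return JSON.parse(contents);
      } catch (err) {
        if (err.code === "ENOENT") return null; // no config file: use defaults
        throw err;
      }
    }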


I have a module that selects a default model based on a provider base URL specified in an environment variable. While writing tests for it, I was confused by my own logic from when I wrote the module. After thinking about it from the perspective of possible test scenarios, I was able to simplify the logic a fair bit, which also made it much easier to understand.
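
I won't reproduce the real module here, but the simplified shape is close to this sketch (the provider-to-model mapping below is made up for illustration):

    // Map a provider base URL (from an environment variable) to a default model
    const defaultModels = [
      { match: "openrouter", model: "meta-llama/llama-3.1-8b-instruct" },
      { match: "groq", model: "llama-3.1-8b-instant" },
      { match: "openai", model: "gpt-4o-mini" },
    ];

    function selectDefaultModel(baseUrl = process.env.BASE_URL || "") {
      const entry = defaultModels.find(({ match }) => baseUrl.includes(match));
      return entry ? entry.model : null; // unknown provider: caller must pick a model
    }

    module.exports = selectDefaultModel;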


Also, when trying to set values on process.env before each test in order to test the model selection module, I noticed that values on process.env that were set to undefined or null would evaluate as truthy. It turns out Node.js coerces anything assigned to process.env into a string, so undefined becomes the string "undefined". I got around this by deleting the values before each test.
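
A quick demonstration (BASE_URL is just an illustrative variable name):

    process.env.BASE_URL = undefined;
    console.log(process.env.BASE_URL);   // the string "undefined"
    console.log(!!process.env.BASE_URL); // true - truthy!

    delete process.env.BASE_URL;
    console.log(process.env.BASE_URL);   // actually undefined now
    console.log(!!process.env.BASE_URL); // false

    // So in the tests:
    beforeEach(() => {
      delete process.env.BASE_URL; // reset by deleting, not by assigning undefined
    });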

I was worried that testing streamed responses from the LLM would be difficult, so without attempting it, I decided to make non-streamed responses the default and add a flag for requesting a streamed response. But I was able to test streamed responses successfully: for most tests I could use plain arrays instead of streams (there's an example after the error test below), and to test what happens when reading the stream fails, I used an async generator function.

  test("Should throw if error occurs reading response stream", async () => {
    const errorCompletion = (async function* () {
      yield new Error("Stream error");
    })();

    const exitSpy = jest.spyOn(process, "exit").mockImplementation();

    await writeOutput(errorCompletion, "output.txt", true, true);
    expect(exitSpy).toHaveBeenCalledWith(23);

    exitSpy.mockRestore();
  });
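And for the happy path, a plain array of chunk objects stands in for the stream, since for await...of accepts any iterable. In this sketch, writeOutput's argument order is copied from the error test above, and the exact chunk fields are my assumption:

  test("Should write streamed chunks to stdout", async () => {
    // for await...of iterates plain arrays too, so no real stream is needed
    const fakeCompletion = [
      { choices: [{ delta: { content: "hello " } }] },
      { choices: [{ delta: { content: "world" } }] },
    ];

    const writeSpy = jest
      .spyOn(process.stdout, "write")
      .mockImplementation(() => true);

    await writeOutput(fakeCompletion, null, true, false);
    expect(writeSpy).toHaveBeenCalledWith("hello ");

    writeSpy.mockRestore();
  });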

Writing tests for the function that handles output also helped me catch another edge case I'd missed while hastily making streamed responses optional: recording token usage for non-streamed responses.

  if (streamResponse) {
    await processCompletionStream(
      completion,
      outputFilePath,
      tokenUsageRequested,
      tokenUsage,
    );
  } else {
    // Forgot this part until I realized while writing my tests!
    // Non-streamed responses report usage directly on the completion object
    const {
      prompt_tokens = 0,
      completion_tokens = 0,
      total_tokens = 0,
    } = completion?.usage || {};
    tokenUsage = { prompt_tokens, completion_tokens, total_tokens };
    if (outputFilePath) {
      await fs.writeFile(outputFilePath, completion.choices[0].message.content);
    } else {
      process.stdout.write(completion.choices[0].message.content);
    }
  }

Conclusion

Even though I've done testing before in other courses and was already familiar with Jest, I'd never read the docs thoroughly or used the mocking functionality (working with servers in the past, I'd used superagent), so I learned a lot working on these tests. I think mocking and setup/teardown are incredibly useful features to have when writing tests.

I find testing invaluable for making sure no regressions sneak into the codebase, especially when working on a large project or in a team, and it can save tons of time. For my own projects, I like to practice test-driven development, and I intend to keep doing so.

That's it for this post. Thanks for reading!
