Optimizing Node.js Code Coverage with NYC in Docker Containers

keploy

keploy

Posted on August 22, 2024

Optimizing Node.js Code Coverage with NYC in Docker Containers

Image description

There are particular difficulties in getting NYC coverage from Node.js operating in Docker containers. This blog discusses the procedures needed to operate Node.js clusters in Docker and produce reliable code coverage reports.

How to Get Coverage nativally?

Running NYC locally is straightforward with a simple command:



"start": "nyc node ./dist/main.js",


Enter fullscreen mode Exit fullscreen mode

However, there are more stages involved in getting support within Docker. Let’s explore the nuances of this.

What are the Challenges with Signal Handling in Docker Containers ?

Sending a docker an interrupt signal is quite a deal in our case. Reason? Instead of running a binary I’m running a script(to talk about later)

You see my dear readers, what I’ve to face? my docker container doesn’t die, but why? Its because of the signal Handling in Docker and shell behaviour.

  1. PID 1 in Containers:In a Docker container, the CMD or ENTRYPOINT process runs as PID 1. This process is responsible for handling system signals.When using sh or bash to run a script, sh becomes the PID 1 process inside the container.

  2. Signal Forwarding: The shell (sh) does not automatically forward signals to child processes. Instead, it handles the signals itself, often not propagating them to the processes it spawns.

    When you press Ctrl+C, Docker sends a SIGINT signal to the PID 1 process. If sh is the PID 1 process, it might not forward the signal to the actual application process (e.g., node, yarn start).

  3. Zombie Processes: If the shell does not correctly forward the signal to the child processes, those processes continue running. This can lead to zombie processes and an unresponsive container.

How to solve this? A simple trap signal and passing it to child processes should work fine.



#!/bin/sh

set -a
. ".env.$ENVIRONMENT_NAME"
set +a

# Function to handle SIGINT signal
handle_int() {
    echo "SIGINT received, forwarding to child process..."
    kill -INT "$child" 2>/dev/null
    exit 0
}

# Trap SIGINT signal
trap 'handle_int' INT TERM

sleep 10  # Ensure services like DB are up

# Start yarn and get its PID
yarn start &

child=$!
echo "Started yarn with PID $child"

# Wait for child process to finish
wait "$child"

# Copy nyc output to host directory (optional, if you need it)


Enter fullscreen mode Exit fullscreen mode

From within the container this is what all processes look like:

But is this trap and send enough?No it is not.

It surely does kill the container but doesn’t do the graceful shutdown. It kills the PID 1 and moves ahead without child processes to wrap up, something we’re really looking here for.

Reason for this issue:-

  1. Shell Limitations: The default shell (sh) might not forward signals to child processes correctly. Instead, it might terminate itself, leaving child processes orphaned.

  2. Signal Propagation: The shell script (run.sh) directly receives the signal and may not propagate it to the yarn start process effectively.

Then how to send the signal to the yarn start which gets our coverage?

What are effective solutions for Handling Signals ?

We need to ensure that the default kill signal is converted to a stop signal, and signal is passed to child processes. Finally, we also intend to copy data from docker image to host machine. We can follow the below steps to handle these signals and ensure that the interrupt signal reaches the process groups.

  1. Using dumb-init for Better Signal Handling

  2. Handling Signal Traps in Shell Scripts

  3. Copying NYC Coverage Data from Docker Container

Using dumb-init for Better Signal Handling

Dumb-init is a simple process supervisor designed to run as PID 1 inside Docker containers. It handles reaping zombie processes and forwards signals to all processes in its process group.

Why dumb-init helps:

  • Signal Forwarding: When the Docker container receives a SIGINT signal, dumb-init forwards it to the shell script and all child processes.

  • Process Management: dumb-init ensures proper reaping of zombie processes, which the default shell might not handle well.



<STAGE-1-BUILD>
...
<STAGE-2-BUILD>
...

# Set working directory
ENTRYPOINT ["dumb-init", "--"]

# RUN echo "hi"
STOPSIGNAL SIGINT

# Set entrypoint and command
CMD [ "sh","./scripts/migrate-and-run.sh"]

# Expose port 9000
EXPOSE 9000


Enter fullscreen mode Exit fullscreen mode

Thus the usage of dumb-init helps us gracefully kill the yarn start and its subsequent workers.

Ensuring NYC Coverage Data is Written Correctly

One key mistake I made was handling NYC coverage across different locations: building the coverage in one place, copying it to another for execution, and reading the report in yet another. This inconsistency was problematic

How to handle Signal Traps in Shell Scripts ?

As you saw earlier, I run the build file, which includes build functions and maps them to my code. When running a multi-stage build, the directory may differ, yet the build still works.

In the example below, the application is built in the app-build directory. In the second stage of the Docker build, it is executed via the script. For building purposes, the exact location doesn't matter much, but for NYC, it does.

NYC relies on consistent paths to correctly map coverage data to the source files. Therefore, ensuring that the context of the build, execution, and reading paths remain the same is crucial for accurate NYC coverage reporting.

So when I try to get the coverage I see something like, reason? NYC doesn’t has the context of the code in the second stage of build, therefore I get no line coverage.

So our dockerfile needs the same path that’s in pre build vs post build something looks like



<STAGE-1-BUILD>

...
<STAGE-2-BUILD>
...

WORKDIR  /app-build
# Set working directory
ENTRYPOINT ["dumb-init", "--"]

# RUN echo "hi"
STOPSIGNAL SIGINT

# Set entrypoint and command
CMD [ "sh","./scripts/migrate-and-run.sh"]


Enter fullscreen mode Exit fullscreen mode

And our coverage in docker container looks like this:

Lets copy this data locally and see what happens:

But this isn’t what I saw in screenshot above. why is the coverage coming to be zero?

You see the build and working context of directory isn’t same as my local machine thus NYC isn’t able to fetch me the correct coverage. How to solve it?

Keep the current path of repository, build and the working directory same. As you see I take the path as argument and build over it to keep the context always same for my NYC to output correct data.



<STAGE-1-BUILD>
ENV APP_PATH=${APP_PATH:-/default/path} 
...
<STAGE-2-BUILD>
...
# Set working directory
WORKDIR ${APP_PATH}
ENTRYPOINT ["dumb-init", "--"]

# RUN echo "hi"
STOPSIGNAL SIGINT

# Set entrypoint and command
CMD [ "sh","./scripts/migrate-and-run.sh"]


Enter fullscreen mode Exit fullscreen mode

The command to build the docker image:



sudo -E docker build --build-arg ENVIRONMENT_NAME=docker --build-arg BUILD_NAME=docker --build-arg APP_PATH=$(pwd) -t custom_app .


Enter fullscreen mode Exit fullscreen mode

The result of this, when I copy the data on my local machine it gives me the desired output.

The importance of script

In script we catch the signal and handle it. If it wasn’t done we would see something like this ~

What just happened here now? You see when program is stopped NYC needs some time to write the final json file from memory to local folder. This extra small delay is something we won’t wait for. This causes this error.



#!/bin/sh
# Function to handle SIGINT signal
handle_int() {
    echo "SIGINT received, forwarding to child process..."
    kill -INT "$child" 2>/dev/null
    echo "Exiting after delay..."
    exit 0
}

# Trap SIGINT signal
trap 'handle_int' INT TERM

sleep 10  # Ensure services like DB are up

# Start yarn and get its PID
yarn start &

child=$!
echo "Started yarn with PID $child"

# Wait for child process to finish
wait "$child"

# Copy nyc output to host directory (optional, if you need it)


Enter fullscreen mode Exit fullscreen mode

To solve the above code and wait for child processes to die with an update handle function.



#!/bin/sh
# Function to handle SIGINT signal
handle_int() {
    echo "SIGINT received, forwarding to child process..."
    kill -INT "$child" 2>/dev/null
    echo "Waiting for child process to exit..."
    wait "$child"
    echo "Child process exited. Waiting for NYC coverage data to be fully written..."
    sleep 10  # Wait for 50 seconds
    echo "Exiting after delay..."
    exit 0
}

# Trap SIGINT signal
trap 'handle_int' INT TERM

sleep 10  # Ensure services like DB are up

# Start yarn and get its PID
yarn start &

child=$!
echo "Started yarn with PID $child"

# Wait for child process to finish
wait "$child"

# Copy nyc output to host directory (optional, if you need it)


Enter fullscreen mode Exit fullscreen mode

How to Copy NYC Coverage Data from Docker Container ?

Would a normal volume mounting work? Nope, it won’t because when you mount the file NYC clears the old files and then you will find some log like this

So either you can set the cleaning to be false something like this.



    "start": "nyc --clean=false --max-old-space-size=4096 --reporter=lcov --reporter=text  node ./dist/main.js",


Enter fullscreen mode Exit fullscreen mode

Or instead of this you can manually copy the file using docker command.



docker cp custom_apps:$(pwd)/.nyc_output $(pwd)/.nyc_output


Enter fullscreen mode Exit fullscreen mode

Conclusion

  1. Passing SIGINT to Container and Its Sub-Processes

    When running Node.js applications in Docker containers, it’s crucial to handle SIGINT signals correctly. Using a process manager like dumb-init ensures that SIGINT signals are forwarded to all child processes. This allows for a clean shutdown of the application, ensuring that all necessary cleanup tasks are performed.

  2. Setting the Correct Context of Location for NYC to Calculate Coverage

    NYC, the code coverage tool for Node.js, relies on having a consistent file path for both the build and execution phases. When using multi-stage Docker builds, make sure that the paths used in each stage are identical. This prevents issues with NYC being unable to locate source files, which can result in inaccurate coverage reports.

  3. Adding Delay in Docker Stop to Ensure Data is Copied Out

    Adding a delay when stopping a Docker container ensures that all data, such as NYC coverage reports, is fully written and can be copied out. This can be achieved by trapping SIGINT signals in the container’s entry script, forwarding the signal to child processes, and waiting for a short period before exiting.

  4. Incase of docker-compose handling the docker containers make sure to use below code in dockerfile.




STOPSIGNAL SIGINT


Enter fullscreen mode Exit fullscreen mode

The above code snippet will help you send the interrupt signal to the docker container so that the containers is interrupt instead of terminated.

Our final dockerfile looks something like this:



FROM node:14-alpine
# Set working directory
WORKDIR APP_PATH
COPY . .

ENTRYPOINT ["dumb-init", "--"]

STOPSIGNAL SIGINT

# Set entrypoint and command
CMD [ "sh","./scripts/migrate-and-run.sh"]

# Expose port 9000
EXPOSE 9000

Enter fullscreen mode Exit fullscreen mode




FAQs

Why doesn’t my Docker container stop when I press Ctrl+C?

This often happens because the shell script running as PID 1 in the container doesn’t propagate the signal to child processes. The parent process terminates but leaves the child processes running, leading to zombie processes. Using a process manager like dumb-init ensures proper signal forwarding and process management.

How do I ensure that NYC code coverage reports are generated correctly in Docker?

Ensure that the file paths in the Docker container are consistent between the build and runtime stages. NYC relies on file paths to map coverage data to source files. By keeping the paths consistent, NYC can accurately calculate coverage.

What is dumb-init, and why should I use it?

dumb-init is a simple process manager designed to run as PID 1 inside containers. It forwards signals to all child processes and reaps zombie processes. This ensures clean shutdowns for Node.js applications in Docker and that all cleanup tasks, including NYC coverage report generation, are completed before the container stops.

How do I prevent NYC from clearing coverage data when I stop a Docker container?

You can set the --clean=false flag in the NYC command to prevent it from clearing coverage data. Alternatively, you can manually copy the coverage files using Docker commands, ensuring that the coverage data is preserved.

What is the correct way to handle signals in Docker when using a shell script?

You should trap signals like SIGINT in your shell script and forward them to child processes. This can be done by using a signal handler that sends the signal to child processes and waits for them to finish before exiting. This ensures proper shutdown and coverage report generation.

How do I ensure that NYC coverage data is saved before a Docker container exits?

Add a small delay in your signal handler before the container shuts down to ensure all NYC coverage data is fully written. This delay gives NYC the time it needs to save data from memory to disk, preventing incomplete coverage reports.

💖 💪 🙅 🚩
keploy
keploy

Posted on August 22, 2024

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related