Best Practices for Graceful Shutdown in Azure Functions

shibayan (Tatsuro Shibamura)

Posted on February 4, 2021

For serverless applications such as Azure Functions, graceful shutdown is an important process for maintaining the integrity of the application.

The reason is that serverless applications are restarted frequently, mainly due to platform updates and deployments of new versions of the application.

Have you ever thought about what happens when a restart occurs while a function is running? Learn the best practices of graceful shutdown to avoid regrets.

Passing CancellationToken

CancellationToken provides a general means of cancelling asynchronous processing in .NET, and Azure Functions has built-in support for binding a CancellationToken.

The CancellationToken obtained through this binding is tied to the application lifecycle, so it can be used to easily detect shutdowns.

A very simple usage is shown in the following sample code.



using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public class Function1
{
    [FunctionName("Function1")]
    public async Task Run([TimerTrigger("0 */5 * * * *")] TimerInfo timer, ILogger log, CancellationToken cancellationToken)
    {
        try
        {
            log.LogInformation("Function executing");

            // Simulate time-consuming processes
            await Task.Delay(TimeSpan.FromSeconds(30), cancellationToken);

            log.LogInformation("Function executed");
        }
        catch (OperationCanceledException)
        {
            log.LogWarning("Function cancelled");
        }
    }
}



A time-consuming process is simulated with Task.Delay. Note that the CancellationToken is passed to Task.Delay.

If you press Ctrl+C to send a shutdown signal while the function is running, the time-consuming process is cancelled, the OperationCanceledException is caught, and a warning is logged.

(Screenshot: the "Function cancelled" warning logged after sending Ctrl+C)

The same thing happens when Azure Functions is restarted or a new version is deployed, so by handling the CancellationToken correctly you can implement your termination processing without breaking the integrity of your application.
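
The same handling applies when a single invocation processes many items: check the token between items and pass it to every awaited call so that an in-flight operation can also be cancelled. Below is a minimal sketch of that pattern; BatchFunction and ProcessItemAsync are hypothetical names, not part of the samples in this article.



using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public class BatchFunction
{
    [FunctionName("BatchFunction")]
    public async Task Run([TimerTrigger("0 */5 * * * *")] TimerInfo timer, ILogger log, CancellationToken cancellationToken)
    {
        try
        {
            // Hypothetical batch of work items; in a real function this might come from a queue or a database
            var items = Enumerable.Range(1, 100);

            foreach (var item in items)
            {
                // Stop cleanly between items once a shutdown has been requested
                cancellationToken.ThrowIfCancellationRequested();

                // Pass the token down so that an in-flight awaited call can also be cancelled
                await ProcessItemAsync(item, cancellationToken);
            }

            log.LogInformation("Batch completed");
        }
        catch (OperationCanceledException)
        {
            log.LogWarning("Batch cancelled");
        }
    }

    // Hypothetical stand-in for the real per-item work
    private static Task ProcessItemAsync(int item, CancellationToken cancellationToken)
        => Task.Delay(TimeSpan.FromMilliseconds(500), cancellationToken);
}
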

With retry policy

CosmosDBTrigger and EventHubTrigger should be used in combination with a retry policy, because these triggers advance the checkpoint even when the function fails.

With these triggers, once a shutdown is detected, the function intentionally fails so that the processing will be retried.



using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public class Function2
{
    [FixedDelayRetry(-1, "00:00:10")]
    [FunctionName("Function2")]
    public async Task Run([CosmosDBTrigger(
                              databaseName: "HackAzure",
                              collectionName: "TodoItem",
                              ConnectionStringSetting = "CosmosConnection",
                              LeaseCollectionName = "Lease")]
                          IReadOnlyList<Document> input, ILogger log, CancellationToken cancellationToken)
    {
        if (input == null || input.Count <= 0)
        {
            return;
        }

        try
        {
            // Process change feed
            log.LogInformation("Function executing");

            foreach (var document in input)
            {
                log.LogInformation($"Id = {document.Id}");
            }

            // Simulate time-consuming process
            await Task.Delay(TimeSpan.FromSeconds(30), cancellationToken);

            log.LogInformation("Function executed");
        }
        catch (OperationCanceledException)
        {
            // Process for shutdown
            log.LogWarning("Function cancelled");

            // Rethrow exception (Keep current checkpoint)
            throw;
        }
    }
}



This sample code catches the OperationCanceledException and rethrows it after performing the minimum necessary shutdown processing.

Because the host shuts down right after the exception is rethrown, the retry does not run at that point; rethrowing is used to keep the current checkpoint so that the same changes are processed again on the next run.

First run (send Ctrl+C)

In the first run, the shutdown signal was sent while the CosmosDBTrigger function was processing, so the exception was thrown and the cancellation handling was performed.

(Screenshot: log output from the first run)

Second run (retry policy)

Since the checkpoint is maintained by the retry policy, the second run starts from the continuation of the aborted change feed.

(Screenshot: log output from the second run)

Combined with the retry policy, this increases tolerance to interruptions of processing.

The retry policy is a very useful feature that can improve the fault tolerance of Azure Functions, but the functions must be implemented to be idempotent.
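
Because an interrupted batch is delivered again on the next run, one common way to keep the processing idempotent is to make each write an upsert keyed by the document id, so that handling the same change twice produces the same end state. Below is a minimal sketch of that idea; ProcessDocumentsAsync and SaveProjectionAsync are hypothetical helpers, not part of the sample above.



using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;

public static class IdempotentProcessing
{
    public static async Task ProcessDocumentsAsync(IReadOnlyList<Document> input, CancellationToken cancellationToken)
    {
        foreach (var document in input)
        {
            // Bail out between documents once a shutdown has been requested
            cancellationToken.ThrowIfCancellationRequested();

            // The document id is the key, so re-processing the same change produces the same end state
            var projection = new { id = document.Id, processedAt = DateTimeOffset.UtcNow };

            await SaveProjectionAsync(projection, cancellationToken);
        }
    }

    // Hypothetical stand-in for the real persistence call (e.g. an upsert into another container)
    private static Task SaveProjectionAsync(object projection, CancellationToken cancellationToken)
        => Task.CompletedTask;
}
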

Shutdown process should be completed quickly

Graceful shutdown in Azure Functions is achieved with the right combination of a CancellationToken and a retry policy.

The last thing to note is the wait time before shutdown. Unfortunately, it is currently fixed at 10 seconds (the default value of shutdownTimeLimit in the ASP.NET Core Module).

If this timeout is exceeded, the host is forcibly shut down, making a time-consuming shutdown process impossible.
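
In practice this means any cleanup performed in the cancellation path should itself be bounded so that the whole shutdown fits inside that window. Below is a minimal sketch of one way to do that; Function3 and FlushPendingWorkAsync are hypothetical names, not part of the samples above.



using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public class Function3
{
    [FunctionName("Function3")]
    public async Task Run([TimerTrigger("0 */5 * * * *")] TimerInfo timer, ILogger log, CancellationToken cancellationToken)
    {
        try
        {
            // Simulate a time-consuming process
            await Task.Delay(TimeSpan.FromSeconds(30), cancellationToken);
        }
        catch (OperationCanceledException)
        {
            // Give the cleanup its own, smaller budget so the shutdown path stays
            // well under the host's roughly 10 second limit.
            using var cleanupCts = new CancellationTokenSource(TimeSpan.FromSeconds(5));

            try
            {
                await FlushPendingWorkAsync(cleanupCts.Token);
            }
            catch (OperationCanceledException)
            {
                log.LogWarning("Cleanup did not finish within its time budget");
            }
        }
    }

    // Hypothetical stand-in for real cleanup work (flushing buffers, completing messages, etc.)
    private static Task FlushPendingWorkAsync(CancellationToken cancellationToken)
        => Task.Delay(TimeSpan.FromSeconds(1), cancellationToken);
}
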

The fixed timeout of 10 seconds is not enough for some applications, so I have proposed changing it in the following GitHub issue.

Make Graceful Shutdown timeout to be configurable or increase default value #7103

What problem would the feature you're requesting solve? Please describe.

To prepare for interruptions caused by deploying functions that have been running for a long time, we use CancellationToken to detect them and perform the necessary processing for graceful shutdown.

However, there seems to be a case where the function is forcibly shut down before it completes successfully.

Checking the log, it seems to depend on the shutdownTimeLimit value of AspNetCoreModule, since it is shut down after about 10 seconds.

https://docs.microsoft.com/en-us/aspnet/core/host-and-deploy/iis/web-config?view=aspnetcore-5.0#attributes-of-the-aspnetcore-element

Describe the solution you'd like

  1. Make the value of shutdownTimeLimit configurable
  2. Or increase the default value from 10 seconds.

At the moment, it is safer to implement the function to be idempotent and re-execute it with a retry policy.

Enjoy your Azure Serverless life!
