Best Practices for Graceful Shutdown in Azure Functions
Tatsuro Shibamura
Posted on February 4, 2021
For serverless applications such as Azure Functions, graceful shutdown is an important part of maintaining the integrity of the application.
The reason is that serverless applications are restarted frequently, mainly because of platform updates and deployments of new application versions.
Have you ever considered what happens when a restart occurs while a function is running? Learn the best practices for graceful shutdown to avoid regrets.
Passing CancellationToken
CancellationToken provides a means to cancel general asynchronous processing in .NET. Azure Functions has built-in support for CancellationToken bindings.
A CancellationToken obtained through binding is tied to the application lifecycle, so it can be used to detect shutdowns easily.
The sample code below shows a very simple usage.
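using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public class Function1
{
    [FunctionName("Function1")]
    public async Task Run([TimerTrigger("0 */5 * * * *")] TimerInfo timer, ILogger log, CancellationToken cancellationToken)
    {
        try
        {
            log.LogInformation("Function executing");

            // Simulate a time-consuming process; the token cancels the delay on shutdown
            await Task.Delay(TimeSpan.FromSeconds(30), cancellationToken);

            log.LogInformation("Function executed");
        }
        catch (OperationCanceledException)
        {
            // Shutdown was detected while the function was running
            log.LogWarning("Function cancelled");
        }
    }
}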
The time-consuming process is simulated with Task.Delay. Note that the CancellationToken is passed to Task.Delay.
If you press Ctrl+C to send a shutdown signal while the function is running, the time-consuming process is canceled, an OperationCanceledException is caught, and the warning is logged.
The same thing happens when Azure Functions is restarted or a new version is deployed, so by handling the CancellationToken correctly, you can implement shutdown processing without breaking the integrity of your application.
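Real functions usually do more than a single delay. As a rough sketch of the same idea applied to a loop, the token can be checked between work items so that no item is left half-processed. The BatchWorker class and its ProcessAsync method are hypothetical names used only for illustration, and the short delay stands in for real work.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the Azure Functions SDK.
public static class BatchWorker
{
    // Processes items one at a time and returns the items that were
    // completed before cancellation was requested.
    public static async Task<List<int>> ProcessAsync(IEnumerable<int> items, CancellationToken cancellationToken)
    {
        var completed = new List<int>();
        try
        {
            foreach (var item in items)
            {
                // Check between items so that no item is abandoned midway.
                cancellationToken.ThrowIfCancellationRequested();

                // Stand-in for real work; pass the token to every async call.
                await Task.Delay(TimeSpan.FromMilliseconds(10), cancellationToken);
                completed.Add(item);
            }
        }
        catch (OperationCanceledException)
        {
            // Shutdown detected: stop here and report what was finished.
        }
        return completed;
    }
}
```

Inside a function, the bound CancellationToken would be passed straight through, for example `await BatchWorker.ProcessAsync(ids, cancellationToken)`.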
With retry policy
CosmosDBTrigger and EventHubTrigger are best used in combination with a retry policy, because they advance the checkpoint even if the function fails.
A quick review of the Azure Functions new feature "Retry Policy"
Tatsuro Shibamura ・ Nov 5 '20
With these triggers, the function intentionally fails after detecting a shutdown so that the batch is retried.
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public class Function2
{
    // A max retry count of -1 retries indefinitely, with a 10-second delay between attempts
    [FixedDelayRetry(-1, "00:00:10")]
    [FunctionName("Function2")]
    public async Task Run([CosmosDBTrigger(
        databaseName: "HackAzure",
        collectionName: "TodoItem",
        ConnectionStringSetting = "CosmosConnection",
        LeaseCollectionName = "Lease")]
        IReadOnlyList<Document> input, ILogger log, CancellationToken cancellationToken)
    {
        if (input == null || input.Count <= 0)
        {
            return;
        }

        try
        {
            // Process change feed
            log.LogInformation("Function executing");

            foreach (var document in input)
            {
                log.LogInformation($"Id = {document.Id}");
            }

            // Simulate time-consuming process
            await Task.Delay(TimeSpan.FromSeconds(30), cancellationToken);

            log.LogInformation("Function executed");
        }
        catch (OperationCanceledException)
        {
            // Process for shutdown
            log.LogWarning("Function cancelled");

            // Rethrow exception (Keep current checkpoint)
            throw;
        }
    }
}
This sample code catches the OperationCanceledException and rethrows it after performing the minimum necessary shutdown processing.
Because the host is shutting down, no retry takes place after the exception is rethrown; the rethrow serves to keep the current checkpoint from advancing.
First run (send Ctrl+C)
In the first run, the shutdown signal was sent while the CosmosDBTrigger was processing, so an exception was thrown and the cancellation handling was performed.
Second run (retry policy)
Since the checkpoint is maintained by the retry policy, the second run resumes the Change Feed from the point where it was aborted.
Combining cancellation handling with the retry policy increases tolerance to process interruptions.
The retry policy is a very useful feature that can improve the fault tolerance of Azure Functions, but the functions must be implemented to be idempotent.
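To make the idempotency requirement concrete, here is a minimal sketch that assumes each change can be applied as an upsert keyed by document id. IdempotentTodoStore is a hypothetical in-memory stand-in; a real function would write to a durable store, but the property is the same: delivering the same batch twice leaves the same state as delivering it once.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory store used only to illustrate idempotent writes.
public class IdempotentTodoStore
{
    private readonly Dictionary<string, string> _items = new Dictionary<string, string>();

    public int Count => _items.Count;

    // Upsert keyed by document id: applying the same change again
    // overwrites the entry instead of adding a duplicate.
    public void Apply(string id, string title)
    {
        _items[id] = title;
    }

    // Returns the stored title, or null if the id has not been seen.
    public string Get(string id)
    {
        return _items.TryGetValue(id, out var title) ? title : null;
    }
}
```

By contrast, an operation such as appending to a list or incrementing a counter would be applied a second time when the aborted batch is redelivered, which is exactly what the retry policy makes possible.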
Shutdown process should be completed quickly
Graceful shutdown in Azure Functions is achieved with the right combination of CancellationToken and retry policy.
The last thing to note is the wait time before shutdown. Unfortunately, it is currently fixed at 10 seconds (the default value of shutdownTimeLimit in the ASP.NET Core Module).
If this timeout period is exceeded, the system will be forced to shut down, making a time-consuming shutdown process impossible.
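One way to stay inside that window is to give the shutdown processing its own, shorter time budget. The following is only a sketch: ShutdownCleanup and TryRunAsync are hypothetical names, and the budget you choose (for example, 5 seconds against the 10-second limit) is an assumption to tune for your application. A fresh token source is used because the function's own token is already canceled by the time cleanup starts.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the Azure Functions SDK.
public static class ShutdownCleanup
{
    // Runs the cleanup delegate and gives up once the budget elapses.
    // Returns true if cleanup finished within the budget.
    public static async Task<bool> TryRunAsync(Func<CancellationToken, Task> cleanup, TimeSpan budget)
    {
        // The source cancels its token automatically after the budget.
        using (var budgetSource = new CancellationTokenSource(budget))
        {
            try
            {
                await cleanup(budgetSource.Token);
                return true;
            }
            catch (OperationCanceledException)
            {
                // Cleanup ran out of time and was abandoned.
                return false;
            }
        }
    }
}
```

The budget only has an effect if the cleanup delegate passes the token on to the calls it awaits; a delegate that ignores the token will run until the host forces the shutdown.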
The fixed timeout of 10 seconds is not enough for some applications, so I have proposed changing it in the following GitHub issue.
Make Graceful Shutdown timeout to be configurable or increase default value #7103
What problem would the feature you're requesting solve? Please describe.
To prepare for interruptions caused by deploying functions that have been running for a long time, we use CancellationToken to detect them and perform the necessary processing for graceful shutdown.
However, there seems to be a case where the function is forcibly shut down before it completes successfully.
Checking the log, it seems to depend on the shutdownTimeLimit value of AspNetCoreModule, since it is shut down after about 10 seconds.
Describe the solution you'd like
- Make the value of shutdownTimeLimit configurable
- Or increase the default value from 10 seconds.
At the moment, it is safer to implement the function idempotently and re-execute it with a retry policy.
Enjoy your Azure Serverless life!