How to integrate AWS Step Functions with ECS
Irene Aguilar
Posted on June 1, 2023
Introduction
In this article we explain how to integrate AWS Step Functions with Amazon ECS and how, by adding other services such as AWS Lambda functions, you can build a completely serverless solution for orchestrating data and services running in containers.
What is AWS Step Functions?
AWS Step Functions is a serverless orchestration service that allows you to define a series of event-driven steps to create a workflow. You can manage AWS Lambda functions and other AWS services to create a distributed application as if it were a state machine.
This time we are going to look at combining Lambda functions with Amazon Elastic Container Service (Amazon ECS) in its serverless mode, Fargate. The workflow we define takes the input data, decides which task to run and which command to pass to the container, waits for the container to finish and saves the execution logs in an S3 folder.
Getting started: creating step functions
The first thing we have to do is create our state machine. We will do it through the console to make it more visual. Speaking of visualising, AWS offers Workflow Studio, a drag-and-drop designer for building a workflow in a completely visual way, but that is the subject of another post.
We select the option "Write your workflow in code", and the first decision we have to make is the type of state machine we want: Standard or Express.
Comparing the characteristics of the two types, we chose Standard: our containers can run for more than 5 minutes, which is the maximum duration of an Express workflow. Express workflows also do not support the .sync (run a job) integration pattern that we use later to wait for the ECS task.
To define the workflow we have to use the Amazon States Language. To test it and better understand how data is passed between the different steps, we can use the Data flow simulator that AWS provides in the Step Functions console:
Our example would look like this:
What is AWS ECS?
Amazon Elastic Container Service (Amazon ECS) is a fully managed container orchestration service. With AWS Fargate, ECS can run containers without you having to manage servers, which reduces the time spent on configuration, security and patching. It integrates easily with the rest of the AWS platform services to build secure, easy-to-use solutions.
ECS: cluster and task definitions
The ECS cluster and the task definition must already exist before the integration is set up.
The cluster is created at region level and is the logical group in which tasks run (in EC2 mode it also groups the container instances).
Task definitions specify the application container information. A task definition can have one or more containers (for example, you can add the X-Ray daemon for traceability), and you can select whether to run it in Fargate mode (AWS-managed infrastructure) or in EC2 mode. ECS can also be used with on-premises infrastructure.
Integration with ECS
With our ECS resources already created, we turn to the integration itself, which looks like this within the definition of our state machine:
"image_1": {
"Next": "task_finished_choice_step",
"Catch": [
{
"ErrorEquals": [
"States.ALL"
],
"ResultPath": "$.error_result",
"Next": "handle_error_step"
}
],
"Type": "Task",
"Comment": "Runs an ECS task using the image_1 image",
"InputPath": "$.lambda_result.next_stage",
"ResultPath": "$.image_1_result",
"Resource": "arn:aws:states:::ecs:runTask.sync",
"Parameters": {
"Cluster": "arn:aws:ecs:{region}:{account_id}:cluster/ifgeek-ecs-cluster",
"TaskDefinition": "image_1-task-name",
"NetworkConfiguration": {
"AwsvpcConfiguration": {
"Subnets": [
"{subnet_1}",
"{subnet_2}",
"{subnet_3}"
],
"SecurityGroups": [
"{sg-1}"
]
}
},
"Overrides": {
"ContainerOverrides": [
{
"Name": "image_1-container",
"Command.$": "$.command",
"Environment": [{
"Name": "image_2_USER",
"Value.$": "$.user"
}]
}
]
},
"LaunchType": "FARGATE",
"PlatformVersion": "LATEST"
}
}
The most important fields in this integration are the following:
Type: the state is of type "Task", which represents a single unit of work.
InputPath selects the part of the state input that is passed on to the task. The input comes from the previous Lambda (plan_next_step), which decides which ECS task has to be executed, what its command will be and which environment variables are needed. Note that the InputPath value ("InputPath": "$.lambda_result.next_stage") is a JSONPath expression, so only the next_stage object of the following input reaches the ECS task:
{
"name": "image_1",
"input": {
"input_file": "input/uploads/example.jpg",
"lambda_result": {
"total_stages_count": 1,
"next_stage": {
"image": "image_1",
"command": ["echo", "hello", "world"],
"user": "ifgeek"
},
"processed_stages_count": 1,
"config": []
}
},
"inputDetails": {
"truncated": false
}
}
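To make the data flow concrete, here is a minimal Python sketch of what InputPath and the "Command.$" parameter do with the input above. The select helper is hypothetical and only understands simple dotted paths such as $.a.b; Step Functions itself supports a much richer JSONPath syntax, so treat this as an illustration rather than a reimplementation.

```python
def select(path, data):
    """Resolve a simple dotted JSONPath such as "$.a.b" against a dict.

    Hypothetical helper for illustration only: no filters, wildcards
    or array indexes are supported.
    """
    if path == "$":
        return data
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path}")
    node = data
    for key in path[2:].split("."):
        node = node[key]
    return node


# State input, taken from the "input" field of the event shown above.
state_input = {
    "input_file": "input/uploads/example.jpg",
    "lambda_result": {
        "total_stages_count": 1,
        "next_stage": {
            "image": "image_1",
            "command": ["echo", "hello", "world"],
            "user": "ifgeek",
        },
        "processed_stages_count": 1,
        "config": [],
    },
}

# InputPath narrows the state input down to the next_stage object...
effective_input = select("$.lambda_result.next_stage", state_input)
# ...and "Command.$": "$.command" is then resolved against that narrowed input.
command = select("$.command", effective_input)

print(effective_input["image"], command)
```

Because the paths in Parameters are resolved after InputPath has been applied, "$.command" refers to next_stage.command and not to a top-level field.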
"Resource": "arn:aws:states:::ecs:runTask.sync" indicates that the integration is with ECS: when the workflow reaches this step it runs the task (RunTask) and waits for it to finish. There is another variant of the resource, "arn:aws:states:::ecs:runTask.waitForTaskToken", which starts the ECS task and then pauses the workflow until the task token is sent back to Step Functions with SendTaskSuccess or SendTaskFailure.
In Parameters we define the ECS cluster and the TaskDefinition we want to run, together with the network configuration. In production environments it is advisable to configure several subnets in different Availability Zones so that the task can run multi-AZ.
However, the most interesting field in terms of configuration is "Overrides", and more specifically "ContainerOverrides", which overrides the command that the container was given in the TaskDefinition. It can also be used to change the values of environment variables, which offers a quick and flexible way to change the configuration on each run.
"LaunchType" can be "FARGATE" or "EC2". For our solution we don't need a container running continuously, so we opted for the serverless option, Fargate.
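With the waitForTaskToken variant, something has to hand the token back to Step Functions. A minimal sketch of how the code inside the container might do it, assuming boto3 is available in the image; the function name and the "ContainerFailed" error name are hypothetical. The client is passed in as a parameter so the function can be exercised without AWS access.

```python
import json


def report_result(sfn_client, task_token, succeeded, payload):
    """Report the outcome of a callback-style task back to Step Functions.

    sfn_client is expected to be boto3.client("stepfunctions").
    """
    if succeeded:
        return sfn_client.send_task_success(
            taskToken=task_token,
            output=json.dumps(payload),
        )
    return sfn_client.send_task_failure(
        taskToken=task_token,
        error="ContainerFailed",  # hypothetical error name
        cause=json.dumps(payload),
    )
```

The token itself is available in the state machine through the context object, so one option is to pass it to the container as an environment variable in ContainerOverrides, for example {"Name": "TASK_TOKEN", "Value.$": "$$.Task.Token"} (the variable name is an assumption).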
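Since only a handful of values change from one image to another, the Task state can also be generated. The sketch below assumes that every image follows the naming seen above (<image>-task-name and <image>-container); the function name and the PIPELINE_USER variable name are hypothetical.

```python
import json


def ecs_task_state(image, cluster_arn, subnets, security_groups, next_state):
    """Build a Task state that runs `image` on Fargate and waits for it to finish."""
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::ecs:runTask.sync",
        "InputPath": "$.lambda_result.next_stage",
        "ResultPath": f"$.{image}_result",
        "Parameters": {
            "Cluster": cluster_arn,
            "TaskDefinition": f"{image}-task-name",
            "LaunchType": "FARGATE",
            "PlatformVersion": "LATEST",
            "NetworkConfiguration": {
                "AwsvpcConfiguration": {
                    "Subnets": subnets,
                    "SecurityGroups": security_groups,
                }
            },
            "Overrides": {
                "ContainerOverrides": [
                    {
                        "Name": f"{image}-container",
                        # The ".$" suffix tells Step Functions to resolve the
                        # value as a JSONPath instead of using it literally.
                        "Command.$": "$.command",
                        "Environment": [
                            # PIPELINE_USER is a hypothetical variable name.
                            {"Name": "PIPELINE_USER", "Value.$": "$.user"}
                        ],
                    }
                ]
            },
        },
        "Catch": [
            {
                "ErrorEquals": ["States.ALL"],
                "ResultPath": "$.error_result",
                "Next": "handle_error_step",
            }
        ],
        "Next": next_state,
    }


state = ecs_task_state(
    "image_1",
    "arn:aws:ecs:{region}:{account_id}:cluster/ifgeek-ecs-cluster",
    ["{subnet_1}", "{subnet_2}"],
    ["{sg-1}"],
    "task_finished_choice_step",
)
print(json.dumps(state, indent=2))
```

Note that any parameter whose value is a path needs the ".$" suffix on its key; without it the container would receive the literal string "$.user".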
In our case we set up an API with Amazon API Gateway to be able to invoke our state machine through an API, but you can also start an execution from the Step Functions console itself.
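An execution can be started from code as well. A minimal sketch, assuming boto3 and an input that only carries the input_file field seen in the example event; the function name and the "run-" name prefix are hypothetical.

```python
import json
import uuid


def start_pipeline(sfn_client, state_machine_arn, input_file):
    """Start one execution of the state machine for the given input file.

    sfn_client is expected to be boto3.client("stepfunctions").
    """
    response = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        # A random suffix keeps the execution name unique.
        name=f"run-{uuid.uuid4()}",
        input=json.dumps({"input_file": input_file}),
    )
    return response["executionArn"]
```

The returned execution ARN is what you would later pass to calls such as describe_execution to follow the run.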
During execution, you can check which step of the state machine you are in, the input and output of the previous steps and whether it finished successfully or failed.
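The same information is available outside the console through the execution history. As a rough sketch, the helper below (a hypothetical name) lists the states entered so far, given the "events" list of a boto3 get_execution_history response.

```python
def entered_states(events):
    """Return the names of the states entered, in order, from history events.

    `events` is the "events" list of a get_execution_history response.
    Only "...StateEntered" events carry stateEnteredEventDetails.
    """
    names = []
    for event in events:
        details = event.get("stateEnteredEventDetails")
        if details is not None:
            names.append(details["name"])
    return names
```

It could be fed with sfn_client.get_execution_history(executionArn=arn)["events"]; long executions return the history in pages, so a real implementation would also need to follow nextToken.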
Examples of successful and failed execution:
Besides the visual graph, a table is displayed with every event, its status and the elapsed time. For all integrations, links to the underlying services are shown to make traceability easier. Step Functions also takes care of checking whether the container has finished correctly (that is, whether it returned an exit code other than 0), which makes it easier to manage and control errors.
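When the task fails, the Catch block stores the error under $.error_result, and a handler such as handle_error_step can inspect it. The sketch below assumes that the Cause field holds the ECS task description as a JSON string with a Containers list (Name and ExitCode per container); the function name is hypothetical, and the code falls back to an empty result if Cause has another shape.

```python
import json


def failed_containers(error_result):
    """Return {container name: exit code} for containers that exited non-zero.

    Assumes error_result looks like {"Error": ..., "Cause": ...}, with Cause
    being a JSON string. Returns an empty dict if Cause cannot be parsed.
    """
    try:
        task = json.loads(error_result.get("Cause", ""))
    except (json.JSONDecodeError, TypeError):
        return {}
    if not isinstance(task, dict):
        return {}
    return {
        container.get("Name"): container.get("ExitCode")
        for container in task.get("Containers", [])
        if container.get("ExitCode") not in (0, None)
    }
```

With this, the error handler could, for example, include the failing container and its exit code in a notification.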
Detail of the command we are executing:
Execution completed with exit code 1:
Conclusion
As we have seen, integrating services with AWS Step Functions, and more specifically ECS on Fargate, is quick and easy to get started with. This basic design can be enriched with more services, such as sending notifications with Amazon SNS. You can also use the Step Functions and ECS APIs to develop endpoints that track the steps of the state machine. From there it is a matter of adapting it to each use case; imagination is the limit!