AWS Step Functions In-Depth | Serverless
awedis
Posted on July 4, 2022
In this article we will learn about AWS Step Functions and its main components, and then build some examples using the Serverless Framework
The main parts of this article:
- About Finite-State Machine
- Step Functions (Main components)
- Examples
1. What is a Finite-State Machine
A finite-state machine (FSM) is a model that is in exactly one of a finite number of states at any given time. The two keywords that you need to remember are States and Transitions. The FSM can change from one state to another in response to some input; that change from one state to another is called a transition
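To make states and transitions concrete, here is a minimal sketch of an FSM in plain JavaScript. The state names (idle, running, paused, done) and the inputs are made up for illustration; they are not part of Step Functions.

```javascript
// Transition table: for each state, which input leads to which next state.
// The states and inputs here are hypothetical, chosen only for illustration.
const transitions = {
  idle: { start: 'running' },
  running: { pause: 'paused', finish: 'done' },
  paused: { resume: 'running' },
  done: {},
};

// Returns the next state, or throws if the input is not valid in the current state.
function transition(state, input) {
  const next = transitions[state]?.[input];
  if (next === undefined) {
    throw new Error(`No transition from "${state}" on "${input}"`);
  }
  return next;
}

// Feeds a sequence of inputs through the machine, starting from a given state.
function run(initialState, inputs) {
  return inputs.reduce((state, input) => transition(state, input), initialState);
}

console.log(run('idle', ['start', 'pause', 'resume', 'finish'])); // done
```

An input that has no entry for the current state (for example start while in done) is rejected, which is the same idea as a state machine definition only allowing the transitions you declared.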
2. Step Functions
Step Functions is an AWS managed service that uses the Finite-State Machine (FSM) model
Step Functions is an orchestrator that helps you design and implement complex workflows. When a workflow consists of multiple tasks that need to be orchestrated, Step Functions coordinates those tasks. This simplifies the overall architecture and gives us much better control over each step of the workflow
Step Functions is built on two main concepts: Tasks and State Machines
All work in the state machine is done by tasks. A task performs work by using an activity or an AWS Lambda function, or by passing parameters to the API actions of other services
State Types: it's essential to remember that States aren't the same thing as Tasks, since Task is just one of the State types. There are several State types, and each of them has a role to play in the overall workflow
The Type of a state must be one of these values:
- Task - Represents a single unit of work performed by a state machine
- Wait - Delays the state machine from continuing for a specified time
- Pass - Passes its input to its output without performing work. Pass states are useful when constructing and debugging state machines
- Succeed - Stops an execution successfully
- Fail - Stops the execution of the state machine and marks it as a failure
- Choice - Adds branching logic to a state machine
- Parallel - Can be used to create parallel branches of execution in your state machine
- Map - Can be used to run a set of steps for each element of an input array. While the Parallel state executes multiple branches of steps using the same input, a Map state will execute the same steps for multiple entries of an array in the state input
3. Examples
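In this part we are going to build four state machines
Note: the definition language (Amazon States Language) is JSON-based, but in a Serverless Framework project the definition can be written in either JSON or YAML. The examples here use YAML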
I- Example
The first example is a simple one with two orchestrated Lambda functions. The first adds 10 to our input and passes the result to the second Lambda, which adds 20, so the final result is the input plus 30
Step Functions Definition:
firstLambdaARN: &FIRST_ARN arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-firstState
secondLambdaARN: &SECOND_ARN arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-secondState
states:
  ExampleOneSF:
    name: ExampleOneSF
    definition:
      Comment: "Example One"
      StartAt: firstState
      States:
        firstState:
          Type: Task
          Resource: *FIRST_ARN
          Next: secondState
        secondState:
          Type: Task
          Resource: *SECOND_ARN
          Next: success
        success:
          Type: Succeed
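Note: here I'm using YAML anchors (&FIRST_ARN, &SECOND_ARN) and aliases (*FIRST_ARN, *SECOND_ARN), so each ARN is defined once and reused inside the definition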
The Two Lambda Functions (firstState & secondState):
module.exports.firstState = async (event) => {
  console.log(event);
  const {
    value,
  } = event;
  const result = 10 + value;
  return {
    value: result
  };
};

module.exports.secondState = async (event) => {
  console.log(event);
  const {
    value,
  } = event;
  const result = 20 + value;
  return {
    value: result,
    status: 'SUCCESS'
  };
};
Inside routes:
firstState:
  handler: src/modules/StepFunction/controller/exampleOne.firstState
  timeout: 300
secondState:
  handler: src/modules/StepFunction/controller/exampleOne.secondState
  timeout: 300
The input:
{
  value: 15
}
The final result is 45: my input was 15, the first state added 10 and the second one added 20
As we can see, it's very easy to pass data between states, which makes Step Functions a very useful service for building decoupled architectures
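The data flow above can be tried locally without deploying anything. The following is a minimal sketch, assuming simplified copies of the two handlers (logging removed) and a hypothetical helper named runSequence that mimics how the output of one Task state becomes the input of the next. It is not how Step Functions is implemented, only an illustration of the hand-off.

```javascript
// Simplified copies of the two handlers from this example (logging removed).
const firstState = async ({ value }) => ({ value: 10 + value });
const secondState = async ({ value }) => ({ value: 20 + value, status: 'SUCCESS' });

// Hypothetical helper: runs the handlers in order, feeding each handler's
// output to the next one as its input, like chained Task states.
async function runSequence(handlers, input) {
  let state = input;
  for (const handler of handlers) {
    state = await handler(state);
  }
  return state;
}

runSequence([firstState, secondState], { value: 15 })
  .then((output) => console.log(output)); // { value: 45, status: 'SUCCESS' }
```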
II- Example
In example 2 we are going to imitate a form upload and introduce the Choice type. First, "validateForm" validates the form: the workflow either fails or continues and passes the data to the next Lambda, "processForm", which does some processing. Then "uploadForm" finally writes our data, for example to a database
Step Functions Definition:
ExampleTwoSF:
  name: ExampleTwoSF
  definition:
    Comment: "Example Two"
    StartAt: validateForm
    States:
      validateForm:
        Type: Task
        Resource: arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-validateForm
        Next: isFormValidated
      isFormValidated:
        Type: Choice
        Choices:
          - Variable: "$.status"
            StringEquals: SUCCESS
            Next: processForm
          - Variable: "$.status"
            StringEquals: ERROR
            Next: fail
      processForm:
        Type: Task
        Resource: arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-processForm
        Next: uploadForm
      uploadForm:
        Type: Task
        Resource: arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-uploadForm
        Next: isFormUploaded
      isFormUploaded:
        Type: Choice
        Choices:
          - Variable: "$.status"
            StringEquals: SUCCESS
            Next: success
          - Variable: "$.status"
            StringEquals: ERROR
            Next: fail
      success:
        Type: Succeed
      fail:
        Type: Fail
Lambda Functions:
module.exports.validateForm = async (event) => {
  console.log(event);
  // validate form
  // if not valid
  // return {
  //   status: 'ERROR'
  // };
  return {
    status: 'SUCCESS'
  };
};

module.exports.processForm = async (event) => {
  console.log(event);
  // add simple process
  return {
    processData: 1000,
  };
};

module.exports.uploadForm = async (event) => {
  console.log(event);
  // upload data for example to DynamoDB
  return {
    status: 'SUCCESS'
  };
};
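To see how the Choice states pick the next state, here is a rough local sketch of the rule matching. The helpers resolvePath and nextState are hypothetical names of mine, and they only support the simple "$.field" paths and StringEquals comparisons used in this example. One thing worth noticing: the Choice states in the definition have no Default, so an input whose status is neither SUCCESS nor ERROR would fail the execution with a States.NoChoiceMatched error.

```javascript
// Choice rules copied from the isFormValidated state in the definition.
const choices = [
  { Variable: '$.status', StringEquals: 'SUCCESS', Next: 'processForm' },
  { Variable: '$.status', StringEquals: 'ERROR', Next: 'fail' },
];

// Hypothetical helper: resolves a simple "$.a.b" path against the state input.
// It does not support wildcards, filters or array indexes.
function resolvePath(input, path) {
  return path
    .replace(/^\$\.?/, '')
    .split('.')
    .filter(Boolean)
    .reduce((value, key) => (value == null ? undefined : value[key]), input);
}

// Hypothetical helper: returns the Next of the first matching rule, then the
// default state if one is given, otherwise throws like an unmatched Choice.
function nextState(input, rules, defaultState) {
  const match = rules.find(
    (rule) => resolvePath(input, rule.Variable) === rule.StringEquals
  );
  if (match) return match.Next;
  if (defaultState) return defaultState;
  throw new Error('States.NoChoiceMatched');
}

console.log(nextState({ status: 'SUCCESS' }, choices)); // processForm
console.log(nextState({ status: 'ERROR' }, choices)); // fail
```

Passing a third argument, for example nextState(input, choices, 'fail'), plays the role of the Default field of a Choice state.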
III- Example
In this example we are going to use the Wait, Pass and Parallel types. After the user uploads their profile, we wait 5 seconds and then notify the user by running two Lambda functions in parallel, one to send an email and one to send an SMS. In addition, there is a Pass state that just adds some data that I defined (admin details...) and passes it on to the Parallel state
Note: I tried to build simple examples that relate to real-world features; yours may vary based on your needs
Step Functions Definition:
ExampleThreeSF:
  name: ExampleThreeSF
  definition:
    Comment: "Example Three"
    StartAt: uploadProfile
    States:
      uploadProfile:
        Type: Task
        Resource: arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-uploadProfile
        Next: waitFiveSeconds
      waitFiveSeconds:
        Type: Wait
        Seconds: 5
        Next: addAdminPayload
      addAdminPayload:
        Type: Pass
        Result:
          admin_name: "Admin Name"
          admin_phone: "Admin Number"
          admin_email: "Admin Email"
        ResultPath: "$.adminDetails"
        Next: notifyCustomer
      notifyCustomer:
        Type: Parallel
        End: true
        Branches:
          - StartAt: sendEmail
            States:
              sendEmail:
                Type: Task
                Resource: arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-sendEmail
                End: true
          - StartAt: sendSMS
            States:
              sendSMS:
                Type: Task
                Resource: arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-sendSMS
                End: true
Note: Parallel states are used when different types of tasks or workflows need to run concurrently
Lambda Functions:
module.exports.uploadProfile = async (event) => {
  console.log(event);
  const {
    data,
  } = event;
  return {
    status: "Profile uploaded",
    data,
  };
};

module.exports.sendEmail = async (event) => {
  console.log(event);
  // Send Email
  return {
    status: `Email sent to ${event.data.email}`,
  };
};

module.exports.sendSMS = async (event) => {
  console.log(event);
  // Send SMS
  return {
    status: `SMS Sent to ${event.data.phone}`,
  };
};
As we can see, the two Lambda tasks receive the same input, but each one works with specific attributes from that data: sendEmail needs the email value, whereas sendSMS needs the phone number
Input:
{
  data: {
    name: 'Test User',
    email: 'test@test.com',
    phone: '0123456789'
  }
}
If any branch fails, the entire Parallel state is considered to have failed. If the error is not handled by the Parallel state itself, Step Functions stops the execution with an error
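The fan-out of the Parallel state can be approximated locally with Promise.all. This is only a sketch under assumptions: the handlers are simplified copies of sendEmail and sendSMS, runParallel is a hypothetical helper, and the adminDetails value is shortened. Every branch gets the same input, and the output is an array with one entry per branch, in branch order.

```javascript
// Simplified copies of the two branch handlers (logging removed).
const sendEmail = async (event) => ({ status: `Email sent to ${event.data.email}` });
const sendSMS = async (event) => ({ status: `SMS Sent to ${event.data.phone}` });

// Hypothetical helper: every branch receives the same input and the results
// come back in branch order. Promise.all rejects as soon as one branch
// rejects, which is similar to a failing branch failing the Parallel state.
const runParallel = (branches, input) =>
  Promise.all(branches.map((branch) => branch(input)));

// Input as the Parallel state would see it: the profile data plus the
// adminDetails added by the Pass state (shortened here).
const input = {
  data: { name: 'Test User', email: 'test@test.com', phone: '0123456789' },
  adminDetails: { admin_name: 'Admin Name' },
};

runParallel([sendEmail, sendSMS], input).then((output) => console.log(output));
```

The analogy is not exact: Promise.all does not stop the other promises when one rejects, whereas Step Functions stops the remaining branches when one of them fails.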
IV- Example
In this example we are going to use the Map type, which runs a set of steps for each element of an input array. The Map state lets us run multiple sequential workflows in parallel
I'm going to run 4 concurrent workflows that all apply the same business logic. My first state prefixes the orderID with "ID-", giving ID-${orderID},
and my second state returns the message: this is order number ${orderID_manipulated}, which has ${quantity} orders
Step Functions Definition:
ExampleFourSF:
  name: ExampleFourSF
  definition:
    Comment: "Example Four"
    StartAt: uploadData
    States:
      uploadData:
        Type: Map
        InputPath: "$.detail"
        ItemsPath: "$.data"
        MaxConcurrency: 4
        Iterator:
          StartAt: manipulateObject
          States:
            manipulateObject:
              Type: Task
              Resource: arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-manipulateObject
              Next: createFinalObject
            createFinalObject:
              Type: Task
              Resource: arn:aws:lambda:${env:region}:${env:accountId}:function:${self:service}-${env:stage}-createFinalObject
              End: true
        ResultPath: "$.detail.data"
        End: true
Note:
InputPath: to select a subset of the input
ItemsPath: to specify a location in the input to find the JSON array to use for iterations
MaxConcurrency: how many invocations of the Iterator may run in parallel
Lambda Functions:
module.exports.manipulateObject = async (event) => {
  console.log(event);
  const {
    orderID,
    quantity,
  } = event;
  return {
    orderID_manipulated: `ID-${orderID}`,
    quantity,
  };
};

module.exports.createFinalObject = async (event) => {
  console.log(event);
  const {
    orderID_manipulated,
    quantity,
  } = event;
  const status = `this is order number ${orderID_manipulated}, which has ${quantity} orders`;
  return {
    status,
  };
};
The input:
{
  detail: {
    title: "My fourth example",
    data: [
      { orderID: 1, quantity: 10 },
      { orderID: 2, quantity: 24 },
      { orderID: 3, quantity: 32 },
      { orderID: 4, quantity: 5 },
    ]
  }
}
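To check what this input turns into, the Map state can be approximated locally. The sketch below assumes simplified copies of the two handlers and a hypothetical helper named runMap. It also reflects my understanding that ResultPath is applied to the original state input, so the array of results replaces detail.data while title is kept. MaxConcurrency is ignored here; all items simply run at once.

```javascript
// Simplified copies of the two iterator handlers (logging removed).
const manipulateObject = async ({ orderID, quantity }) => ({
  orderID_manipulated: `ID-${orderID}`,
  quantity,
});
const createFinalObject = async ({ orderID_manipulated, quantity }) => ({
  status: `this is order number ${orderID_manipulated}, which has ${quantity} orders`,
});

// Hypothetical helper that approximates the uploadData Map state.
async function runMap(input) {
  const effectiveInput = input.detail; // InputPath: "$.detail"
  const items = effectiveInput.data; // ItemsPath: "$.data"
  // Each item goes through both steps in sequence; items run concurrently.
  const results = await Promise.all(
    items.map(async (item) => createFinalObject(await manipulateObject(item)))
  );
  // ResultPath: "$.detail.data" writes the results back into the input.
  return { ...input, detail: { ...input.detail, data: results } };
}

const mapInput = {
  detail: {
    title: 'My fourth example',
    data: [
      { orderID: 1, quantity: 10 },
      { orderID: 2, quantity: 24 },
    ],
  },
};

runMap(mapInput).then((output) => console.log(JSON.stringify(output, null, 2)));
```

With this input, the first entry of detail.data becomes an object whose status is "this is order number ID-1, which has 10 orders".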
The code I use to trigger the state machine:
const { StepFunctions } = require('aws-sdk');

const stepFunctions = new StepFunctions();

module.exports.get = async (event) => {
  try {
    // Await the call so the log shows the actual response and a rejected
    // request is handled by the catch block below
    const stepFunctionResult = await stepFunctions.startExecution({
      stateMachineArn: process.env.EXAMPLE_FOUR_STEP_FUNCTION_ARN,
      input: JSON.stringify({
        detail: {
          title: "My fourth example",
          data: [
            { orderID: 1, quantity: 10 },
            { orderID: 2, quantity: 24 },
            { orderID: 3, quantity: 32 },
            { orderID: 4, quantity: 5 },
          ]
        }
      }),
    }).promise();
    console.log('stepFunctionResult =>', stepFunctionResult);
    return {
      statusCode: 200,
      body: JSON.stringify({
        message: `This is test API`,
      }, null, 2),
    };
  } catch (error) {
    console.log(error);
    // Return an explicit error response instead of undefined
    return {
      statusCode: 500,
      body: JSON.stringify({
        message: 'Failed to start the execution',
      }, null, 2),
    };
  }
};
Finally, always make sure your state machines handle errors. Any state can encounter runtime errors, and they can happen for various reasons. By default, when a state reports an error, AWS Step Functions causes the execution to fail entirely. For more about error handling, see the error handling section of the AWS Step Functions Developer Guide
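Task, Parallel and Map states accept Retry and Catch fields to change that default behaviour. As a small illustration of how a Retry entry spaces out its attempts, here is a sketch that computes the waiting times from IntervalSeconds, MaxAttempts and BackoffRate. The retrier values are hypothetical and retryDelays is a helper name of mine, not part of any SDK.

```javascript
// Hypothetical retrier, shaped like one entry of a state's Retry array.
const retrier = {
  ErrorEquals: ['States.TaskFailed'],
  IntervalSeconds: 2,
  MaxAttempts: 3,
  BackoffRate: 2,
};

// Seconds waited before each retry attempt: the first wait is IntervalSeconds
// and every later wait is multiplied by BackoffRate. The default values below
// are the ones Step Functions uses when a field is omitted.
function retryDelays({ IntervalSeconds = 1, MaxAttempts = 3, BackoffRate = 2 }) {
  return Array.from(
    { length: MaxAttempts },
    (_, attempt) => IntervalSeconds * BackoffRate ** attempt
  );
}

console.log(retryDelays(retrier)); // [ 2, 4, 8 ]
```

If all retries are used up and the error is still not caught by a Catch entry, the execution fails as described above.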
Conclusion
Step Functions is a very useful service: it helps you build complex features, decouple your code, and create orchestrated services
With the examples above I tried to showcase how some real-world features can be built. You can end up building thousands of different workflows based on your requirements
For more articles like this, and to keep up with me, you can always follow me on LinkedIn