SQS depth based ECS task auto-scaling using step scaling.
Muhammad Ahmad Khan
Posted on November 20, 2022
As mentioned in a previous blog post here, we can scale ECS tasks on SQS queue depth using step scaling.
Step scaling policies increase or decrease the current capacity of a scalable target based on a set of scaling adjustments, known as step adjustments. The adjustments vary based on the size of the alarm breach. All alarms that are breached are evaluated by Application Auto Scaling as it receives the alarm messages.
With step scaling, you choose the scaling metrics and threshold values for the CloudWatch alarms that trigger the scaling process, so you have to create those CloudWatch alarms yourself.
When you create a step scaling policy, you add one or more step adjustments that enable you to scale based on the size of the alarm breach. Each step adjustment specifies the following:
- A lower bound for the metric value
- An upper bound for the metric value
- The amount by which to scale, based on the scaling adjustment type
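The bounds are offsets from the alarm threshold, not absolute metric values. The sketch below illustrates how a step could be selected for a given metric value. It is only an illustration, not how Application Auto Scaling is implemented: `pick_step` is a hypothetical helper, the `lower:upper:adjustment` step encoding is made up for this example, and the adjustment values 1, 2 and 3 are chosen just to make the selected step visible. It assumes the scale-out convention where the lower bound is inclusive and the upper bound is exclusive.

```shell
# Hypothetical helper: print the ScalingAdjustment of the step whose interval
# contains (metric - threshold). Each step is passed as "lower:upper:adjustment";
# an empty bound means unbounded (like AWS::NoValue in a template).
pick_step() {
  metric=$1; threshold=$2; shift 2
  offset=$((metric - threshold))
  for step in "$@"; do
    lower=${step%%:*}
    rest=${step#*:}
    upper=${rest%%:*}
    adjustment=${rest#*:}
    # Scale-out convention: lower bound inclusive, upper bound exclusive.
    if [ -n "$lower" ] && [ "$offset" -lt "$lower" ]; then continue; fi
    if [ -n "$upper" ] && [ "$offset" -ge "$upper" ]; then continue; fi
    echo "$adjustment"
    return 0
  done
  return 1  # no step matched (metric is below the threshold)
}

# Threshold 10, queue depth 32: the offset is 22, which falls in the 15..25 step.
pick_step 32 10 "0:15:1" "15:25:2" "25::3"
```

With a threshold of 10, a queue depth of 32 selects the second step, and a queue depth of 5 selects none because the alarm is not breached.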
Application Auto Scaling supports the following adjustment types for step scaling policies:
- ChangeInCapacity: Increase or decrease the current capacity of the scalable target by the specified value. A positive value increases the capacity and a negative value decreases the capacity. For example: If the current capacity is 3 and the adjustment is 5, then Application Auto Scaling adds 5 to the capacity for a total of 8.
- ExactCapacity: Change the current capacity of the scalable target to the specified value. Specify a positive value with this adjustment type. For example: If the current capacity is 3 and the adjustment is 5, then Application Auto Scaling changes the capacity to 5.
- PercentChangeInCapacity: Increase or decrease the current capacity of the scalable target by the specified percentage. A positive value increases the capacity and a negative value decreases the capacity. For example: If the current capacity is 10 and the adjustment is 10 percent, then Application Auto Scaling adds 1 to the capacity for a total of 11.
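The three examples above can be reproduced with a few lines of shell arithmetic. This is a rough sketch: `new_capacity` is a hypothetical helper, and the percentage case uses plain integer division, which matches the example here but does not attempt to model the service's exact rounding rules.

```shell
# Hypothetical helper: compute the new capacity for a single scaling adjustment.
# Usage: new_capacity <AdjustmentType> <current capacity> <adjustment>
new_capacity() {
  case $1 in
    ChangeInCapacity)        echo $(( $2 + $3 )) ;;
    ExactCapacity)           echo "$3" ;;
    # Integer math only; the real rounding behaviour is more nuanced.
    PercentChangeInCapacity) echo $(( $2 + $2 * $3 / 100 )) ;;
    *) echo "unknown adjustment type: $1" >&2; return 1 ;;
  esac
}

new_capacity ChangeInCapacity 3 5            # prints 8
new_capacity ExactCapacity 3 5               # prints 5
new_capacity PercentChangeInCapacity 10 10   # prints 11
```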
Below is the CloudFormation template with the implementation of step scaling with SQS.
AWSTemplateFormatVersion: 2010-09-09
Description: Creates a fargate based auto-scaling environment that processes work from an SQS queue
Parameters:
  DockerImageUrl:
    Type: String
    Default: latest
  DockerContainerName:
    Type: String
    Default: consumer-service
  EnvironmentName:
    Type: String
    Default: dev
  Memory:
    Type: String
    Default: 8GB
  Cpu:
    Type: Number
    Default: 2048 # 2 vCPU
  ContainerPort:
    Type: Number
    Default: 3000
  HealthCheckPath:
    Type: String
    Default: http://localhost:3000/check
  FaragateScalingEnvSSM:
    Type: AWS::SSM::Parameter::Value<String>
    Default: "/config/ecs/FARGATE_SCALING_ENV"
  QueueDepthScaleOutAlarmThresholdSSM:
    Type: AWS::SSM::Parameter::Value<String>
    Default: "/config/ecs/consumer-service/QUEUE_DEPTH_SCALE_OUT_ALARM_THRESHOLD"
  CpuUtilizationScaleInAlarmThresholdSSM:
    Type: AWS::SSM::Parameter::Value<String>
    Default: "/config/ecs/consumer-service/CPU_UTILIZATION_SCALE_IN_ALARM_THRESHOLD"
  CpuUtilizationNoComputeOrScaleInAlarmEvaluationPeriodsSSM:
    Type: AWS::SSM::Parameter::Value<String>
    Default: "/config/ecs/consumer-service/CPU_UTILIZATION_NO_COMPUTE_OR_SCALE_IN_ALARM_EVALUATION_PERIODS"
  ComputeAutoScalingTargetMaxCapacitySSM:
    Type: AWS::SSM::Parameter::Value<String>
    Default: "/config/ecs/consumer-service/AUTO_SCALING_TARGET_MAX_CAPACITY"
Conditions:
  CreateNonProdResources: !Equals [!Ref FaragateScalingEnvSSM, 'non-prod']
  CreateProdResources: !Equals [!Ref FaragateScalingEnvSSM, 'prod']
Resources:
  SQSQueue:
    Type: 'AWS::SQS::Queue'
    # Properties:
    #   ReceiveMessageWaitTimeSeconds: 20
    #   VisibilityTimeout: 1200 # 20 minutes
    #   MessageRetentionPeriod: 1209600 # 14 Days
  QueueUrlParameter:
    Type: 'AWS::SSM::Parameter'
    Properties:
      Name: !Join
        - ''
        - - /
          - !Ref EnvironmentName
          - /services/
          - !Ref DockerContainerName
          - /SQS_QUEUE_URL
      Type: String
      Value: !Ref SQSQueue
  ComputeTaskLogGroup:
    Type: 'AWS::Logs::LogGroup'
    Properties:
      LogGroupName: !Join
        - /
        - - /x-org
          - ecs
          - !Sub '${AWS::StackName}'
          - logs
  ComputeTaskRole:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - ecs-tasks.amazonaws.com
            Action:
              - 'sts:AssumeRole'
      Policies:
        - PolicyName: Required_Access
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - 'sqs:*'
                  - 'secretsmanager:*'
                  - 'ssm:*'
                  - 'logs:*'
                  - 'dynamodb:*'
                  - 's3:*'
                  - 'ecs:*'
                Resource: '*'
      ManagedPolicyArns:
        - 'arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy'
  ComputeTaskDefinition:
    Type: 'AWS::ECS::TaskDefinition'
    DependsOn: ComputeTaskLogGroup
    Properties:
      TaskRoleArn: !GetAtt ComputeTaskRole.Arn
      ExecutionRoleArn: !GetAtt ComputeTaskRole.Arn
      RequiresCompatibilities:
        - FARGATE
      NetworkMode: awsvpc
      Cpu: !Ref Cpu
      Memory: !Ref Memory
      ContainerDefinitions:
        - Name: !Sub '${AWS::StackName}'
          Image: !Ref DockerImageUrl
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-region: us-east-1
              awslogs-group: !Ref ComputeTaskLogGroup
              awslogs-stream-prefix: ecs
          HealthCheck:
            Command:
              - CMD-SHELL
              - !Sub 'curl -f ${HealthCheckPath} || exit 1'
            Interval: 30
            Retries: 3
            StartPeriod: 300
          PortMappings:
            - ContainerPort: !Ref ContainerPort
              Protocol: tcp
          Environment:
            - Name: EnvironmentName
              Value: !Ref EnvironmentName
            - Name: SQS_QUEUE_URL
              Value: !Ref SQSQueue
  ComputeCluster:
    Type: 'AWS::ECS::Cluster'
    # Properties:
    #   ClusterName: !Join ['-', [!Ref DockerContainerName, cluster]]
  NonProdComputeService:
    Type: 'AWS::ECS::Service'
    Condition: CreateNonProdResources # only create if it is NonProd env
    Properties:
      Cluster: !Ref ComputeCluster
      TaskDefinition: !Ref ComputeTaskDefinition
      DeploymentConfiguration:
        MinimumHealthyPercent: 100
        MaximumPercent: 200
      # Desired count should be 0; Otherwise the Task Scheduler will restart number of desired containers once they are stopped
      DesiredCount: 0
      # This may need to be adjusted if the container takes a while to start up
      # HealthCheckGracePeriodSeconds: 30
      LaunchType: FARGATE
      NetworkConfiguration:
        AwsvpcConfiguration:
          # Change it to DISABLED if you're using private subnets that have access to a NAT gateway
          AssignPublicIp: ENABLED
          Subnets:
            - !ImportValue ComputeSubnetA
            - !ImportValue ComputeSubnetB
            - !ImportValue ComputeSubnetC
          SecurityGroups:
            - !ImportValue ComputeSecurityGroup
  ProdComputeService:
    Type: 'AWS::ECS::Service'
    Condition: CreateProdResources # only create if it is Prod env
    Properties:
      Cluster: !Ref ComputeCluster
      TaskDefinition: !Ref ComputeTaskDefinition
      DeploymentConfiguration:
        MinimumHealthyPercent: 100
        MaximumPercent: 200
      DesiredCount: 1
      # This may need to be adjusted if the container takes a while to start up
      # HealthCheckGracePeriodSeconds: 30
      LaunchType: FARGATE
      NetworkConfiguration:
        AwsvpcConfiguration:
          # Change it to DISABLED if you're using private subnets that have access to a NAT gateway
          AssignPublicIp: ENABLED
          Subnets:
            - !ImportValue ComputeSubnetA
            - !ImportValue ComputeSubnetB
            - !ImportValue ComputeSubnetC
          SecurityGroups:
            - !ImportValue ComputeSecurityGroup
  ComputeAutoScalingRole:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - ecs-tasks.amazonaws.com
                - application-autoscaling.amazonaws.com
            Action:
              - 'sts:AssumeRole'
      Path: "/"
      Policies:
        - PolicyName: !Sub ${DockerContainerName}-ECSAutoScalingRole
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  - ecs:UpdateService
                  - ecs:DescribeServices
                  - application-autoscaling:*
                  - cloudwatch:DescribeAlarms
                  - cloudwatch:GetMetricStatistics
                Resource: "*"
      ManagedPolicyArns:
        - 'arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceAutoscaleRole'
  NonProdComputeAutoScalingTarget:
    Type: 'AWS::ApplicationAutoScaling::ScalableTarget'
    Condition: CreateNonProdResources # only create if it is NonProd env
    Properties:
      MinCapacity: 0 # As desired task can be 0
      MaxCapacity: !Ref ComputeAutoScalingTargetMaxCapacitySSM
      ResourceId: !Join
        - '/'
        - - service
          - !Ref ComputeCluster
          - !GetAtt NonProdComputeService.Name
      ScalableDimension: ecs:service:DesiredCount
      ServiceNamespace: ecs
      RoleARN: !GetAtt ComputeAutoScalingRole.Arn
  ProdComputeAutoScalingTarget:
    Type: 'AWS::ApplicationAutoScaling::ScalableTarget'
    Condition: CreateProdResources # only create if it is Prod env
    Properties:
      MinCapacity: 1 # As desired task can be 1 but not 0
      MaxCapacity: !Ref ComputeAutoScalingTargetMaxCapacitySSM
      ResourceId: !Join
        - '/'
        - - service
          - !Ref ComputeCluster
          - !GetAtt ProdComputeService.Name
      ScalableDimension: ecs:service:DesiredCount
      ServiceNamespace: ecs
      RoleARN: !GetAtt ComputeAutoScalingRole.Arn
  # https://docs.aws.amazon.com/autoscaling/application/userguide/application-auto-scaling-step-scaling-policies.html
  NonProdNoComputeAutoScalingPolicy:
    Type: 'AWS::ApplicationAutoScaling::ScalingPolicy'
    Condition: CreateNonProdResources # only create if it is NonProd env
    Properties:
      PolicyName: !Sub ${DockerContainerName}-NonProdNoComputeAutoScalingPolicy
      PolicyType: StepScaling
      ScalingTargetId: !Ref NonProdComputeAutoScalingTarget
      ScalableDimension: ecs:service:DesiredCount
      ServiceNamespace: ecs
      StepScalingPolicyConfiguration:
        AdjustmentType: ExactCapacity # Can use PercentChangeInCapacity but then need to come up with configuration including some estimated change in percent
        Cooldown: 60
        MetricAggregationType: Average # Valid values are Minimum, Maximum, and Average. If the aggregation type is null, the value is treated as Average.
        StepAdjustments:
          - MetricIntervalLowerBound: !Ref AWS::NoValue
            MetricIntervalUpperBound: 0
            ScalingAdjustment: 0
  # https://docs.aws.amazon.com/autoscaling/application/userguide/application-auto-scaling-step-scaling-policies.html
  NonProdInitialComputeAutoScalingPolicy:
    Type: 'AWS::ApplicationAutoScaling::ScalingPolicy'
    Condition: CreateNonProdResources # only create if it is NonProd env
    Properties:
      PolicyName: !Sub ${DockerContainerName}-NonProdInitialComputeAutoScalingPolicy
      PolicyType: StepScaling
      ScalingTargetId: !Ref NonProdComputeAutoScalingTarget
      ScalableDimension: ecs:service:DesiredCount
      ServiceNamespace: ecs
      StepScalingPolicyConfiguration:
        AdjustmentType: ChangeInCapacity # ChangeInCapacity -> Increase or decrease the current capacity of the scalable target by the specified value
        Cooldown: 60 # 1 min delay
        MetricAggregationType: Minimum # Valid values are Minimum, Maximum, and Average. If the aggregation type is null, the value is treated as Average.
        StepAdjustments:
          - MetricIntervalLowerBound: 0
            MetricIntervalUpperBound: !Ref AWS::NoValue
            ScalingAdjustment: 1 # scaling up by 1 container when the alarm is greater than or equal to the Metric Threshold
  # https://docs.aws.amazon.com/autoscaling/application/userguide/application-auto-scaling-step-scaling-policies.html
  NonProdComputeAutoScalingScaleOutPolicy:
    Type: 'AWS::ApplicationAutoScaling::ScalingPolicy'
    Condition: CreateNonProdResources # only create if it is NonProd env
    Properties:
      PolicyName: !Sub ${DockerContainerName}-NonProdComputeAutoScalingScaleOutPolicy
      PolicyType: StepScaling
      ScalingTargetId: !Ref NonProdComputeAutoScalingTarget
      ScalableDimension: ecs:service:DesiredCount
      ServiceNamespace: ecs
      StepScalingPolicyConfiguration:
        AdjustmentType: ChangeInCapacity # ChangeInCapacity -> Increase or decrease the current capacity of the scalable target by the specified value
        Cooldown: 60 # 1 min delay
        MetricAggregationType: Minimum # Valid values are Minimum, Maximum, and Average. If the aggregation type is null, the value is treated as Average.
        StepAdjustments:
          - MetricIntervalLowerBound: 0 # 0 means exactly equal to Metric Threshold which is 10 defined using SSM parameter
            MetricIntervalUpperBound: 15 # [Metric Threshold + 15]
            ScalingAdjustment: 1
          - MetricIntervalLowerBound: 15 # [Metric Threshold + 15]
            MetricIntervalUpperBound: 25 # [Metric Threshold + 25]
            ScalingAdjustment: 1
          - MetricIntervalLowerBound: 25 # [Metric Threshold + 25]
            MetricIntervalUpperBound: 35 # [Metric Threshold + 35]
            ScalingAdjustment: 1
          - MetricIntervalLowerBound: 35 # [Metric Threshold + 35]
            MetricIntervalUpperBound: 45 # [Metric Threshold + 45]
            ScalingAdjustment: 1
          - MetricIntervalLowerBound: 45 # [Metric Threshold + 45]
            ScalingAdjustment: 1
  # https://docs.aws.amazon.com/autoscaling/application/userguide/application-auto-scaling-step-scaling-policies.html
  ProdComputeAutoScalingScaleInPolicy:
    Type: 'AWS::ApplicationAutoScaling::ScalingPolicy'
    Condition: CreateProdResources # only create if it is Prod env
    Properties:
      PolicyName: !Sub ${DockerContainerName}-ProdComputeAutoScalingScaleInPolicy
      PolicyType: StepScaling
      ScalingTargetId: !Ref ProdComputeAutoScalingTarget
      ScalableDimension: ecs:service:DesiredCount
      ServiceNamespace: ecs
      StepScalingPolicyConfiguration:
        AdjustmentType: ExactCapacity # Can use PercentChangeInCapacity but then need to come up with configuration including some estimated change in percent
        Cooldown: 60
        MetricAggregationType: Average # Valid values are Minimum, Maximum, and Average. If the aggregation type is null, the value is treated as Average.
        StepAdjustments:
          - MetricIntervalLowerBound: !Ref AWS::NoValue
            MetricIntervalUpperBound: 0
            ScalingAdjustment: 1
  # https://docs.aws.amazon.com/autoscaling/application/userguide/application-auto-scaling-step-scaling-policies.html
  ProdComputeAutoScalingScaleOutPolicy:
    Type: 'AWS::ApplicationAutoScaling::ScalingPolicy'
    Condition: CreateProdResources # only create if it is Prod env
    Properties:
      PolicyName: !Sub ${DockerContainerName}-ProdComputeAutoScalingScaleOutPolicy
      PolicyType: StepScaling
      ScalingTargetId: !Ref ProdComputeAutoScalingTarget
      ScalableDimension: ecs:service:DesiredCount
      ServiceNamespace: ecs
      StepScalingPolicyConfiguration:
        AdjustmentType: ChangeInCapacity # ChangeInCapacity -> Increase or decrease the current capacity of the scalable target by the specified value
        Cooldown: 60 # 1 min delay
        MetricAggregationType: Minimum # Valid values are Minimum, Maximum, and Average. If the aggregation type is null, the value is treated as Average.
        StepAdjustments:
          - MetricIntervalLowerBound: 0 # 0 means exactly equal to Metric Threshold which is 10 defined using SSM parameter
            MetricIntervalUpperBound: 15 # [Metric Threshold + 15]
            ScalingAdjustment: 1
          - MetricIntervalLowerBound: 15 # [Metric Threshold + 15]
            MetricIntervalUpperBound: 25 # [Metric Threshold + 25]
            ScalingAdjustment: 1
          - MetricIntervalLowerBound: 25 # [Metric Threshold + 25]
            MetricIntervalUpperBound: 35 # [Metric Threshold + 35]
            ScalingAdjustment: 1
          - MetricIntervalLowerBound: 35 # [Metric Threshold + 35]
            MetricIntervalUpperBound: 45 # [Metric Threshold + 45]
            ScalingAdjustment: 1
          - MetricIntervalLowerBound: 45 # [Metric Threshold + 45]
            ScalingAdjustment: 1
  # ######### https://docs.aws.amazon.com/AmazonECS/latest/developerguide/cloudwatch-metrics.html #########
  #                             (Total CPU units used by tasks in service) x 100
  # Service CPU utilization = ----------------------------------------------------------------------------
  #                             (Total CPU units specified in task definition) x (number of tasks in service)
  NonProdCPUNoComputeAlarm:
    Type: AWS::CloudWatch::Alarm
    Condition: CreateNonProdResources # only create alarm if it is NonProd env
    Properties:
      AlarmName: !Sub ${DockerContainerName}-NonProdCPUNoComputeAlarm
      AlarmDescription: Alarm if container utilize low CPU based on specified threshold!
      Namespace: AWS/ECS # AWS::CloudWatch::Alarm.Period >= 60 for metrics in the AWS/ namespace
      MetricName: CPUUtilization
      Dimensions:
        - Name: ServiceName
          Value:
            Fn::GetAtt:
              - NonProdComputeService
              - Name
        - Name: ClusterName
          Value:
            Ref: ComputeCluster
      Statistic: Average # Not using Sum since the metric is CPUUtilization
      Period: 60 # 60 seconds ( Period must be 10, 30 or a multiple of 60 but 10 and 30 can not be used with namespaces with the following prefix: AWS/ )
      EvaluationPeriods: !Ref CpuUtilizationNoComputeOrScaleInAlarmEvaluationPeriodsSSM # setting evaluation period 3 because when there is no task at all usually cpu starts with 0 (this way first evaluation period will already hit even before the task start doing anything) & 2 more as part of taking extra precautions! Change it to 2 if needed but not 1
      Threshold: !Ref CpuUtilizationScaleInAlarmThresholdSSM
      ComparisonOperator: LessThanOrEqualToThreshold
      AlarmActions:
        - Ref: NonProdNoComputeAutoScalingPolicy
  NonProdInitialQueueDepthAlarm:
    Type: AWS::CloudWatch::Alarm
    Condition: CreateNonProdResources # only create alarm if it is NonProd env
    Properties:
      AlarmName: !Sub ${DockerContainerName}-NonProdInitialQueueDepthAlarm
      AlarmDescription: Alarm if queue depth grows beyond specified threshold!
      Namespace: AWS/SQS # AWS::CloudWatch::Alarm.Period >= 60 for metrics in the AWS/ namespace
      MetricName: ApproximateNumberOfMessagesVisible
      Dimensions:
        - Name: QueueName
          Value: !GetAtt SQSQueue.QueueName
      Statistic: Sum
      Period: 60 # 60 seconds ( Period must be 10, 30 or a multiple of 60 but 10 and 30 can not be used with namespaces with the following prefix: AWS/ )
      EvaluationPeriods: 1
      Threshold: 1 # Threshold is 1 for initial depth
      ComparisonOperator: GreaterThanOrEqualToThreshold
      AlarmActions:
        - Ref: NonProdInitialComputeAutoScalingPolicy
  NonProdQueueDepthScaleOutAlarm:
    Type: AWS::CloudWatch::Alarm
    Condition: CreateNonProdResources # only create alarm if it is NonProd env
    Properties:
      AlarmName: !Sub ${DockerContainerName}-NonProdQueueDepthScaleOutAlarm
      AlarmDescription: Alarm if queue depth grows beyond specified threshold!
      Namespace: AWS/SQS # AWS::CloudWatch::Alarm.Period >= 60 for metrics in the AWS/ namespace
      MetricName: ApproximateNumberOfMessagesVisible
      Dimensions:
        - Name: QueueName
          Value: !GetAtt SQSQueue.QueueName
      Statistic: Sum
      Period: 120 # 120 seconds ( Period must be 10, 30 or a multiple of 60 but 10 and 30 can not be used with namespaces with the following prefix: AWS/ )
      EvaluationPeriods: 1
      Threshold: !Ref QueueDepthScaleOutAlarmThresholdSSM # change this as needed
      ComparisonOperator: GreaterThanOrEqualToThreshold
      AlarmActions:
        - Ref: NonProdComputeAutoScalingScaleOutPolicy
  ProdCPUScaleInAlarm:
    Type: AWS::CloudWatch::Alarm
    Condition: CreateProdResources # only create alarm if it is Prod env
    Properties:
      AlarmName: !Sub ${DockerContainerName}-ProdCPUScaleInAlarm
      AlarmDescription: Alarm if container utilize low cpu based on specified threshold!
      Namespace: AWS/ECS # AWS::CloudWatch::Alarm.Period >= 60 for metrics in the AWS/ namespace
      MetricName: CPUUtilization
      Dimensions:
        - Name: ServiceName
          Value:
            Fn::GetAtt:
              - ProdComputeService
              - Name
        - Name: ClusterName
          Value:
            Ref: ComputeCluster
      Statistic: Average # Not using Sum since the metric is CPUUtilization
      Period: 60 # 60 seconds ( Period must be 10, 30 or a multiple of 60 but 10 and 30 can not be used with namespaces with the following prefix: AWS/ )
      EvaluationPeriods: !Ref CpuUtilizationNoComputeOrScaleInAlarmEvaluationPeriodsSSM # setting this 3 as part of taking extra precautions! Change it to 1 if needed
      Threshold: !Ref CpuUtilizationScaleInAlarmThresholdSSM
      ComparisonOperator: LessThanOrEqualToThreshold
      AlarmActions:
        - Ref: ProdComputeAutoScalingScaleInPolicy
  ProdQueueDepthScaleOutAlarm:
    Type: AWS::CloudWatch::Alarm
    Condition: CreateProdResources # only create alarm if it is Prod env
    Properties:
      AlarmName: !Sub ${DockerContainerName}-ProdQueueDepthScaleOutAlarm
      AlarmDescription: Alarm if queue depth grows beyond specified threshold!
      Namespace: AWS/SQS # AWS::CloudWatch::Alarm.Period >= 60 for metrics in the AWS/ namespace
      MetricName: ApproximateNumberOfMessagesVisible
      Dimensions:
        - Name: QueueName
          Value: !GetAtt SQSQueue.QueueName
      Statistic: Sum
      Period: 120 # 120 seconds ( Period must be 10, 30 or a multiple of 60 but 10 and 30 can not be used with namespaces with the following prefix: AWS/ )
      EvaluationPeriods: 1
      Threshold: !Ref QueueDepthScaleOutAlarmThresholdSSM # change this as needed
      ComparisonOperator: GreaterThanOrEqualToThreshold
      AlarmActions:
        - Ref: ProdComputeAutoScalingScaleOutPolicy
The template above creates two sets of resources, NonProd and Prod. In lower environments, the DesiredCount of the ECS service is set to zero to save cost. In prod, the DesiredCount is set to one, because a CloudWatch alarm takes at least one minute to respond to a scaling event and we do not want that delay in prod.
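The practical difference between the two environments shows up in the scalable target's MinCapacity and MaxCapacity, which bound whatever capacity a policy asks for. As a small sketch (with a hypothetical `clamp_capacity` helper, and a max capacity of 6 taken from the SSM parameter script in this post), the bounding works like this:

```shell
# Hypothetical helper: bound a requested capacity to the scalable target's range.
# Usage: clamp_capacity <requested> <MinCapacity> <MaxCapacity>
clamp_capacity() {
  requested=$1; min=$2; max=$3
  if [ "$requested" -lt "$min" ]; then
    echo "$min"
  elif [ "$requested" -gt "$max" ]; then
    echo "$max"
  else
    echo "$requested"
  fi
}

clamp_capacity 0 0 6   # non-prod: scaling in to 0 tasks is allowed, prints 0
clamp_capacity 0 1 6   # prod: never drops below 1 task, prints 1
clamp_capacity 9 1 6   # scale-out stops at MaxCapacity, prints 6
```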
One more thing to note is that scale-out is based on queue depth, but scale-in is based on CPU utilization. The consumer-service can take several minutes to finish processing a message, and I don't want Application Auto Scaling to remove tasks that are still processing something just because the queue is empty. For this problem, EC2 has a feature called instance scale-in protection, which lets you control which queue workers are terminated when your Auto Scaling group scales in. It was not available for Fargate-based ECS clusters, but AWS recently introduced a new feature for ECS called task scale-in protection. You can see the blog post here.
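As a rough sketch of how a worker could use task scale-in protection, the snippet below toggles protection through the ECS agent endpoint from inside the container. The function names and the 60-minute default expiry are assumptions made for this example and are not part of the template above; `ECS_AGENT_URI` is the environment variable that ECS injects into the container, and the example relies on `curl` being available in the image.

```shell
# Build the JSON body for the task protection endpoint.
# Usage: protection_payload true [expires-in-minutes] | protection_payload false
protection_payload() {
  if [ "$1" = "true" ]; then
    # 60 minutes is an arbitrary default chosen for this sketch.
    printf '{"ProtectionEnabled":true,"ExpiresInMinutes":%d}\n' "${2:-60}"
  else
    printf '{"ProtectionEnabled":false}\n'
  fi
}

# Send the payload to the ECS agent running alongside the task.
set_task_protection() {
  curl --fail --silent --request PUT \
    --header 'Content-Type: application/json' \
    --data "$(protection_payload "$@")" \
    "${ECS_AGENT_URI}/task-protection/v1/state"
}

# Hypothetical use in a worker loop:
#   set_task_protection true 30   # before processing a long-running message
#   ... process and delete the message ...
#   set_task_protection false     # let scale-in stop this task again
```

With protection in place, scale-in could be driven by queue depth as well, since a busy task would not be stopped while it is still working on a message.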
Here is the bash script for creating SSM parameters.
#!/usr/bin/env bash
# Parse the --profile option; it is used both as the AWS CLI profile and in a parameter path
while (($# > 1)); do
  case $1 in
    --profile) PROFILE="$2" ;;
    *) break ;;
  esac
  shift 2
done
# Stop early if no profile was given
: "${PROFILE:?Usage: $0 --profile <aws-cli-profile>}"
echo "Deleting Parameters..."
aws ssm delete-parameter --profile "$PROFILE" --name "/$PROFILE/services/consumer-service/AWS_REGION"
aws ssm delete-parameter --profile "$PROFILE" --name "/config/ecs/FARGATE_SCALING_ENV"
aws ssm delete-parameter --profile "$PROFILE" --name "/config/ecs/consumer-service/QUEUE_DEPTH_SCALE_OUT_ALARM_THRESHOLD"
aws ssm delete-parameter --profile "$PROFILE" --name "/config/ecs/consumer-service/CPU_UTILIZATION_SCALE_IN_ALARM_THRESHOLD"
aws ssm delete-parameter --profile "$PROFILE" --name "/config/ecs/consumer-service/CPU_UTILIZATION_NO_COMPUTE_OR_SCALE_IN_ALARM_EVALUATION_PERIODS"
aws ssm delete-parameter --profile "$PROFILE" --name "/config/ecs/consumer-service/AUTO_SCALING_TARGET_MAX_CAPACITY"
echo "Creating parameters..."
aws ssm put-parameter --profile "$PROFILE" --overwrite --cli-input-json '{"Type": "String", "Name": "/'"$PROFILE"'/services/consumer-service/AWS_REGION", "Value": "us-east-1"}'
aws ssm put-parameter --profile "$PROFILE" --overwrite --cli-input-json '{"Type": "String", "Name": "/config/ecs/FARGATE_SCALING_ENV", "Value": "non-prod"}' # valid values are: non-prod or prod
aws ssm put-parameter --profile "$PROFILE" --overwrite --cli-input-json '{"Type": "String", "Name": "/config/ecs/consumer-service/QUEUE_DEPTH_SCALE_OUT_ALARM_THRESHOLD", "Value": "10"}' # if you are increasing this then make sure you are also adjusting step scaling criteria
aws ssm put-parameter --profile "$PROFILE" --overwrite --cli-input-json '{"Type": "String", "Name": "/config/ecs/consumer-service/CPU_UTILIZATION_SCALE_IN_ALARM_THRESHOLD", "Value": "2"}' # 2%
aws ssm put-parameter --profile "$PROFILE" --overwrite --cli-input-json '{"Type": "String", "Name": "/config/ecs/consumer-service/CPU_UTILIZATION_NO_COMPUTE_OR_SCALE_IN_ALARM_EVALUATION_PERIODS", "Value": "3"}' # do not set this as 1 in non-prod
aws ssm put-parameter --profile "$PROFILE" --overwrite --cli-input-json '{"Type": "String", "Name": "/config/ecs/consumer-service/AUTO_SCALING_TARGET_MAX_CAPACITY", "Value": "6"}' # put 1 if you want to disable autoscaling at all
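Because the template's two conditions compare FARGATE_SCALING_ENV against the exact strings non-prod and prod, any other value means neither set of resources is created. A small guard like the following could catch a typo before the parameter is written. This is only a sketch, and `validate_scaling_env` is a hypothetical helper that is not part of the script above.

```shell
# Hypothetical guard: accept only the two values the template's Conditions match on.
validate_scaling_env() {
  case $1 in
    non-prod|prod) return 0 ;;
    *)
      echo "FARGATE_SCALING_ENV must be 'non-prod' or 'prod', got: '$1'" >&2
      return 1
      ;;
  esac
}

# Example: check the value before passing it to `aws ssm put-parameter`.
validate_scaling_env "non-prod" && echo "ok"
```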