Getting Started with AWS and Terraform: Setting Up CloudWatch Alarms for CPU Utilization on AWS EC2 Instances with Terraform
Chinmay Tonape
Posted on January 26, 2024
In a previous post, we walked through the process of hosting a Windows IIS web server on an AWS EC2 Windows instance using Terraform.
In this post, we will delve into the realm of AWS CloudWatch Alarms to monitor CPU utilization and set up email notifications. To achieve this, we will create Terraform modules to organize and manage separate components of our infrastructure.
Architecture Overview
Before diving into the setup, let's briefly revisit the architecture we'll be working with:
Step 1: Create VPC, Network Components, and EC2 Instances
Refer to our previous posts for detailed instructions on setting up a VPC and network components along with two Linux EC2 instances in separate Availability Zones (AZs). Be sure to enable detailed monitoring on the EC2 instances so that metrics are published at one-minute intervals instead of the default five minutes. Note that detailed monitoring incurs additional charges.
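As a rough illustration of where one-minute monitoring is switched on, the sketch below sets the `monitoring` argument on an `aws_instance` resource. The resource name `web`, the variables `ami_id` and `subnet_ids`, and the instance type are assumptions made for this example; the actual definitions live in the configuration from the earlier posts and may differ.

```hcl
# Hypothetical inputs; the real values come from the earlier posts' configuration.
variable "ami_id" {
  type        = string
  description = "AMI for the web servers (assumed name)"
}

variable "subnet_ids" {
  type        = list(string)
  description = "One subnet per Availability Zone (assumed name)"
}

resource "aws_instance" "web" {
  count         = length(var.subnet_ids)
  ami           = var.ami_id
  instance_type = "t2.micro" # assumed size
  subnet_id     = var.subnet_ids[count.index]

  # Detailed monitoring: EC2 metrics at one-minute granularity
  # instead of the default five minutes. Billed additionally.
  monitoring = true
}
```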
Step 2: Set Up SNS Topic for Email Notifications
Create a Simple Notification Service (SNS) topic and add a subscriber with a disposable email ID (numerous free online disposable email services are available).
####################################################
# Create an SNS topic with an email subscription
####################################################
resource "aws_sns_topic" "topic" {
  name = "WebServer-CPU_Utilization_alert"
}

# One email subscription per address in var.email_address
resource "aws_sns_topic_subscription" "topic_email_subscription" {
  count     = length(var.email_address)
  topic_arn = aws_sns_topic.topic.arn
  protocol  = "email"
  endpoint  = var.email_address[count.index]
}
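The subscription resource indexes into `var.email_address`, so that variable has to be a list. The declaration is not shown in this post; a minimal sketch, assuming a plain list of strings, might look like the following. The description text and the example address are placeholders.

```hcl
# variables.tf (assumed declaration; not shown in the original post)
variable "email_address" {
  type        = list(string)
  description = "Email addresses to subscribe to the CPU alert topic"
}

# The matching entry in aws.tfvars would then be something like:
#   email_address = ["alerts@example.com"]
```

Each address in the list gets its own subscription, and each recipient has to confirm it separately.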
Step 3: Create CloudWatch Metric Alarms
Create CloudWatch metric alarms for CPU utilization, one for each EC2 instance. Set the alarm action to the Amazon Resource Name (ARN) of the SNS topic. The metric name, statistic, period, threshold, and comparison operator together define when the alarm fires: the configuration below raises an alarm when the average CPUUtilization is at or above 80% for two consecutive 60-second periods.
####################################################
# Create a CloudWatch alarm per EC2 instance, with alarm_actions set to the SNS topic
####################################################
resource "aws_cloudwatch_metric_alarm" "ec2_cpu" {
  # One alarm per instance ID exported by the web module
  count = length(module.web.instance_ids)

  alarm_name          = "cpu-utilization-${element(module.web.instance_ids, count.index)}"
  alarm_description   = "This metric monitors ec2 cpu utilization"
  namespace           = "AWS/EC2"
  metric_name         = "CPUUtilization"
  statistic           = "Average"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  threshold           = 80 # percent
  period              = 60 # seconds
  evaluation_periods  = 2
  treat_missing_data  = "notBreaching"

  alarm_actions             = [aws_sns_topic.topic.arn]
  insufficient_data_actions = []

  dimensions = {
    InstanceId = element(module.web.instance_ids, count.index)
  }
}
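The alarm resource reads `module.web.instance_ids`, which implies the web module exports the instance IDs as a list. That output is not shown in this post; a minimal sketch, assuming the module declares its instances as `aws_instance.web` with `count`, could be:

```hcl
# modules/web/outputs.tf (hypothetical path and resource name)
output "instance_ids" {
  description = "IDs of the web server EC2 instances"
  value       = aws_instance.web[*].id # splat over the counted resource
}
```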
Running Terraform
Execute the following commands to automate the infrastructure setup:
terraform init
terraform plan -var-file=aws.tfvars
terraform apply -var-file=aws.tfvars -auto-approve
Once terraform apply completes successfully, it shows the following:
Apply complete! Resources: 14 added, 0 changed, 0 destroyed.
Testing CloudWatch Alarms with SNS topic
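Check that the SNS topic has been created and that its subscription is confirmed (each subscriber must click the link in the confirmation email that SNS sends):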
Once the Terraform apply process completes successfully, wait for some time to observe both alarms in an "OK" state, indicating that CPU utilization is within the threshold.
An alarm may sometimes enter the "INSUFFICIENT_DATA" state, for several possible reasons. See https://repost.aws/knowledge-center/cloudwatch-alarm-insufficient-data-state
To test the alarms, stress one EC2 instance by installing and starting the stress utility.
sudo amazon-linux-extras install epel -y
sudo yum install stress -y
stress -c 1 --backoff 300000000 -t 30m
Once average CPU utilization has stayed at or above 80% for two consecutive one-minute periods, the CloudWatch alarm moves to the "ALARM" state and you receive an email with detailed information.
I also stressed the second EC2 instance:
Both EC2 instances in the "ALARM" state:
Cleanup
To prevent unnecessary costs, remember to destroy the AWS resources by executing the following command:
terraform destroy -auto-approve
Congratulations! We successfully deployed CloudWatch alarms and an SNS topic with email subscriptions to receive alarm details.
In our next module, we will explore application load balancers to enhance resilience and availability. Happy Coding!
Resources
GitHub Link: https://github.com/chinmayto/terraform-aws-linux-webserver-cloudwatch-sns
CloudWatch Documentation:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/WhatIsCloudWatch.html
SNS Documentation:
https://docs.aws.amazon.com/sns/latest/dg/welcome.html