Using the Go AWS API to get AWS CloudWatch Metrics for an S3 bucket.

Earlier this year, one of the largest web applications I've ever developed went live. As part of supporting the organisations using the application, I built an operations dashboard that lets me see all the stats and logs from the application in one place.

The application uses a private S3 bucket to store cached files. The S3 bucket has a lifecycle policy that deletes files twenty hours after they are created. I wanted the dashboard to show the count of items in this bucket and the total size of the bucket so that I can keep track of consumption.

This article explains how I used the Go API (Application Programming Interface) to extract the metrics from CloudWatch.

Amazon CloudWatch metrics for Amazon S3 can help you understand and improve the performance of applications that use Amazon S3. There are several ways that you can use CloudWatch with Amazon S3.

Amazon CloudWatch Daily Storage Metrics for Buckets

Monitor bucket storage using CloudWatch, which collects and processes storage data from Amazon S3 into readable, daily metrics. These storage metrics for Amazon S3 are reported once per day and are provided to all customers at no additional cost.

Amazon S3 CloudWatch Daily Storage Metrics for Buckets

The CloudWatch AWS/S3 namespace includes the following daily storage metrics for buckets.

BucketSizeBytes

The amount of data in bytes stored in a bucket in the STANDARD storage class, INTELLIGENT_TIERING storage class, Standard - Infrequent Access (STANDARD_IA) storage class, OneZone - Infrequent Access (ONEZONE_IA) storage class, Reduced Redundancy Storage (RRS) class, Deep Archive Storage (S3 Glacier Deep Archive) class, or Glacier (GLACIER) storage class.

This value is calculated by summing the size of all objects in the bucket (both current and non-current objects), including the size of all parts for all incomplete multipart uploads to the bucket.

Valid storage type filters: StandardStorage, IntelligentTieringFAStorage, IntelligentTieringIAStorage, IntelligentTieringAAStorage, IntelligentTieringDAAStorage, StandardIAStorage, StandardIASizeOverhead, StandardIAObjectOverhead, OneZoneIAStorage, OneZoneIASizeOverhead, ReducedRedundancyStorage, GlacierStorage, GlacierStagingStorage, GlacierObjectOverhead, GlacierS3ObjectOverhead, DeepArchiveStorage, DeepArchiveObjectOverhead, DeepArchiveS3ObjectOverhead, and DeepArchiveStagingStorage (see the StorageType dimension)

Units: Bytes

Valid statistics: Average
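Because BucketSizeBytes is reported in bytes, a dashboard will usually want a friendlier unit. The following is a minimal sketch of one way to do that conversion; formatBytes is a hypothetical helper of my own naming, not part of the AWS SDK, and it assumes binary (1024-based) units.

```go
package main

import "fmt"

// formatBytes converts a byte count, such as a BucketSizeBytes average,
// into a human-readable string using binary (1024-based) units.
// Hypothetical helper for illustration; not part of the AWS SDK.
func formatBytes(bytes float64) string {
	units := []string{"B", "KiB", "MiB", "GiB", "TiB", "PiB"}
	i := 0
	// Divide by 1024 until the value fits the current unit
	// or we run out of larger units.
	for bytes >= 1024 && i < len(units)-1 {
		bytes /= 1024
		i++
	}
	return fmt.Sprintf("%.2f %s", bytes, units[i])
}

func main() {
	fmt.Println(formatBytes(1536))       // prints "1.50 KiB"
	fmt.Println(formatBytes(5368709120)) // prints "5.00 GiB"
}
```

The raw byte value is still the better thing to store; the formatting is only applied at display time.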

NumberOfObjects

The total number of objects stored in a bucket for all storage classes except for the GLACIER storage class. This value is calculated by counting all objects in the bucket (both current and non-current objects) and the total number of parts for all incomplete multipart uploads to the bucket.

Valid storage type filters: AllStorageTypes (see the StorageType dimension)

Units: Count

Valid statistics: Average

Using Go to Obtain S3 Bucket Metrics

The first step is to install the AWS software development kit (SDK) for Go. This is done by issuing the following go get command at the terminal or command prompt.

go get github.com/aws/aws-sdk-go/...

Once the AWS SDK has been installed, you'll need to import the relevant packages into your program to interact with CloudWatch.

package main

import (
    // Only the packages used by the code in this article are listed;
    // Go refuses to compile a file with unused imports.
    "log"
    "os"
    "time"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/credentials"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/cloudwatch"
)

Within the main part of your program, you need to create an AWS session using the NewSession function. In the example below, the newly created session is assigned to the awsSession variable. The session is created by supplying the AWS region identifier and your AWS credentials (Access Key and Secret Key), which the example reads from environment variables. You can obtain these keys from your AWS account. The NewSession function returns a pointer to the session object and, if there was a problem creating it (for example, a configuration problem), a non-nil error.

You should, therefore, check the error variable and handle it appropriately for your use case. In this example, we log the error to the console as a fatal message. log.Fatal is equivalent to log.Print() followed by a call to os.Exit(1), so the program will terminate.

var err error
var awsSession *session.Session

func main() {

    awsSession, err = session.NewSession(&aws.Config{
        Region: aws.String("eu-west-2"),
        Credentials: credentials.NewStaticCredentials(
            os.Getenv("AccessKeyID"),
            os.Getenv("SecretAccessKey"),
            ""),
    })

    if err != nil {
        log.Fatal(err)
    }
}
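Note that os.Getenv returns an empty string when a variable is not set, so a missing key would likely only surface later, on the first API call. As a minimal sketch, the check below reports missing variables up front. The missingEnv helper is hypothetical (my own naming); the variable names are the ones used in the example above.

```go
package main

import (
	"fmt"
	"os"
)

// missingEnv returns the names for which lookup reports no value or an
// empty value. Hypothetical helper; lookup is normally os.LookupEnv.
func missingEnv(lookup func(string) (string, bool), names ...string) []string {
	var missing []string
	for _, name := range names {
		if value, ok := lookup(name); !ok || value == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// Check the two variables the session example reads.
	missing := missingEnv(os.LookupEnv, "AccessKeyID", "SecretAccessKey")
	if len(missing) > 0 {
		fmt.Println("missing environment variables:", missing)
		return
	}
	fmt.Println("credentials found in environment")
}
```

Passing the lookup function in as a parameter keeps the helper easy to test without touching the real environment.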

Once you've got a pointer to a valid AWS session you can reuse that session to make multiple calls against the CloudWatch API.

Obtaining the S3 Bucket Size Metric

The getS3BucketSize function shown below takes one parameter, which is the name of the bucket for which we want to get the CloudWatch metrics. The function returns three values. The first is the average data point, which contains the bucket size. The second is the date and time at which CloudWatch logged the value. The third is any error that occurred within the function.

We're requesting the metric for S3 objects stored under the StandardStorage tier in the example below. If you want the same function to query different storage types, then you'd need to make this an input parameter to the function.

After getting the result from the GetMetricStatistics function call, we iterate over result.Datapoints to pick out the values to return.

func getS3BucketSize(bucketName string) (float64, time.Time, error) {
    var bucketSize float64
    var metricDateTime time.Time

    svc := cloudwatch.New(awsSession)

    result, err := svc.GetMetricStatistics(&cloudwatch.GetMetricStatisticsInput{
        MetricName: aws.String("BucketSizeBytes"),
        Namespace:  aws.String("AWS/S3"),
        StartTime:  aws.Time(time.Now().Add(-48 * time.Hour)),
        EndTime:    aws.Time(time.Now()),
        Period:     aws.Int64(3600),
        Statistics: []*string{aws.String("Average")},
        Dimensions: []*cloudwatch.Dimension{
            &cloudwatch.Dimension{
                Name:  aws.String("BucketName"),
                Value: aws.String(bucketName),
            },
            &cloudwatch.Dimension{
                Name:  aws.String("StorageType"),
                Value: aws.String("StandardStorage"),
            },
        },
    })

    if err != nil {
        return 0, time.Now(), err
    }

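    // CloudWatch does not return data points in chronological order,
    // so keep the one with the most recent timestamp.
    for _, dataPoint := range result.Datapoints {
        if dataPoint.Average == nil || dataPoint.Timestamp == nil {
            continue
        }
        if dataPoint.Timestamp.After(metricDateTime) {
            bucketSize = *dataPoint.Average
            metricDateTime = *dataPoint.Timestamp
        }
    }

    return bucketSize, metricDateTime, nil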
}
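If the storage type becomes an input parameter, as suggested above, it may be worth rejecting unknown values before calling CloudWatch, since a mistyped name would most likely just produce no data points. The sketch below is one possible approach; validateStorageType and validStorageTypes are hypothetical names, and the list is copied from the BucketSizeBytes filters quoted earlier, so it may not match what AWS currently supports.

```go
package main

import "fmt"

// validStorageTypes holds the StorageType filter values listed earlier
// in this article for the BucketSizeBytes metric. AWS may have added
// or removed values since; treat this list as an assumption.
var validStorageTypes = map[string]bool{
	"StandardStorage":              true,
	"IntelligentTieringFAStorage":  true,
	"IntelligentTieringIAStorage":  true,
	"IntelligentTieringAAStorage":  true,
	"IntelligentTieringDAAStorage": true,
	"StandardIAStorage":            true,
	"StandardIASizeOverhead":       true,
	"StandardIAObjectOverhead":     true,
	"OneZoneIAStorage":             true,
	"OneZoneIASizeOverhead":        true,
	"ReducedRedundancyStorage":     true,
	"GlacierStorage":               true,
	"GlacierStagingStorage":        true,
	"GlacierObjectOverhead":        true,
	"GlacierS3ObjectOverhead":      true,
	"DeepArchiveStorage":           true,
	"DeepArchiveObjectOverhead":    true,
	"DeepArchiveS3ObjectOverhead":  true,
	"DeepArchiveStagingStorage":    true,
}

// validateStorageType returns an error when name is not one of the
// known StorageType values. Hypothetical helper for illustration.
func validateStorageType(name string) error {
	if !validStorageTypes[name] {
		return fmt.Errorf("unknown storage type %q", name)
	}
	return nil
}

func main() {
	fmt.Println(validateStorageType("StandardStorage")) // prints "<nil>"
	fmt.Println(validateStorageType("standard"))        // prints an error
}
```

A parameterised version of getS3BucketSize could call this first and return the error without making a request.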

Obtaining the S3 Bucket Count Metric

To get the count of files/objects stored in the S3 bucket, you'd need to change the MetricName from BucketSizeBytes to NumberOfObjects and change the storage type to AllStorageTypes as shown in the snippet below.

    MetricName: aws.String("NumberOfObjects"),
    Namespace:  aws.String("AWS/S3"),
    StartTime:  aws.Time(time.Now().Add(-48 * time.Hour)),
    EndTime:    aws.Time(time.Now()),
    Period:     aws.Int64(3600),
    Statistics: []*string{aws.String("Average")},

    Dimensions: []*cloudwatch.Dimension{
        &cloudwatch.Dimension{
            Name:  aws.String("BucketName"),
            Value: aws.String(bucketName),
        },
        &cloudwatch.Dimension{
            Name:  aws.String("StorageType"),
            Value: aws.String("AllStorageTypes"),
        },
    },

Conclusion

I'm using these functions within a Linux command-line utility which gets invoked from a cron job daily. The results from the last thirty days are stored in a database so that I can present the values on my dashboard as a Sparkline chart. A Sparkline is a small line chart, typically drawn without axes or coordinates. It presents the general shape of the variation and lets me quickly see the growth trends.
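The dashboard code itself is outside the scope of this article, but the Sparkline idea is easy to illustrate. The following is only a minimal sketch that approximates a Sparkline in text using Unicode block characters rather than a drawn chart; sparkline is a hypothetical helper and the sample values are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// sparkline renders values as a row of Unicode block characters,
// scaled between the smallest and largest value in the slice.
// Hypothetical helper for illustration only.
func sparkline(values []float64) string {
	if len(values) == 0 {
		return ""
	}
	bars := []rune("▁▂▃▄▅▆▇█")

	// Find the range of the data so it can be scaled to the bars.
	lo, hi := values[0], values[0]
	for _, v := range values {
		if v < lo {
			lo = v
		}
		if v > hi {
			hi = v
		}
	}

	var b strings.Builder
	for _, v := range values {
		idx := 0
		// When all values are equal, every point uses the lowest bar.
		if hi > lo {
			idx = int((v - lo) / (hi - lo) * float64(len(bars)-1))
		}
		b.WriteRune(bars[idx])
	}
	return b.String()
}

func main() {
	// Made-up daily bucket sizes, oldest first.
	fmt.Println(sparkline([]float64{0, 5, 10})) // prints "▁▄█"
}
```

Scaling relative to the minimum and maximum mirrors what a Sparkline does: it shows the shape of the variation rather than absolute values.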
