Testing in Infrastructure as Code and why Terraform may not be the best option

mraszplewicz

Maciej Raszplewicz

Posted on January 7, 2021

You don’t need any special tool to automatically test your IaC code; any programming language and unit testing framework you like will do.

You should test your Infrastructure as Code because it is code - that is probably obvious. Most likely, you want to do it automatically. Here I will share our approach to testing IaC and how it drove our technology decisions for the DevOpsBox platform (https://www.devopsbox.io/).

The Test Pyramid

Let’s start with some theory. There is an old concept of the test pyramid:

[Figure: the test pyramid]

There are often different names on the diagram but the concept is:

  • You should have a lot of fast robust tests, which are often unit tests. It is important they execute in milliseconds and do not depend on any external services.
  • There should be a moderate number of integration tests. These tests often depend on external services, but do not test the whole system. They are much slower and more fragile than unit tests, typically run in seconds or minutes, and can sometimes fail, e.g. because of network problems.
  • You should only have a few tests of your whole system. They are really slow and fragile. It is sometimes very hard to convince management and the business of that, but this is the reality.

It is important to note that nothing checks how your code works as quickly as unit tests do. You should write them for yourself to check your own code. However, there are some misconceptions about what a unit is and how to write good tests. In my opinion, you should always think about what you want to achieve with your code and test that behavior - sometimes you will test a single function/method, sometimes something bigger. You shouldn’t call external dependencies in your unit tests, and you should write the tests so that they do not have to change after a refactoring. Ports and adapters architecture can help with that. If you have good unit tests for a part of your application, you can assume that it works well, just as you often assume that some external program (e.g. AWS CLI) works. Then you don’t have to test every variant in your integration or system tests.
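As a rough illustration of the ports and adapters idea (this is not code from the example repository), the Go sketch below hides bucket storage behind a small interface and tests the behavior against an in-memory fake. The names BucketStore, fakeBucketStore and ensureBucket are hypothetical and exist only for this example; a real adapter would call the cloud provider instead of a map.

```go
package main

import "errors"

// BucketStore is the "port": the only view of storage that the logic under
// test is allowed to have. (Hypothetical interface, for illustration only.)
type BucketStore interface {
	Exists(name string) (bool, error)
	Create(name string) error
}

// fakeBucketStore is an in-memory "adapter" meant for unit tests.
// It never touches the network, so tests using it run in milliseconds.
type fakeBucketStore struct {
	buckets map[string]bool
}

func newFakeBucketStore() *fakeBucketStore {
	return &fakeBucketStore{buckets: map[string]bool{}}
}

func (f *fakeBucketStore) Exists(name string) (bool, error) {
	return f.buckets[name], nil
}

func (f *fakeBucketStore) Create(name string) error {
	if f.buckets[name] {
		return errors.New("bucket already exists: " + name)
	}
	f.buckets[name] = true
	return nil
}

// ensureBucket is the behavior we care about: create the bucket only when it
// is missing. It reports whether a bucket was actually created.
func ensureBucket(store BucketStore, name string) (bool, error) {
	exists, err := store.Exists(name)
	if err != nil {
		return false, err
	}
	if exists {
		return false, nil
	}
	if err := store.Create(name); err != nil {
		return false, err
	}
	return true, nil
}
```

A unit test would call ensureBucket(newFakeBucketStore(), "acme-dev-orders-pictures") twice and expect the first call to create the bucket and the second to be a no-op. Because the test only talks to the port, swapping the real adapter or refactoring its internals should not force the test to change.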

Infrastructure as Code tests examples

Often, when we talk about Infrastructure as Code tools, Terraform comes to mind. It is great for maintaining your infrastructure state and talking to your cloud provider’s API, but you have to write code in HCL, which is not a real programming language. Is that bad? Sometimes yes, especially when it comes to writing unit tests. Let’s check how we can test Terraform code, and what we can use instead to get real unit tests!

Our example code under test will create an S3 bucket maintaining naming conventions:

  • The bucket name will follow the template <company name>-<env name>-<app name>-<bucket purpose> (e.g. acme-dev-orders-pictures) if it does not exceed 63 characters (the maximum length of an S3 bucket name).
  • If it does exceed 63 characters, the bucket name will be a hash of that name instead: the first 63 characters of its SHA-256 hex digest.
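Before looking at the Terraform and CDK versions, the naming rule itself can be sketched as a small pure function. This is a minimal sketch in Go, assuming the same rule as described above; the function name bucketName and the constant maxBucketNameLength are made up for this example and do not appear in the example repository.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"strings"
)

// maxBucketNameLength is the S3 bucket name limit used by the convention.
const maxBucketNameLength = 63

// bucketName joins the four parts with dashes. If the result is too long for
// S3, it falls back to the first 63 characters of the SHA-256 hex digest of
// the requested name. (Hypothetical helper, for illustration only.)
func bucketName(company, env, app, purpose string) string {
	requested := strings.Join([]string{company, env, app, purpose}, "-")
	if len(requested) <= maxBucketNameLength {
		return requested
	}
	sum := sha256.Sum256([]byte(requested))
	// The hex digest is 64 characters long, so the slice is always in range.
	return hex.EncodeToString(sum[:])[:maxBucketNameLength]
}
```

Calling bucketName("acme", "dev", "orders", "pictures") returns acme-dev-orders-pictures, while a purpose long enough to push the name past 63 characters yields a 63-character hexadecimal string. Because the function has no external dependencies, it is exactly the kind of logic that a unit test can cover in milliseconds.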

Prerequisites

You will need several tools:

  • Go (tested with 1.15.3)
  • Terraform (tested with 0.14.3)
  • AWS CDK (tested with 1.71.0)
  • Java JDK (tested with openjdk 11.0.9)
  • NodeJS (required by AWS CDK, tested with v12.18.3)
  • Docker (tested with 18.09.5)
  • cdklocal (https://github.com/localstack/aws-cdk-local, tested with 1.65.2)

The code will probably work with other versions of these tools too. An AWS account with proper credentials is also required.

Terratest

We will test this Terraform code https://github.com/devopsbox-io/example-iac-test/blob/master/terraform/s3/main.tf

terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = var.aws_region
}

locals {
  requested_bucket_name = "${var.company_name}-${var.env_name}-${var.app_name}-${var.bucket_purpose}"
  bucket_name = length(local.requested_bucket_name) > 63 ? substr(sha256(local.requested_bucket_name), 0, 63) : local.requested_bucket_name
}

resource "aws_s3_bucket" "bucket" {
  bucket = local.bucket_name
}

and variables https://github.com/devopsbox-io/example-iac-test/blob/master/terraform/s3/variables.tf

variable "aws_region" {
  type = string
}

variable "company_name" {
  type = string
}

variable "env_name" {
  type = string
}

variable "app_name" {
  type = string
}

variable "bucket_purpose" {
  type = string
}

Nothing special here, just the implementation of our example in Terraform.

Tests in Terratest are quite easy to write https://github.com/devopsbox-io/example-iac-test/blob/master/test/s3_module_test.go

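// The package name is an assumption; use whatever your test directory uses.
package test

import (
    "crypto/sha256"
    "encoding/hex"
    "os"
    "strings"
    "testing"

    "github.com/gruntwork-io/terratest/modules/aws"
    "github.com/gruntwork-io/terratest/modules/files"
    "github.com/gruntwork-io/terratest/modules/random"
    "github.com/gruntwork-io/terratest/modules/terraform"
)

// TestS3BucketCreated applies the module once per test case and checks that
// a bucket with the expected name exists.
func TestS3BucketCreated(t *testing.T) {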
    t.Parallel()

    envName := strings.ToLower(random.UniqueId())
    awsRegion := "eu-west-1"

    tests := map[string]struct {
        terraformVariables map[string]interface{}
        expectedBucketName string
    }{
        "short name": {
            terraformVariables: map[string]interface{}{
                "aws_region":     awsRegion,
                "company_name":   "acme",
                "env_name":       envName,
                "app_name":       "orders",
                "bucket_purpose": "pictures",
            },
            expectedBucketName: "acme-" + envName + "-orders-pictures",
        },
        "long name": {
            terraformVariables: map[string]interface{}{
                "aws_region":     awsRegion,
                "company_name":   "acme",
                "env_name":       envName,
                "app_name":       "orders",
                "bucket_purpose": "pictures12345678901234567890123456789012345678901234567890",
            },
            expectedBucketName: sha256String("acme-" + envName + "-orders-pictures12345678901234567890123456789012345678901234567890")[:63],
        },
    }

    for name, testCase := range tests {
        // capture range variables
        name := name
        testCase := testCase
        t.Run(name, func(t *testing.T) {
            t.Parallel()

            terraformModuleDir, err := files.CopyTerraformFolderToTemp("../terraform/s3", "terratest-")
            if err != nil {
                t.Fatalf("Error while creating temp dir %v", err)
            }
            defer os.RemoveAll(terraformModuleDir)

            terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
                TerraformDir: terraformModuleDir,
                Vars:         testCase.terraformVariables,
            })

            defer terraform.Destroy(t, terraformOptions)

            terraform.InitAndApply(t, terraformOptions)

            aws.AssertS3BucketExists(t, awsRegion, testCase.expectedBucketName)
        })
    }
}

func sha256String(str string) string {
    sha256Bytes := sha256.Sum256([]byte(str))
    return hex.EncodeToString(sha256Bytes[:])
}

I have used a pattern called data-driven (or table-driven) tests (read more here https://dave.cheney.net/2019/05/07/prefer-table-driven-tests). Tests are executed in parallel, which is quite nice. To make that work, you have to copy your module using CopyTerraformFolderToTemp, so that each case works in its own directory, and reassign the range variables (the two lines after the // capture range variables comment).
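The reason for reassigning the range variables is that, before Go 1.22, a loop variable was shared by all iterations, and a parallel subtest only resumes after the loop has finished, so every closure could end up seeing the last test case. The sketch below shows the same capture pattern with plain goroutines instead of t.Run and t.Parallel; the function name runAll is hypothetical.

```go
package main

import (
	"sort"
	"sync"
)

// runAll starts one goroutine per case and records which case each goroutine
// saw. The closures run concurrently with the loop, much like parallel
// subtests, so each one needs its own copy of the loop variables.
func runAll(cases map[string]string) []string {
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		seen []string
	)
	for name, expected := range cases {
		// capture range variables (required before Go 1.22, harmless after)
		name := name
		expected := expected

		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			defer mu.Unlock()
			seen = append(seen, name+"="+expected)
		}()
	}
	wg.Wait()
	// Sort so that the result does not depend on goroutine scheduling.
	sort.Strings(seen)
	return seen
}
```

With the two copies in place, every goroutine reports its own case regardless of the Go version. Without them, on the Go version used in this article, several goroutines could report the same case.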

Execute this code with:

cd test
go test

There are a few problems here:

  • These are not unit tests. They need to create your infrastructure, so they are slow and fragile.
  • Look at this simple algorithm in HCL and think of something a little bit more complicated - it can be really hard to implement, read, and maintain.
  • You have to use Golang - it is no longer an issue for us, but it used to be.

You can’t use Terratest assertions with LocalStack, although there is an open pull request https://github.com/gruntwork-io/terratest/pull/495 that addresses this. Of course, you could write your own assertions. We will do that later in a different language, so let’s skip this for now and move on.
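For readers who would rather stay in Go, a hand-written assertion does not need much code. The sketch below is not part of Terratest or of the example repository: it assumes that the S3 endpoint answers a path-style HEAD request on /<bucket> with 200 when the bucket exists and 404 when it does not, and that LocalStack accepts such a request without signed credentials. The function names bucketExistsFromStatus and bucketExists are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
)

// bucketExistsFromStatus interprets the HTTP status of a HEAD request on a
// bucket. Keeping this decision in a pure function makes it unit-testable
// without any running S3 endpoint.
func bucketExistsFromStatus(status int) (bool, error) {
	switch status {
	case http.StatusOK:
		return true, nil
	case http.StatusNotFound:
		return false, nil
	default:
		return false, fmt.Errorf("unexpected status %d while checking bucket", status)
	}
}

// bucketExists sends a path-style HEAD request to the given endpoint, e.g.
// "http://localhost:4566" for LocalStack. Assumption: the endpoint accepts
// unsigned requests, which is not true for real AWS.
func bucketExists(client *http.Client, endpoint, bucket string) (bool, error) {
	resp, err := client.Head(endpoint + "/" + bucket)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return bucketExistsFromStatus(resp.StatusCode)
}
```

In a test this could be called as bucketExists(http.DefaultClient, "http://localhost:4566", expectedBucketName), failing the test when it returns false or an error. Against real AWS you would need signed requests, so an SDK client would be the better choice there.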

Test Terraform with custom “miniframework” based on Spock

We didn’t want to write tests in Golang, so we decided to check whether it is possible to use some other language. One of my favorite testing frameworks is Spock - it is JVM based and you write your tests in Groovy. We tried it and it turns out that it works very well! You “only” have to write some glue code, which is not that hard to do.

The Terraform code is almost the same, but to support LocalStack we had to change the AWS provider configuration https://github.com/devopsbox-io/example-iac-test/blob/master/terraform/s3/main.tf

provider "aws" {
  region = var.aws_region

  access_key = var.use_localstack ? "fake_access_key" : null
  s3_force_path_style = var.use_localstack
  secret_key = var.use_localstack ? "fake_secret_key" : null
  skip_credentials_validation = var.use_localstack
  skip_metadata_api_check = var.use_localstack
  skip_requesting_account_id = var.use_localstack

  dynamic "endpoints" {
    for_each = var.use_localstack ? [1] : []
    content {
      s3 = "http://localhost:4566"
    }
  }
}

and add a single additional variable:

variable "use_localstack" {
  type = bool
  default = false
}

The test looks like this https://github.com/devopsbox-io/example-iac-test/blob/master/src/integTest/groovy/io/devopsbox/infrastructure/test/s3/S3TerraformModuleTest.groovy

class S3TerraformModuleTest extends TerraformIntegrationTest {

    @Shared
    S3 s3

    def setupSpec() {
        s3 = new S3(sdkClients)
    }

    @Unroll
    def "should create s3 bucket #testCase"() {
        given:
        def terraformVariables = new S3TerraformModuleVariables(
                useLocalstack: localstack.enabled,
                awsRegion: awsRegion(),
                companyName: "acme",
                envName: environmentName(),
                appName: "orders",
                bucketPurpose: bucketPurpose,
        )

        when:
        deployTerraformModule("terraform/s3", terraformVariables)

        then:
        s3.checkBucketExists(expectedBucketName)

        cleanup:
        destroyTerraformModule("terraform/s3", terraformVariables)

        where:
        testCase     | bucketPurpose                                                | expectedBucketName
        "short name" | "pictures"                                                   | "acme-" + environmentName() + "-orders-pictures"
        "long name"  | "pictures12345678901234567890123456789012345678901234567890" | DigestUtils.sha256Hex("acme-" + environmentName() + "-orders-pictures12345678901234567890123456789012345678901234567890").substring(0, 63)
    }

    class S3TerraformModuleVariables extends TerraformVariables {
        boolean useLocalstack
        String awsRegion
        String companyName
        String envName
        String appName
        String bucketPurpose
    }
}

The code is quite nice. Notice the S3TerraformModuleVariables class - it is not a bad way to pass input variables to our Terraform stack. One of the problems is the lack of parallelism support (it will be supported in Spock 2.0 http://spockframework.org/spock/docs/2.0-M4/parallel_execution.html#parallel-execution). We already have LocalStack support here, so long-running tests should be faster because you don’t have to wait for the real infrastructure. However, I think that you should also run your tests against the real cloud - sometimes LocalStack behaves differently, or it may not support a cloud resource you need.

To execute this code run:

./gradlew integTest --tests *S3TerraformModuleTest

or with LocalStack:

./gradlew integTest --tests *S3TerraformModuleTest -Dlocalstack.enabled=true

Are there any problems here? Yes:

  • Still no unit tests
  • Still not a real programming language

The “miniframework” glue code

Have I mentioned the glue code? There is some, but really not that much. I will only list the files and describe them, rather than paste all of the code here:

It is a mix of Groovy and Java. You can copy this code, use it in your project or maybe even rewrite it into another language. If you think we should create a real framework and provide a jar - please let me know, we will do our best.

As you can see - you don’t need any special tool to test your Infrastructure as Code, only your favorite language, unit testing framework, and a few hours to write some glue code.

Test AWS CDK with custom “miniframework” based on Spock

We really wanted to write our code in a general-purpose programming language and to be able to unit test it. After some research, we found two frameworks:

  • AWS CDK
  • Pulumi

Only AWS CDK supported Java, so it was our choice. I do have to say a few words about Pulumi, though: it is really cool. I used it with TypeScript in one of my projects and I was impressed. Coming back to AWS CDK - when we started to write the code, CDK was in its early stages and AWS changed its API very often, but that is a completely different story…

Here is the AWS CDK code for which we will write tests https://github.com/devopsbox-io/example-iac-test/blob/master/src/main/java/io/devopsbox/infrastructure/test/s3/S3Construct.java

public class S3Construct extends Construct {

    public S3Construct(Construct scope, String id, S3ConstructProps props) {
        super(scope, id);

        String bucketName = props.getBucketName();
        new Bucket(this, bucketName, BucketProps.builder()
                .removalPolicy(RemovalPolicy.DESTROY)
                .bucketName(bucketName)
                .build());
    }
}

We moved our bucket naming logic to another class https://github.com/devopsbox-io/example-iac-test/blob/master/src/main/java/io/devopsbox/infrastructure/test/s3/S3ConstructProps.java

public class S3ConstructProps extends ConstructProps {
    public static final int BUCKET_NAME_MAX_LENGTH = 63;

    private final String bucketPurpose;

    public S3ConstructProps(String companyName, String envName, String appName, String bucketPurpose) {
        super(companyName, envName, appName);
        this.bucketPurpose = bucketPurpose;
    }

    public String getBucketPurpose() {
        return bucketPurpose;
    }

    public String getBucketName() {
        String bucketName = getCompanyName() + "-" + getEnvName() + "-" + getAppName() + "-" + getBucketPurpose();
        if (bucketName.length() > BUCKET_NAME_MAX_LENGTH) {
            bucketName = DigestUtils.sha256Hex(bucketName).substring(0, BUCKET_NAME_MAX_LENGTH);
        }
        return bucketName;
    }
}

There is also a base class for all construct property classes with a set of common properties https://github.com/devopsbox-io/example-iac-test/blob/master/src/main/java/io/devopsbox/infrastructure/test/ConstructProps.java

public class ConstructProps implements Serializable {
    private final String companyName;
    private final String envName;
    private final String appName;

    public ConstructProps(String companyName, String envName, String appName) {
        this.companyName = companyName;
        this.envName = envName;
        this.appName = appName;
    }

    public String getCompanyName() {
        return companyName;
    }

    public String getEnvName() {
        return envName;
    }

    public String getAppName() {
        return appName;
    }
}

And two standard CDK classes:

The test is similar to the one written for Terraform https://github.com/devopsbox-io/example-iac-test/blob/master/src/integTest/groovy/io/devopsbox/infrastructure/test/s3/S3CdkConstructTest.groovy

class S3CdkConstructTest extends CdkIntegrationTest {

    @Shared
    S3 s3

    def setupSpec() {
        s3 = new S3(sdkClients)
    }

    def "should create s3 bucket"() {
        given:
        def stackId = "S3BucketConstructTest" + environmentName()
        def constructProps = new S3ConstructProps(
                "acme",
                environmentName(),
                "orders",
                "pictures"
        )

        when:
        deployCdkConstruct(stackId, S3Construct, constructProps)

        then:
        s3.checkBucketExists("acme-" + environmentName() + "-orders-pictures")

        cleanup:
        destroyCdkConstruct(stackId, S3Construct, constructProps)
    }
}

To execute this code run:

./gradlew integTest --tests *S3CdkConstructTest

or with LocalStack:

./gradlew integTest --tests *S3CdkConstructTest -Dlocalstack.enabled=true

We are not testing all the cases here, just a single integration test, because we can finally write unit tests! The code looks like this https://github.com/devopsbox-io/example-iac-test/blob/master/src/test/groovy/io/devopsbox/infrastructure/test/s3/S3ConstructPropsTest.groovy

class S3ConstructPropsTest extends Specification {

    @Unroll
    def "should return s3 bucket #testCase"() {
        given:
        def props = new S3ConstructProps(
                "acme",
                "dev",
                "orders",
                bucketPurpose,
        )

        when:
        def bucketName = props.bucketName

        then:
        bucketName == expectedBucketName

        where:
        testCase     | bucketPurpose                                                | expectedBucketName
        "short name" | "pictures"                                                   | "acme-dev-orders-pictures"
        "long name"  | "pictures12345678901234567890123456789012345678901234567890" | DigestUtils.sha256Hex("acme-dev-orders-pictures12345678901234567890123456789012345678901234567890").substring(0, 63)
    }
}

Just run it with ./gradlew test and it completes in milliseconds!

Finally, we can write the code in a real programming language of our choice and create unit tests. Is it perfect? Certainly not. Our integration tests run in a different process, so there are some drawbacks. There is support for running in the same process in Pulumi (https://github.com/pulumi/pulumi/issues/3901) but not in AWS CDK yet (https://github.com/aws/aws-cdk/issues/601). We can also improve our “miniframework” and run tests using a chosen IAM role - we do that in DevOpsBox already, but it is not included here for the sake of simplicity.

More “miniframework” glue code…

We have to add a few files to our “miniframework” to support AWS CDK:

That is a little less than for Terraform, because we can reuse some of the classes written before.

Conclusion

The ability to write unit tests was one of the key factors behind choosing AWS CDK as our Infrastructure as Code tool. After choosing it, we found it very convenient to write IaC code in a programming language of our choice: we get good code structure, refactoring, design patterns, great IDE support, external libraries and much more. It’s great that nowadays we can write IaC in a general-purpose programming language and it is still declarative.

I hope that the terraform-cdk project (https://github.com/hashicorp/terraform-cdk) will be usable soon and maintained in the future. Then we will be able to have our cake and eat it too!

For more details about the DevOpsBox platform please visit https://www.devopsbox.io/
