Meir Gabay
Posted on September 20, 2019
Update - 17-Oct-2019 changelog: Terraform released a new function named cidrsubnets, which creates a list of CIDR subnets. This function is great, and I recommend using it. Even though it shortens some parts of this tutorial, you should still read on if you want to learn how to use functions in Terraform.
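For a taste of what that function does, here is a minimal sketch, assuming a Terraform release that includes cidrsubnets and borrowing the 172.22 development prefix used in this tutorial. The local names are hypothetical.

```hcl
locals {
  # Carve three /24 networks out of a /16 by adding 8 bits per subnet.
  # Numbering starts at zero: 172.22.0.0/24, 172.22.1.0/24, 172.22.2.0/24
  example_subnets = cidrsubnets("172.22.0.0/16", 8, 8, 8)

  # cidrsubnet (singular) picks one specific network number instead.
  # Network number 64 gives 172.22.64.0/24
  example_public_subnet = cidrsubnet("172.22.0.0/16", 8, 64)
}
```

Note that cidrsubnets numbers the networks consecutively from zero, so it does not reproduce the 1/11/64 offsets used in the rest of this tutorial on its own.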
Objectives
- Create sets of subnets dynamically
- Learn advanced concepts in Terraform
- map variable and lookup function
- for loop and conditional for loop
- index function in a for loop
Knowledge and assumptions
- You are familiar with subnetting (a.b.c.d/xx) - a great online tool for calculating subnets - cidr.xyz
- You are using Terraform v0.12+
- You have previous experience with:
- You know what a for loop is
- I'll be using the module terraform-aws-modules/vpc/aws for the VPC. This module requires lists of subnets (private, database, public), which is exactly what we're going to create
Issue #1: Subnets per environment
Hardcoding subnets per environment can be a significant overhead. By doing so, you might end up writing a lot of lines with hardcoded network prefixes.
Moreover, it takes off the flexibility that infrastructure as code (IaC) has to offer.
Set cidr_ab per environment
cidr_ab to the rescue! Set the start of the VPC CIDR as a map variable, and map each cidr_ab value to the appropriate environment. If you're not using all four environments, remove the unused ones from the mapping.
A subnet pattern (prefix) is: a.b.c.d/xx , so this will cover the a.b part of all subnets per environment. For example:
variable "cidr_ab" {
  type = map(string)
  default = {
    development = "172.22"
    qa          = "172.24"
    staging     = "172.26"
    production  = "172.28"
  }
}
Get cidr_ab per environment
Maps and the lookup function provide great functionality, and in our case, they make it easy to get the cidr_ab per environment.
Here's the syntax of the lookup function: lookup(map, key, default). The examples below omit the default argument, so an environment that is missing from the map fails with an error instead of falling back to a default.
To create a local list private_subnets with the relevant cidr_ab per environment, use the following:
Reminder: To concatenate an expression and text: "${expression} my text"
locals {
  private_subnets = [
    "${lookup(var.cidr_ab, var.environment)}.1.0/24",
    "${lookup(var.cidr_ab, var.environment)}.2.0/24",
    "${lookup(var.cidr_ab, var.environment)}.3.0/24"
  ]
}
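Since the same lookup is repeated on every line, one option is to resolve the prefix once and reuse it. This is only a sketch: the local names env_cidr_ab and private_subnets_short are hypothetical, and the index syntax var.cidr_ab[var.environment] is an alternative to lookup that also fails when the key is missing.

```hcl
locals {
  # Resolve the a.b prefix for the selected environment once.
  env_cidr_ab = var.cidr_ab[var.environment]

  # Same three private subnets as above, built from the shared prefix.
  private_subnets_short = [
    "${local.env_cidr_ab}.1.0/24",
    "${local.env_cidr_ab}.2.0/24",
    "${local.env_cidr_ab}.3.0/24",
  ]
}
```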
Keep in mind that we need to do that for database and public subnets as well.
Issue #1: Full Solution
Assuming we want to create the following subnets: private, database, and public. Here's how our variables.tf and vpc.tf files should look:
variables.tf
variable "cidr_ab" {
  type = map(string)
  default = {
    development = "172.22"
    qa          = "172.24"
    staging     = "172.26"
    production  = "172.28"
  }
}

locals {
  private_subnets = [
    "${lookup(var.cidr_ab, var.environment)}.1.0/24",
    "${lookup(var.cidr_ab, var.environment)}.2.0/24",
    "${lookup(var.cidr_ab, var.environment)}.3.0/24"
  ]
  database_subnets = [
    "${lookup(var.cidr_ab, var.environment)}.11.0/24",
    "${lookup(var.cidr_ab, var.environment)}.12.0/24",
    "${lookup(var.cidr_ab, var.environment)}.13.0/24"
  ]
  public_subnets = [
    "${lookup(var.cidr_ab, var.environment)}.64.0/24",
    "${lookup(var.cidr_ab, var.environment)}.65.0/24",
    "${lookup(var.cidr_ab, var.environment)}.66.0/24"
  ]
}

variable "environment" {
  type        = string
  description = "Options: development, qa, staging, production"
}
vpc.tf
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~>2.0"

  name = "my-vpc"
  cidr = "${lookup(var.cidr_ab, var.environment)}.0.0/16"

  private_subnets  = local.private_subnets
  database_subnets = local.database_subnets
  public_subnets   = local.public_subnets

  azs = ["us-west-2a", "us-west-2b", "us-west-2c"]
  # omitted arguments for brevity
}
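Because var.environment is typed as a plain string, a typo only surfaces when the lookup fails. If you are on Terraform v0.13 or later (an assumption, since this tutorial targets v0.12+), a validation block could replace the plain declaration of var.environment and reject unknown values up front. A sketch, with the allowed values copied from the cidr_ab map:

```hcl
variable "environment" {
  type        = string
  description = "Options: development, qa, staging, production"

  validation {
    # Older releases only allow the variable itself to be referenced here,
    # so the allowed names are repeated rather than read from var.cidr_ab.
    condition     = contains(["development", "qa", "staging", "production"], var.environment)
    error_message = "The environment must be one of: development, qa, staging, production."
  }
}
```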
Issue #2: Subnets per Availability Zone
In issue #1, we solved the subnets per environment, but the Availability Zones (azs) were hardcoded.
One might think: "Let's simply create a map variable for the region, and then use the region's name followed by 'a', 'b', 'c'". For example:
variable "region" {
  type = map(string)
  default = {
    "development" = "us-west-2"
    "qa"          = "us-east-2"
    "staging"     = "us-east-1"
    "production"  = "ca-central-1"
  }
}

module "vpc" {
  # omitted arguments for brevity
  azs = [
    "${lookup(var.region, var.environment)}a",
    "${lookup(var.region, var.environment)}b",
    "${lookup(var.region, var.environment)}c",
  ]
}
If all the regions that you're using have the same number of availability zones, then the above solution is perfect. Unfortunately, that's not true when it comes to ca-central-1 (Canada Central), which has only two availability zones. Another example is us-east-1 (Virginia), which has six availability zones.
Note: Use AWS Global Infrastructure to find out the number of availability zones per region, and Regions and Availability Zones to figure out the region's code (e.g., eu-west-1).
data aws_availability_zones
We can get the availability zones of the region currently in use with the data source aws_availability_zones.
Here's how we do it:
data "aws_availability_zones" "available" {
  state = "available"
}
Some explanations regarding the code above:
- data - get existing data/resources available in your account
- aws_availability_zones - gets the list of availability zones in the current region
- available - a name for that data; it's important to pick a name that reflects the meaning of the data
- state = "available" - filters out availability zones that currently experience outages
Let's save the data in a local value. You might be wondering, "Why are we using local values? We also did it with the subnets, why oh why?"
Rule of thumb - if there's any chance you're going to manipulate a variable/value, use a local value. This trick provides granularity to manage local.var_name. It is best to manage variables behind the scenes without touching the infrastructure's code (vpc.tf, rds.tf, s3.tf files). Worst case scenario - we've used a local value even though we haven't manipulated it.
Saving the availability zones names list in a local value:
locals {
  availability_zones = data.aws_availability_zones.available.names
}
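The data source reports the zones of whichever region the AWS provider is configured for, so the per-environment region map from earlier still has a role. A minimal sketch, assuming the var.region map shown above is declared; the output name is hypothetical and is only there for inspecting the result:

```hcl
provider "aws" {
  # Pick the region for the selected environment; the data source follows it.
  region = lookup(var.region, var.environment)
}

output "availability_zones" {
  # Lets you inspect which zones were found for the current region.
  value = local.availability_zones
}
```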
Issue #2: Full solution
variables.tf
variable "cidr_ab" {
  type = map(string)
  default = {
    development = "172.22"
    qa          = "172.24"
    staging     = "172.26"
    production  = "172.28"
  }
}

locals {
  private_subnets = [
    "${lookup(var.cidr_ab, var.environment)}.1.0/24",
    "${lookup(var.cidr_ab, var.environment)}.2.0/24",
    "${lookup(var.cidr_ab, var.environment)}.3.0/24"
  ]
  database_subnets = [
    "${lookup(var.cidr_ab, var.environment)}.11.0/24",
    "${lookup(var.cidr_ab, var.environment)}.12.0/24",
    "${lookup(var.cidr_ab, var.environment)}.13.0/24"
  ]
  public_subnets = [
    "${lookup(var.cidr_ab, var.environment)}.64.0/24",
    "${lookup(var.cidr_ab, var.environment)}.65.0/24",
    "${lookup(var.cidr_ab, var.environment)}.66.0/24"
  ]
}

data "aws_availability_zones" "available" {
  state = "available"
}

locals {
  availability_zones = data.aws_availability_zones.available.names
}

variable "environment" {
  type        = string
  description = "Options: development, qa, staging, production"
}
vpc.tf
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~>2.0"

  name = "my-vpc"
  cidr = "${lookup(var.cidr_ab, var.environment)}.0.0/16"

  private_subnets  = local.private_subnets
  database_subnets = local.database_subnets
  public_subnets   = local.public_subnets

  azs = local.availability_zones
  # omitted arguments for brevity
}
Issue #3: Dynamic cidr_c
That's the part where I'm greedy, and I want to populate cidr_c according to the number of availability zones. So far we've covered the a.b part of a.b.c.d/xx; now we will cover the c part.
For example (pseudo-code):
# define cidr_c per subnet
cidr_c_private_subnets = 1
cidr_c_database_subnets = 11
cidr_c_public_subnets = 64
number_of_azs = 2
# dynamically populate lists of subnets
private_subnets = ["a.b.1.d/xx", "a.b.2.d/xx"]
database_subnets = ["a.b.11.d/xx", "a.b.12.d/xx"]
public_subnets = ["a.b.64.d/xx", "a.b.65.d/xx"]
for loop
Dynamically populating sounds like a loop, and in Terraform we've got the for loop.
Terraform's for loop reminds me of Python's list comprehension: it creates a new list by looping over an existing one.
Syntax of the for loop statement:
[for item in list: do_something_on(item)]
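As a concrete instance of that syntax, here is a small sketch with made-up values; the local names are hypothetical and unrelated to the VPC code.

```hcl
locals {
  letters = ["a", "b", "c"]

  # Apply upper() to every item, producing ["A", "B", "C"].
  upper_letters = [for letter in local.letters : upper(letter)]

  # Build a string per item, producing ["us-west-2a", "us-west-2b", "us-west-2c"].
  example_azs = [for letter in local.letters : "us-west-2${letter}"]
}
```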
Pseudo-code of the loop that we need:
for availability_zone in local.availability_zones:
  "${lookup(var.cidr_ab, var.environment)}.${local.cidr_c_private_subnets + index_of_availability_zone}.0/24"
Some explanations regarding the code above:
- availability_zone - I picked the name of the iterator; it could also be az or any other name
- index_of_availability_zone
  - The first index of a list is zero (0)
  - Assuming local.cidr_c_private_subnets equals one (1) for private_subnets
  - Adding the current index of the availability zone to local.cidr_c_private_subnets results in:
    - 1 + 0
    - 1 + 1
    - 1 + 2
get the index of the availability zone
The index function to the rescue!
Syntax of index function:
index(list_to_search_in, value_to_search_for)
The list we're searching in is local.availability_zones and the value we're looking for is the loop's iterator, availability_zone in our case.
Replacing index_of_availability_zone with index(local.availability_zones, az), where az is a shorter name for the iterator, results in:
locals {
  private_subnets = [
    for az in local.availability_zones :
    "${lookup(var.cidr_ab, var.environment)}.${local.cidr_c_private_subnets + index(local.availability_zones, az)}.0/24"
  ]
}
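A for expression can also hand you the position directly: with two iterator names, the first receives the index and the second the value. The sketch below should produce the same list as the index-based version; private_subnets_by_position is a hypothetical name, and it assumes local.cidr_c_private_subnets and local.availability_zones are defined as in this tutorial.

```hcl
locals {
  # i is the zero-based position of az in the list, so no index() call is needed.
  private_subnets_by_position = [
    for i, az in local.availability_zones :
    "${lookup(var.cidr_ab, var.environment)}.${local.cidr_c_private_subnets + i}.0/24"
  ]
}
```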
Issue #3: Full solution
The iterator's name in each loop is az; I picked a shorter name than availability_zone to make it more readable. Since we use local values, we only need to change the variables.tf file; there's no need to touch the vpc.tf file.
variables.tf
variable "cidr_ab" {
  type = map(string)
  default = {
    development = "172.22"
    qa          = "172.24"
    staging     = "172.26"
    production  = "172.28"
  }
}

locals {
  # First value of the c octet for each type of subnet
  cidr_c_private_subnets  = 1
  cidr_c_database_subnets = 11
  cidr_c_public_subnets   = 64
}

data "aws_availability_zones" "available" {
  state = "available"
}

locals {
  availability_zones = data.aws_availability_zones.available.names
}

variable "environment" {
  type        = string
  description = "Options: development, qa, staging, production"
}

locals {
  # One subnet per availability zone: starting c octet + index of the zone
  private_subnets = [
    for az in local.availability_zones :
    "${lookup(var.cidr_ab, var.environment)}.${local.cidr_c_private_subnets + index(local.availability_zones, az)}.0/24"
  ]
  database_subnets = [
    for az in local.availability_zones :
    "${lookup(var.cidr_ab, var.environment)}.${local.cidr_c_database_subnets + index(local.availability_zones, az)}.0/24"
  ]
  public_subnets = [
    for az in local.availability_zones :
    "${lookup(var.cidr_ab, var.environment)}.${local.cidr_c_public_subnets + index(local.availability_zones, az)}.0/24"
  ]
}
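To make the result concrete, here is what the three locals would evaluate to under two assumptions that are mine rather than the tutorial's: var.environment is "development" and the region reports exactly three availability zones. The output names are hypothetical.

```hcl
output "private_subnets" {
  # Expected under the assumptions above:
  # ["172.22.1.0/24", "172.22.2.0/24", "172.22.3.0/24"]
  value = local.private_subnets
}

output "database_subnets" {
  # ["172.22.11.0/24", "172.22.12.0/24", "172.22.13.0/24"]
  value = local.database_subnets
}

output "public_subnets" {
  # ["172.22.64.0/24", "172.22.65.0/24", "172.22.66.0/24"]
  value = local.public_subnets
}
```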
Issue #4: Limit number of subnets
In solution #3 we populated subnets according to the number of availability zones, which is excellent, but this can lead to unwanted behavior when using the module terraform-aws-modules/vpc/aws.
If you want to have a set of subnets per availability zone, without caring for how many subnets are created per region, you can stop here.
It will be easier to explain with an example:
ca-central-1 (Canada Central) - 2 availability zones, hence 6 subnets
us-west-2 (Oregon) - 4 availability zones, hence 12 subnets
If you wish the number of subnets to be similar across all environments (and regions) - keep on reading. For example:
Let's assume we must use ca-central-1 (Canada Central), and because of that, we are forced to use only two availability zones in each region.
ca-central-1 (Canada Central) - 2 availability zones, hence 6 subnets
us-west-2 (Oregon) - 4 availability zones, forcing maximum of 6 subnets (2 per type of subnet)
In total, we will have six subnets in our VPC (not including the 'default' one that comes with the VPC).
Rule of thumb - We need the smallest number of availability zones found among all the regions we intend to use.
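That rule of thumb is a minimum over the regions in play. A small sketch, where az_count_per_region is a hypothetical map filled with the counts quoted in this section (check the current numbers before relying on them):

```hcl
locals {
  az_count_per_region = {
    "ca-central-1" = 2
    "us-west-2"    = 4
  }

  # The trailing ... expands the list into separate arguments for min().
  # With the counts above this evaluates to 2.
  common_az_count = min(values(local.az_count_per_region)...)
}
```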
Maximum number of subnets
We will now initialize three more local values, one max_<type>_subnets per type of subnet.
variables.tf
locals {
  cidr_c_private_subnets  = 1
  cidr_c_database_subnets = 11
  cidr_c_public_subnets   = 64

  max_private_subnets  = 2
  max_database_subnets = 2
  max_public_subnets   = 2
}
Conditional for loop
Luckily the for loop has a built-in keyword if, which still reminds me of Python's list comprehension :)
Syntax of for loop with condition:
[for item in list: do_something_on(item) if expression]
Our goal is to iterate over the availability zones until we reach the maximum desired number of subnets.
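Before applying it to subnets, here is the condition on its own, as a sketch with made-up numbers; the local names are hypothetical.

```hcl
locals {
  numbers = [1, 2, 3, 4]

  # Keep only the items below 3, then multiply each by ten: produces [10, 20].
  small_times_ten = [for n in local.numbers : n * 10 if n < 3]
}
```

Items that fail the condition are skipped entirely, which is how the list of subnets gets capped.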
Issue #4: Full Solution
And of course, we only need to change the variables.tf file.
variables.tf
variable "cidr_ab" {
  type = map(string)
  default = {
    development = "172.22"
    qa          = "172.24"
    staging     = "172.26"
    production  = "172.28"
  }
}

locals {
  cidr_c_private_subnets  = 1
  cidr_c_database_subnets = 11
  cidr_c_public_subnets   = 64

  max_private_subnets  = 2
  max_database_subnets = 2
  max_public_subnets   = 2
}

data "aws_availability_zones" "available" {
  state = "available"
}

locals {
  availability_zones = data.aws_availability_zones.available.names
}

variable "environment" {
  type        = string
  description = "Options: development, qa, staging, production"
}

locals {
  # The if clause skips zones whose index reaches the maximum for that type
  private_subnets = [
    for az in local.availability_zones :
    "${lookup(var.cidr_ab, var.environment)}.${local.cidr_c_private_subnets + index(local.availability_zones, az)}.0/24"
    if index(local.availability_zones, az) < local.max_private_subnets
  ]
  database_subnets = [
    for az in local.availability_zones :
    "${lookup(var.cidr_ab, var.environment)}.${local.cidr_c_database_subnets + index(local.availability_zones, az)}.0/24"
    if index(local.availability_zones, az) < local.max_database_subnets
  ]
  public_subnets = [
    for az in local.availability_zones :
    "${lookup(var.cidr_ab, var.environment)}.${local.cidr_c_public_subnets + index(local.availability_zones, az)}.0/24"
    if index(local.availability_zones, az) < local.max_public_subnets
  ]
}
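An alternative to filtering with if is to trim the list of zones before looping. This is a sketch rather than the approach used above; limited_azs and private_subnets_sliced are hypothetical names, and min() guards against asking slice() for more items than the region has.

```hcl
locals {
  # First max_private_subnets zones, or all of them if the region has fewer.
  limited_azs = slice(
    local.availability_zones,
    0,
    min(length(local.availability_zones), local.max_private_subnets)
  )

  # Two iterator names: i is the position, az is the zone name.
  private_subnets_sliced = [
    for i, az in local.limited_azs :
    "${lookup(var.cidr_ab, var.environment)}.${local.cidr_c_private_subnets + i}.0/24"
  ]
}
```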
Summary
- Some solution architects might prefer hard-coding the subnet prefixes in the vpc.tf file, which in most cases will work fine. The fact that I only had to change the variables.tf file each time is priceless
- The logic behind the order of variables in the variables.tf file:
  - variables that should be modified, such as:
    - region per environment (map)
    - local cidr_c and max_subnets for each type of subnet
  - static variables/data, such as:
    - data "aws_availability_zones" "available"
    - local.availability_zones
    - local.subnet_list for each type of subnet
- I find that using local values instead of variables is a good approach, and it provides the flexibility I need when designing the infrastructure
  - The only case where we need variables is when we want to prompt for values, for example: var.environment
  - Always use local values; they provide the ability to use functions, unlike variables
  - I used var.environment and var.cidr_ab, but I should've assigned them to local values. I used variables because I wanted to make this tutorial versatile. Remember - local values are the best
- AWS keeps adding availability zones (AZs), so the number of availability zones per region in the examples might be outdated. Be sure to check the current number of AZs
Did you like this tutorial? Clap/heart/unicorn and share it with your friends and colleagues.
Didn't you like it? Let me know which parts and I'll take care of it.
My next article will be about Terraform Cloud and Terraform's Workspaces.
Originally published at https://www.prodops.io.