Cloud automation with Ansible series: Dynamic inventory
XLAB Steampunk
Posted on May 12, 2021
WORDS BY Sašo Stanovnik
If you have any decent amount of servers in the cloud, you want to use automation to create and manage them. Even with few servers, having your infrastructure specified as code serves the much-needed dual purpose of documenting exactly what is necessary for your setup and giving you the ability to deploy and redeploy your stack quickly and reliably.
Ansible is a great tool for this! You can provision your infrastructure and then deploy applications onto it with a single tool, at the same time. In a previous post, we showed how to provision your infrastructure on AWS using Ansible. In this post, the first in the Cloud automation with Ansible series, we’ll be looking at the glue between provisioning and deployment. That is Ansible’s dynamic inventory.
What is a dynamic inventory?
There are generally two steps of deploying applications: provisioning whatever resources the application needs and then deploying and configuring the application on top of them. What binds the two steps together is an enumeration of resources, which in Ansible is called an inventory.
The inventory in Ansible can be dynamic, which means Ansible itself figures out what resources (servers) exist at runtime. This is in contrast to having a static inventory, sometimes called a local state, which is a single source of truth for everything infrastructure-related: if it isn’t there, it’s not real.
Straight off the bat, this tingles one of our arachno-senses. What if something changes out of our control? Well, we could argue no such changes should be allowed, since there should be a clear process of creating infrastructure resources. However, the real world is seldom this friendly and unexpected things happen all the time.
Let’s look at how a dynamic inventory can be just as practical as a static inventory, if not more so. With the usual caveat about static analysis not being available, of course.
In this blog post, we’re going to look at how we parse a set of machines on DigitalOcean and how we can use dynamic inventory configuration to select a subset of machines for deployment. We’ll be going through everything from scratch, so feel free to follow along to play around with different parameters!
What you’ll need to start
We’ll first install ansible-core into a new virtual environment to keep things clean. Then, we’ll install the community.digitalocean collection so we have access to its content. Finally, we export the DO_API_TOKEN environment variable so the playbooks can authenticate against the API and we’re not hardcoding secrets into our playbooks.
# this makes things easy to clean up
mkdir steampunk-trials && cd steampunk-trials/
python3 -m venv .venv && source .venv/bin/activate
pip install -U pip wheel
pip install ansible-core==2.11.0
ansible-galaxy collection install community.digitalocean
export DO_API_TOKEN=<YOUR API TOKEN>
With this, we’re ready to start!
Creating some machines
In order to do anything remotely useful with inventories, let’s create two droplets. We’ll be doing this through a playbook, of course. The Ansible way.
- hosts: localhost
  vars:
    your_pubkey: YOUR_PUBKEY
  tasks:
    # Register the public key so it can be injected into the droplets.
    - community.digitalocean.digital_ocean_sshkey:
        name: stempunk-pubkey
        state: present
        ssh_pub_key: "{{ your_pubkey }}"
      register: pubkey
    # Create one droplet per loop item; only the name and tags differ.
    - community.digitalocean.digital_ocean_droplet:
        name: "steampunk-{{ item.type }}-{{ item.index }}"
        tags: "{{ item.tags }}"
        unique_name: true
        ssh_keys:
          - "{{ pubkey.data.ssh_key.fingerprint }}"
        size: s-1vcpu-1gb
        region: fra1
        image: centos-8-x64
        state: present
      loop:
        - type: appserver
          index: 1
          tags:
            - steampunk-test
            - app
        - type: dbserver
          index: 1
          tags:
            - steampunk-test
            - db
There are two tasks: the first registers our public SSH key and the second loops over two items, creating two otherwise identical virtual machines. Since this is a proof of concept for an inventory, we won’t be running anything on them, so we don’t need to do anything special; we only need the machines to be accessible. Apart from the names, the only difference between the two machines is the tags. We pretend the first one is an application server and the second is a database server. This way we can show off grouping functionality later. We also tag both with a “project name” so they’re easier to identify and delete later. Run it using the usual:
$ ansible-playbook playbook.yml
Finally getting an inventory
To use a dynamic inventory in Ansible, there needs to be an inventory plugin written for the provider. Fortunately, the community DigitalOcean collection has the community.digitalocean.digitalocean inventory plugin!
We need to use a configuration file to tell Ansible both what to use and how to use it. Here is the configuration file we’ll be using, named digitalocean.yml.
plugin: community.digitalocean.digitalocean
The contents are quite simple. The first line, specifying the plugin, instructs Ansible what inventory plugin to load and use. All subsequent lines (we’ll add those later) are DigitalOcean-specific plugin options. For now, we’ll be leaving everything at its default settings.
We don’t need to add the API token here explicitly, as we’ve exported the DO_API_TOKEN variable, which the inventory plugin is able to grab and use.
Inventory configuration files choose inventory plugins using a combination of the filename and the plugin variable. For the community.digitalocean.digitalocean inventory plugin, the configuration file name must end with (do_hosts|digitalocean|digital_ocean).(yaml|yml).
To get an inventory without running any playbooks, we use ansible-inventory.
$ ansible-inventory -i digitalocean.yml --graph --vars
@all:
|--@ungrouped:
| |--steampunk-appserver-1
| | |--{do_id = 243276125}
| | |--{do_name = steampunk-appserver-1}
| | |--{do_networks = {'v4': [{'ip_add [...] ': []}}
| | |--{do_region = {'name': 'Frankfur [...] 6gb']}}
| | |--{do_size_slug = s-1vcpu-1gb}
| |--steampunk-dbserver-1
| | |--{do_id = 249985641}
| | |--{do_name = steampunk-dbserver-1}
| | |--{do_networks = {'v4': [{'ip_add [...] ': []}}
| | |--{do_region = {'name': 'Frankfur [...] 6gb']}}
| | |--{do_size_slug = s-1vcpu-1gb}
The --graph and --vars flags are there to create an inventory graph instead of a simple list and to display host variables alongside the hosts. Because the account we’re using contains only the two droplets, we only get the two hosts in the output.
You can see the tags attached to the hosts, as well as their names. A very important thing to note here is that this inventory is nearly unusable as it is. Why? It’s because we have no way of connecting to the machines! We’re missing an ansible_host variable that would instruct Ansible on where the machine actually lives.
This is a bit of a complication at first glance, but we think this is perfect for making things explicit. Our example is simple in that we have two machines with two external IP definitions, but this is not always the case, and Ansible (or, well, the collection developers) makes you decide how you are going to connect to the machines yourself. Imagine these two machines having no externally accessible IP address, but you have a VPN set up to the cloud network, so you could just use their private addresses directly. The inventory plugin has no way of knowing those details. Similarly, for executing via jump hosts or directly in a cloud machine, you decide how to connect.
Connecting to the machines
So what we need to do is instruct the inventory plugin to set ansible_host to the external IPs of the droplets we have created. The user to connect as can also be specified, along with anything else you would like. For flexibility, the compose section is rather verbose. We also limit the attributes from the defaults somewhat so our printouts are a bit more readable.
plugin: community.digitalocean.digitalocean
attributes:
  - id
  - name
  - tags
  - networks
compose:
  ansible_host: do_networks.v4 | selectattr('type','eq','public') | map(attribute='ip_address') | first
  ansible_user: "'centos'"
  ansible_ssh_common_args: "'-o StrictHostKeyChecking=no'"
The compose key adds variables to each host. Key names are variable names, while their values are Jinja2 expressions. Here, we use multiple variables and filters to, in sequence,
- select IPv4 address definitions,
- filter them by only including those whose type is public,
- extract the actual IPv4 address out of the complete definition and lastly
- select the first address out of the bunch.
You can see how flexible this is: you can fit it to anything you want. If you need any other variables, you can set them in the compose section in the same way. We’re doing just that with ansible_ssh_common_args to disable host key checking. This should not be done for real workloads, but as we’re only playing around, it’s completely fine.
Let’s try connecting.
$ ansible -m ping -i digitalocean.yml all
steampunk-dbserver-1 | SUCCESS => {
"ansible_facts": {
"discovered_interpreter_python": "/usr/libexec/platform-python"
},
"changed": false,
"ping": "pong"
}
steampunk-appserver-1 | SUCCESS => {
"ansible_facts": {
"discovered_interpreter_python": "/usr/libexec/platform-python"
},
"changed": false,
"ping": "pong"
}
It works!
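The same compose mechanism also covers the VPN scenario mentioned earlier. The following is only a sketch: it assumes each droplet has a private IPv4 entry in do_networks.v4 (DigitalOcean reports these with the type private) and that the machine running Ansible can reach that network. We have not run it against the droplets created above.

```yaml
plugin: community.digitalocean.digitalocean
attributes:
  - id
  - name
  - tags
  - networks
compose:
  # Assumption: every droplet has a private address reachable over the VPN.
  # Only the type filter differs from the public-address configuration.
  ansible_host: do_networks.v4 | selectattr('type','eq','private') | map(attribute='ip_address') | first
  ansible_user: "'centos'"
```

Nothing else in the workflow changes; the decision about how to reach the machines stays in this one file.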
Remember how we’ve deployed an “application” and “database” server? We can make Ansible group these servers up for us based on the tags.
plugin: community.digitalocean.digitalocean
attributes:
  - id
  - name
  - tags
  - networks
keyed_groups:
  - key: do_tags | lower
    prefix: ""
    separator: ""
compose:
  ansible_host: do_networks.v4 | selectattr('type','eq','public') | map(attribute='ip_address') | first
  ansible_user: "'centos'"
  ansible_ssh_common_args: "'-o StrictHostKeyChecking=no'"
Running ansible-inventory now produces a slightly different output, where the hosts are grouped.
$ ansible-inventory -i digitalocean.yml --graph --vars
@all:
|--@app:
| |--steampunk-appserver-1
| | |--{ansible_host = 207.154.233.128}
| | |--{ansible_ssh_common_args = -o StrictHostKeyChecking=no}
| | |--{do_id = 243296785}
| | |--{do_name = steampunk-appserver-1}
| | |--{do_networks = {'v4': [{'ip_add [...] ': []}}
| | |--{do_tags = ['steampunk-test', 'app']}
|--@db:
| |--steampunk-dbserver-1
| | |--{ansible_host = 46.101.215.219}
| | |--{ansible_ssh_common_args = -o StrictHostKeyChecking=no}
| | |--{do_id = 243296841}
| | |--{do_name = steampunk-dbserver-1}
| | |--{do_networks = {'v4': [{'ip_add [...] ': []}}
| | |--{do_tags = ['steampunk-test', 'db']}
|--@steampunk_test:
| |--steampunk-appserver-1
| | |--{ansible_host = 207.154.233.128}
| | |--{ansible_ssh_common_args = -o StrictHostKeyChecking=no}
| | |--{do_id = 243296785}
| | |--{do_name = steampunk-appserver-1}
| | |--{do_networks = {'v4': [{'ip_add [...] ': []}}
| | |--{do_tags = ['steampunk-test', 'app']}
| |--steampunk-dbserver-1
| | |--{ansible_host = 46.101.215.219}
| | |--{ansible_ssh_common_args = -o StrictHostKeyChecking=no}
| | |--{do_id = 243296841}
| | |--{do_name = steampunk-dbserver-1}
| | |--{do_networks = {'v4': [{'ip_add [...] ': []}}
| | |--{do_tags = ['steampunk-test', 'db']}
|--@ungrouped:
Awesome! We can now execute Ansible commands, or playbooks, on a subset of hosts, defined completely by their tags. Let’s ping only the application servers. We’re using a simple ping because the point here is the inventory, not deploying applications.
$ ansible -m ping -i digitalocean.yml app
steampunk-appserver-1 | SUCCESS => {
"ansible_facts": {
"discovered_interpreter_python": "/usr/libexec/platform-python"
},
"changed": false,
"ping": "pong"
}
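Tags do not have to map one-to-one onto groups. Inventory plugins built on Ansible’s constructed features, which is where compose and keyed_groups come from, generally also accept a groups option, where each key is a group name and each value is a Jinja2 condition evaluated per host. A minimal sketch, assuming the DigitalOcean plugin accepts groups in the same way; the group name project_app_servers is made up for illustration.

```yaml
plugin: community.digitalocean.digitalocean
attributes:
  - id
  - name
  - tags
  - networks
keyed_groups:
  - key: do_tags | lower
    prefix: ""
    separator: ""
groups:
  # Hypothetical group: hosts carrying both the project tag and the app tag.
  project_app_servers: "'steampunk-test' in do_tags and 'app' in do_tags"
compose:
  ansible_host: do_networks.v4 | selectattr('type','eq','public') | map(attribute='ip_address') | first
  ansible_user: "'centos'"
  ansible_ssh_common_args: "'-o StrictHostKeyChecking=no'"
```

With this in place, project_app_servers should be usable as a host pattern just like app, which helps when several projects in the same account share the app tag.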
Now, let’s modify the playbook we initially used to include one more application server.
...
      loop:
        - type: appserver
          index: 1
          tags:
            - steampunk-test
            - app
        - type: appserver
          index: 2
          tags:
            - steampunk-test
            - app
        - type: dbserver
          index: 1
          tags:
            - steampunk-test
            - db
...
And let’s now create the new droplet and run the same ping command against all application servers.
$ ansible-playbook playbook.yml
$ ansible -m ping -i digitalocean.yml app
steampunk-appserver-1 | SUCCESS => {
"ansible_facts": {
"discovered_interpreter_python": "/usr/libexec/platform-python"
},
"changed": false,
"ping": "pong"
}
steampunk-appserver-2 | SUCCESS => {
"ansible_facts": {
"discovered_interpreter_python": "/usr/libexec/platform-python"
},
"changed": false,
"ping": "pong"
}
The command was executed against both servers, without any inventory modification necessary!
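The same groups work as host patterns in playbooks. As a rough sketch, assuming the grouped digitalocean.yml from above and a hypothetical playbook file named site.yml, each play targets one tag-derived group:

```yaml
# site.yml (hypothetical file name)
- hosts: app
  gather_facts: false
  tasks:
    - name: Check that every application server is reachable
      ansible.builtin.ping:

- hosts: db
  gather_facts: false
  tasks:
    - name: Check that every database server is reachable
      ansible.builtin.ping:
```

It would be run with ansible-playbook -i digitalocean.yml site.yml. The ping tasks are placeholders for real deployment steps; the point is that the playbook names only groups, never individual hosts.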
Cleaning up
Cleaning up after ourselves is quite simple with Ansible. Change all occurrences of state: present in the playbook into state: absent and just run it again! In our case, we also have to remove the ssh_keys definition, since it includes a variable sourced from the previous task, and it won’t be available on deletion. That’s it!
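Applied to the playbook with three droplets, those changes might look roughly like the sketch below. It is the original playbook with both states flipped and with ssh_keys and the now-unneeded register removed; the tags are also left out on the assumption that deletion by unique name does not need them. We have not verified this exact file, so treat it as a starting point.

```yaml
- hosts: localhost
  vars:
    your_pubkey: YOUR_PUBKEY
  tasks:
    - community.digitalocean.digital_ocean_sshkey:
        name: stempunk-pubkey
        state: absent
        ssh_pub_key: "{{ your_pubkey }}"
    - community.digitalocean.digital_ocean_droplet:
        name: "steampunk-{{ item.type }}-{{ item.index }}"
        unique_name: true
        size: s-1vcpu-1gb
        region: fra1
        image: centos-8-x64
        state: absent
      loop:
        - type: appserver
          index: 1
        - type: appserver
          index: 2
        - type: dbserver
          index: 1
```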
Ansible’s dynamic inventory is stateless
Ansible’s dynamic inventory operates on state, but that state is remote: it is what actually exists. Apart from local caching, the remote state is always the ground truth. No matter how instances are created, modified or deleted, a dynamic inventory never goes out of sync.
We’ve shown a very simple example of using dynamic inventories. New possibilities arise when you have more machines, since you can codify complex scenarios. You can bind infrastructure provisioning with application deployment using framework-agnostic tags, so you could even mix and match the tools you use for both.
Ansible can be used for more than just infrastructure management! If you are interested in learning more, you can check out this post about why cloud automation is its forte.
This is the first part of a multi-part series on cloud automation with Ansible. Stay tuned for a writeup on inventory caching, which is very useful with complex workflows on many nodes.