Use Ansible to create and start LXD virtual machines
Ákos Takács
Posted on March 11, 2024
The next step in our home lab is to finally have an Ansible playbook to create and start a virtual machine. It also means that eventually we will have different kinds of dynamically created servers into which we probably want to SSH. In this chapter I will show you a way to create an LXD virtual machine on Ubuntu and also automatically configure your host to be able to SSH into the new virtual machine. In this chapter we will assume that we still don't want to manage these virtual machines from Ansible, but that will be the final goal.
Note: In the original post and video I used a different LXD remote image repository. I had to change it, because that repository was unexpectedly replaced during an active LTS LXD release. That's because the original developers of LXD started Incus and stopped supporting LXD in their image repository. There is a new repo, which currently does not include Ubuntu server images, only desktop.
If you want to run the playbook called playbook-lxd-install.yml, you will need to configure a physical or virtual disk, which I wrote about in The simplest way to install LXD using Ansible. If you don't have a usable physical disk, look for truncate -s 50G <PATH>/lxd-default.img in that post to create a virtual disk.
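As a reminder of what that truncate command does, here is a minimal sketch. The function name and the temporary directory are made up for this example, and it creates a small 10M file instead of the 50G image so it is safe to try anywhere.

```shell
#!/usr/bin/env bash
# Create a sparse image file: the file reports the requested size,
# but disk blocks are only allocated when data is actually written.
create_sparse_image() {
  local path="$1" size="$2"
  mkdir -p "$(dirname "$path")"
  truncate -s "$size" "$path"
}

# Demonstration in a temporary directory (hypothetical location).
demo_dir="$(mktemp -d)"
create_sparse_image "$demo_dir/lxd-default.img" 10M

# The first column of "ls -ls" shows the blocks really allocated on disk.
ls -ls "$demo_dir/lxd-default.img"
```

For the real disk you would pass 50G and the path you configured for LXD instead of the temporary directory.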
How you activate the virtual environment depends on how you created it. The episode The first Ansible playbook describes how to create and activate the virtual environment using the "venv" Python module, and in the episode The first Ansible role we created helper scripts as well, so if you haven't created it yet, you can create the environment by running
Although in this tutorial I used only one server, you could add multiple servers to the inventory file, but you probably don't want to create a specific VM on each server.
You could have a playbook in which you define a specific host instead of an inventory group. Not a very good way in terms of redundancy.
You could define a list under the vars: section of a specific host in the inventory file, in which all the items would be a mapping of key-value pairs with the configurations for the virtual machine you want to create on that host and handle that list in a loop in the Ansible role responsible for creating the VM. That's much better, but too complicated for this tutorial.
For now, I will create a playbook for creating a general virtual machine which allows using Docker, so we could install it on multiple hosts. Just imagine that you want to be able to use Docker on all of your Linux hosts in a VM to test network connections between them. I know, it is hard to imagine that, so we will need a list of servers which could include a single server, so you will run Docker in a VM only on one machine.
It's time to change our inventory file to use new inventory groups in addition to the special group called "all". As a reminder, see my old inventory file:
In the new inventory file, we will have an additional group:
That's it, defined at the end of the previous inventory file. Since we have all the parameters for the host in the group "all", we don't have to define any in the new group, but we still have to add the name of the host machine without values.
When we create our new playbook, in the "hosts:" section we will refer to "docker_vm_host_machines" instead of "all".
Check how Ansible interprets the inventory file
Wrapper script to run any command in the Nix environment
When I started this tutorial series, I thought we would need only the ansible-playbook command. I was wrong. Now we should try the ansible-inventory command and also run an ad-hoc Ansible command. Instead of creating two more wrapper scripts only for the new commands, I will create a script called which can run any.
I could use this script from all the other wrapper scripts to reduce redundancy, but I don't want to change many files in this chapter, so let's keep it for another day.
Make the script executable:
chmod +x
And run the following command to get the version of Ansible:
Using the "ansible" command, we can use the "debug" module to get the value of ansible_host. We will tell Ansible to run the task in the group called "docker_vm_host_machines", but it will get the parameter defined in "all".
Let's create the skeleton of the playbook and call the file "playbook-lxd-docker-vm.yml" in the project root.
# region play: Create the VM
- name: Create the VM
  hosts: docker_vm_host_machines
  pre_tasks:
  vars:
  tasks:
# endregion
I used a special comment syntax which is supported in JetBrains IDEs and also in Visual Studio Code. These are what I use most of the time. If it doesn't work for you in VSCode by default, you can try the extension called region folding for VSCode. This syntax allows us to collapse different regions of the code and assign a name to each of them, which will be shown in the collapsed state. It will be useful when we have multiple plays and want to see the name when the play is collapsed instead of something like <4 keys>. I will use this for tasks as well.
You can see the new inventory group set for the play.
Now we will need an Ansible task to create and start a virtual machine. Since LXD was originally for containers, the module is still called community.general.lxd_container. Let's see the skeleton of the task in the tasks section of the playbook containing only the static parameters.
Common parameters without variables for starting a VM
  tasks:
    # region task: Start Docker VM
    - name: Start Docker VM
      become: true
      community.general.lxd_container:
        url: unix:/var/snap/lxd/common/lxd/unix.socket
        type: virtual-machine
        state: started
        wait_for_ipv4_addresses: true
        config:
          boot.autostart: "false"
    # endregion
We need to be able to communicate with the LXD daemon, so we define the path of the LXD unix socket file in the url parameter.
We want a virtual machine, so we set the type to virtual-machine.
We want the task to return only after the virtual machine has received an IP address, so we can continue with other tasks that require that IP address.
This is a test virtual machine, so we don't want it to start when the host starts, so boot.autostart should be false in the config section.
When we run the playbook, we don't want to start the VM manually, so we set the state to started.
Of course, we want to set a source. A source is a collection of parameters describing how and from where LXD should download the base image for the virtual machine.
The value of these parameters will come from the inventory file, but I wanted to keep this task short, so we will set the variables in the "vars" section of the play:
# region play: Create the VM
- name: Create the VM
  hosts: docker_vm_host_machines
  vars:
    vm_name: "{{ config_lxd_docker_vm_name | default('docker', true) }}"
    vm_memory: "{{ config_lxd_docker_vm_memory | default('4GiB', true) }}"
    vm_cpu: "{{ config_lxd_docker_vm_cpus | default(4, true) }}"
  pre_tasks:
  tasks:
# endregion
This way we have default values even if the config parameters are not defined in the inventory file or when they are defined with empty values (that is what "true" as the second parameter is for). I defined only the name of the virtual machine in the inventory and let Ansible use the default CPU and memory limits. Now we get to another interesting part.
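If the difference between "undefined" and "defined but empty" feels abstract, the shell has a close analogy. This is only an analogy and not Ansible code: the shell variable name below is borrowed from the playbook for illustration, and Jinja's default(..., true) also replaces other falsy values, not just empty strings.

```shell
#!/usr/bin/env bash
# ${var-fallback}  : fallback only when var is unset     (similar to default('x'))
# ${var:-fallback} : fallback when var is unset OR empty (similar to default('x', true))

unset vm_memory
echo "unset, without colon: '${vm_memory-4GiB}'"
echo "unset, with colon:    '${vm_memory:-4GiB}'"

vm_memory=""
echo "empty, without colon: '${vm_memory-4GiB}'"
echo "empty, with colon:    '${vm_memory:-4GiB}'"
```

Only the third line prints an empty value, which is exactly the case the second parameter of default protects us from.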
Cloud-init config for SSH access and sudo password
We can create a virtual machine with the already described parameters, but without cloud-init configs we have no users in the virtual machine, not to mention the SSH configuration for that user. We want the following cloud-init config:
First of all, we need to install the openssh-server package and then define a list of users. There will be only one user by default and that user will be in the "sudo" group. The user's default shell will be the bash shell, and we define a password and a list of SSH public keys. The password comes from the vm_pass variable, and we will need the "file" lookup plugin to read the public key defined in the vm_ssh_pub_key variable. Let's add it to the task:
# region play: Create the VM
- name: Create the VM
  hosts: docker_vm_host_machines
  vars:
    # ...
    vm_user: "{{ config_lxd_docker_vm_user | default('manager', true) }}"
    vm_pass: "{{ config_lxd_docker_vm_pass | ansible.builtin.password_hash(salt=vm_pass_salt) }}"
    vm_ssh_pub_key: "{{ config_lxd_docker_vm_ssh_pub_key | default(vm_ssh_priv_key + '.pub', true) }}"
  pre_tasks:
  tasks:
# endregion
The default user will be "manager" and there will be no default password, but we need to use ansible.builtin.password_hash to convert the plain text password to a hash. To use this filter, you need to add the following line to the requirements.txt:
passlib==1.7.4 # for using ansible.builtin.password_hash()
Don't forget to run pip install -r requirements.txt to install the new library.
Alternatively, you could remove the filter and pass a password hash directly. The path of the SSH public key will come from the path of the private key, plus .pub as extension, but we still let the user override it. You might have noticed that we have an argument for password_hash called "salt" and the value will come from the variable vm_pass_salt. That means we will need to set two more variables.
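Why bother with a salt at all? With a fixed salt the same password always produces the same hash, so the value passed to the VM does not change between playbook runs. The sketch below illustrates this with openssl passwd, which is my substitution for the demonstration and not what the playbook uses. The password and salt values are placeholders, and I am not claiming the output is byte-identical to what password_hash generates.

```shell
#!/usr/bin/env bash
# Hash a password with SHA-512 crypt and an explicit salt.
# Caution: command line arguments are visible in the process list,
# so this is for illustration only, not for real secrets.
hash_password() {
  local password="$1" salt="$2"
  openssl passwd -6 -salt "$salt" "$password"
}

if command -v openssl >/dev/null 2>&1; then
  first="$(hash_password 'example-password' 'examplesalt')"
  second="$(hash_password 'example-password' 'examplesalt')"
  other="$(hash_password 'example-password' 'anothersalt')"

  if [ "$first" = "$second" ]; then echo "same salt: identical hash"; fi
  if [ "$first" != "$other" ]; then echo "different salt: different hash"; fi
fi
```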
# region play: Create the VM
- name: Create the VM
  hosts: docker_vm_host_machines
  vars:
    # ...
    vm_pass_salt: "{{ config_lxd_docker_vm_pass_salt }}"
    vm_ssh_priv_key: "{{ config_lxd_docker_vm_ssh_priv_key }}"
  pre_tasks:
  tasks:
# endregion
Note that the order of the variables does not matter. If you prefer defining the variables first which will be used by other variables, that's fine. If you want to start with the variables that you will eventually refer to in the tasks, that's okay too. For me, it felt easier to explain the variables in this order. You can also notice that there are no default values here, so we add pre tasks to check the required variables:
  pre_tasks:
    - name: Fail if config_lxd_docker_vm_ssh_priv_key is not defined
      when: config_lxd_docker_vm_ssh_priv_key | default('', true) == ''
      ansible.builtin.fail:
        msg: You must set an SSH private key to log in to the VM
    - name: Fail if config_lxd_docker_vm_pass_salt is not defined
      when: config_lxd_docker_vm_pass_salt | default('', true) == ''
      ansible.builtin.fail:
        msg: You must set a password salt for the sudo password to log in to the VM
Now our whole play looks like this:
# region play: Create the VM
- name: Create the VM
  hosts: docker_vm_host_machines
  vars:
    vm_name: "{{ config_lxd_docker_vm_name | default('docker', true) }}"
    vm_memory: "{{ config_lxd_docker_vm_memory | default('4GiB', true) }}"
    vm_cpu: "{{ config_lxd_docker_vm_cpus | default(4, true) }}"
    vm_user: "{{ config_lxd_docker_vm_user | default('manager', true) }}"
    vm_pass: "{{ config_lxd_docker_vm_pass | ansible.builtin.password_hash(salt=vm_pass_salt) }}"
    vm_ssh_pub_key: "{{ config_lxd_docker_vm_ssh_pub_key | default(vm_ssh_priv_key + '.pub', true) }}"
    vm_pass_salt: "{{ config_lxd_docker_vm_pass_salt }}"
    vm_ssh_priv_key: "{{ config_lxd_docker_vm_ssh_priv_key }}"
  pre_tasks:
    - name: Fail if config_lxd_docker_vm_ssh_priv_key is not defined
      when: config_lxd_docker_vm_ssh_priv_key | default('', true) == ''
      ansible.builtin.fail:
        msg: You must set an SSH private key to log in to the VM
    - name: Fail if config_lxd_docker_vm_pass_salt is not defined
      when: config_lxd_docker_vm_pass_salt | default('', true) == ''
      ansible.builtin.fail:
        msg: You must set a password salt for the sudo password to log in to the VM
  tasks:
    # region task: Start Docker VM
    - name: Start Docker VM
      become: true
      community.general.lxd_container:
        url: unix:/var/snap/lxd/common/lxd/unix.socket
        type: virtual-machine
        state: started
        wait_for_ipv4_addresses: true
        name: "{{ vm_name }}"
        source:
          type: image
          server: "22.04"
        config:
          boot.autostart: "false"
          limits.cpu: "{{ vm_cpu }}"
          limits.memory: "{{ vm_memory }}"
          cloud-init.user-data: |
            #cloud-config
            users:
              - name: {{ vm_user }}
                lock_passwd: false
                groups: sudo
                shell: /bin/bash
                passwd: "{{ vm_pass }}"
                ssh_authorized_keys:
                  - {{ lookup('file', vm_ssh_pub_key) }}
            packages:
              - openssh-server
    # endregion
# endregion
That is a completely fine play in a playbook, but we have some undefined variables in the play which we want to set or other variables we want to override. Let's add the following required variables to our inventory file.
In my case, the SSH private key is the same one I used for the host. I could use a template here too to get the value from the other variable, but I don't recommend reading values from Ansible's built-in variables, because it could lead to some confusion when you change how you access the host and forget that it was used somewhere else too. You could introduce a third variable like config_global_ssh_priv_key and read the value from it in the value of the other two variables. The password and the salt parameter come from a secret. To add the variables, if your secret is already encrypted, use the helper script created for sops.
./ secret.yml
This is the content of my encrypted secrets.yml in the project root:
The IP address will be different on your machine, and I always like to share commands that will run the same way for everyone, so let's get the IP address automatically.
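One way to collect the addresses is to filter the JSON output of lxc list with jq. The sketch below is an assumption about how that can look, not necessarily the exact command from the original post. To keep it runnable without LXD, it works on a hand-written, heavily trimmed sample instead of real output, and the interface name and addresses in it are invented.

```shell
#!/usr/bin/env bash
# Hypothetical, trimmed sample of what `lxc list docker --format json` returns.
vm_info='[{"name":"docker","state":{"network":{
  "enp5s0":{"addresses":[
    {"family":"inet","address":"10.17.181.12","netmask":"24","scope":"global"},
    {"family":"inet6","address":"fe80::216:3eff:fe00:1","netmask":"64","scope":"link"}]},
  "lo":{"addresses":[
    {"family":"inet","address":"127.0.0.1","netmask":"8","scope":"local"}]}}}}]'

# Keep only IPv4 addresses with global scope, one address per line.
list_global_ipv4() {
  jq -r '.[0].state.network[].addresses[]
         | select(.family == "inet" and .scope == "global")
         | .address'
}

if command -v jq >/dev/null 2>&1; then
  ip_addresses="$(echo "$vm_info" | list_global_ipv4)"
  echo "$ip_addresses"
fi
```

On the real host you would fill vm_info from the lxc command instead of the sample string.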
Now the $ip_addresses variable probably contains a single IP address, unless the virtual machine has multiple networks. In this tutorial we know that we added only one, but later we could change it, so why not make it work with multiple networks as well.
network="$(echo "$vm_info" | jq -r '')"
ip_range="$(lxc network show "$network" | yq '.config["ipv4.address"]')"
With the above two lines we will get the CIDR notation of the network address. This is what I have in $ip_range:
The IP address in the value is the IP of the gateway, but the gateway and the mask size (24) together describe an IP range which has to include the IP we are looking for. With this specific mask size we could check that the IP address starts with 10.17.181., but later we could define a different network with another mask size, so this is when we could use grepcidr. On Ubuntu, we can install it this way:
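If you are curious what such a range check does under the hood, or you would rather not install an extra tool, the same membership test can be sketched in plain Bash. This is an alternative to grepcidr written for illustration, not the method used in this tutorial. The function names are mine, 10.17.181.1/24 is assumed as the range based on the prefix mentioned above, and the candidate addresses are invented.

```shell
#!/usr/bin/env bash
# Convert a dotted IPv4 address into a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  # 10# forces base 10 so an octet such as "08" is not read as octal.
  echo $(( (10#$a << 24) | (10#$b << 16) | (10#$c << 8) | 10#$d ))
}

# Succeed when IPv4 address $1 belongs to the range $2 (ADDRESS/PREFIX).
# The address part may be any host in the range, for example the gateway.
ip_in_cidr() {
  local ip="$1" cidr="$2"
  local base="${cidr%/*}" prefix="${cidr#*/}"
  local mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  (( ( $(ip_to_int "$ip") & mask ) == ( $(ip_to_int "$base") & mask ) ))
}

ip_range="10.17.181.1/24"                  # assumed gateway plus mask size
for ip in 10.17.181.12 192.168.4.57; do    # hypothetical candidates
  if ip_in_cidr "$ip" "$ip_range"; then
    echo "$ip is inside $ip_range"
  fi
done
```

Both addresses are masked before they are compared, which is why it does not matter that LXD reports the gateway address instead of the network address.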
In the previous sections we learned about SSH-ing to the virtual machine from the host on which the VM is running and also about detecting the IP address of the VM automatically. It would be easier if we could use the SSH client with the SSH keys on the Ansible controller to log in to the virtual machine. One way could be changing the network settings of the virtual machine to get an IP address on the LAN network instead of on the local LXD bridge (lxdbr0). That would work if you have a local network, and you can manage the IP addresses to assign one to a virtual machine. It is not always the case, so I have chosen a different approach. We will keep our current IP address and learn about proxies and how SSH can help us again.
You could use an LXD proxy device, but there are multiple requirements.
The configuration is different for virtual machines and containers.
You would need a static IP address which is recommended anyway, so it won't be a problem in the future, but for now, we are experimenting with dynamically assigned IP addresses.
Without firewall settings, you would make the port available from all machines. In a local homelab that is probably fine, but I prefer another solution which is coming in the following sections.
My first idea was simply to use an SSH tunnel, which I do frequently. It basically means SSH-ing to a host and using that SSH connection to forward requests from a specific port on the client, through the host, to an endpoint which is accessible from the host. Note that my remote server's hostname is still ta-lxlt and the IP address of the virtual machine is I open a terminal and run the following command:
CLIENT_IP is optional, but I want to make sure the port is accessible only from localhost. If you know how port forwarding works with Docker, this is very similar, except Docker doesn't need the endpoint IP, since it is always the IP of the container. -N allows me to keep the SSH connection open without actually executing any command on the remote server. I open a new terminal and run the following command:
ssh -p 2000 manager@
If you don't want it to ask for the password, use the SSH key:
Although this is not what we will use for the SSH connection, SSH tunnels can help you with accessing other ports as well, like a web application running inside the virtual machine.
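To make the parts of a tunnel command easier to see, here is a small sketch that only assembles and prints the command instead of running it. The helper function is hypothetical, and 10.17.181.12 is a placeholder for the VM address, not a value from my environment. The second call shows how the same pattern could expose a web application listening on port 80 in the VM.

```shell
#!/usr/bin/env bash
# Print an ssh local port forwarding command without running it.
#   -L [bind_address:]port:host:hostport   forward a local port through the SSH host
#   -N                                     do not execute a remote command
tunnel_command() {
  local bind_ip="$1" local_port="$2" target_ip="$3" target_port="$4" ssh_host="$5"
  echo "ssh -N -L ${bind_ip}:${local_port}:${target_ip}:${target_port} ${ssh_host}"
}

# 10.17.181.12 is a placeholder VM address.
tunnel_command 127.0.0.1 2000 10.17.181.12 22 ta-lxlt
tunnel_command 127.0.0.1 8080 10.17.181.12 80 ta-lxlt
```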
You know that you can SSH to the remote server and from that you can SSH to the virtual machine. You also know that you can pass a command to SSH to execute on the remote server, so you can also run another SSH command.
ssh -t ta-lxlt -- ssh manager@
-t was required because the SSH command running on the remote server required a pseudo-terminal. We can also try to SSH to the virtual machine and pass the address of a jump server. A jump server is a server to which we have access, and from which we have access to another server. In our case, the virtual machine.
ssh -J ta-lxlt manager@
This jump server works only because I configured that host in my SSH client config ($HOME/.ssh/config):
Host ta-lxlt ta-lxlt.lan
Port 22
User ta
IdentityFile ~/.ssh/ta-lxlt
We can add our virtual machine too. We will need a name that we can pass to the SSH client. It could be the internal fully qualified hostname of the VM with the remote server's hostname as suffix. Something like this:
Host docker.lxd.ta-lxlt
The internal fully qualified hostname is the name of the VM ending with ".lxd". You can confirm it by running the following command on the remote server:
lxc exec docker -- hostname -A
Host docker.lxd.ta-lxlt
User manager
IdentityFile ~/.ssh/ansible
ProxyJump ta-lxlt
Now that we have the ProxyJump parameter defined in the SSH client config, the following command on the Ansible controller is enough to log in to the virtual machine:
ssh docker.lxd.ta-lxlt
That way you don't have to remember the IP address every time. We will automate the SSH client configuration of the virtual machine, so even if the IP address changes, you can still log in using the same hostname.
Using separate config files for different kinds of servers
Using $HOME/.ssh/config for all your SSH client configs works, but I had so many servers (physical, virtual, behind VPN connections) that I found it hard to maintain the config file. Fortunately, with recent SSH versions, you can include other config files in the main SSH config. For example, this is what my main config looks like:
Include config.d/homelab
Include config.d/external-services
Include config.d/home
Include config.d/multipass
Include config.d/docker.lxd.ta-lxlt
The configuration of ta-lxlt is in config.d/home, I also have an old swarm cluster client config in config.d/homelab, but I found it better to include the dynamically generated client config for the virtual machine directly in the main config.
Since config.d is not created by default, we need to make sure it exists. This is our next task in the playbook:
    # region task: Create base dir for the new SSH client config file
    - name: Create base dir for the new SSH client config file
      delegate_to: localhost
      ansible.builtin.file:
        state: directory
        path: "{{ lookup('env', 'HOME') }}/.ssh/config.d/"
    # endregion
delegate_to is used to run the task on a specific host regardless of where the rest of the tasks were running, and we can use the env lookup plugin to get the home of our local user. You know everything else from the previous episodes.
    # region task: Get VM info
    - name: Get VM info
      become: true
      changed_when: false
      ansible.builtin.command: lxc list "^{{ vm_name }}$" --format json
      register: _vm_info_command
    # endregion
We save the info in a registered variable. Here comes the fun part. We already learned that the order of the variables doesn't matter. In order to avoid using blocks and indenting our playbook deeper, we can add our helper variables to the vars section of the playbook. I will keep using the underscore character as a prefix.
To get the IP addresses in Ansible, we can use a very similar approach to what we used in the terminal. The difference is that we used jq in the terminal (we installed it in Using facts and the GitHub API in Ansible), and we will use Jinja filters in Ansible. Let's add the following variable to the vars section of the playbook.
Now that we know the name of the network, we will need to find out what subnet the network is using. Let's add the following task to the playbook:
    # region task: Get network info
    - name: Get network info
      become: true
      changed_when: false
      ansible.builtin.command: lxc network show "{{ _network }}"
      register: _network_info_command
    # endregion
Remember, lxc network show returns YAML output, not JSON, so we need to add the following variable to the vars section of the playbook.
Now that we have everything about the LXD network, we can get the CIDR notation of the subnet and the actual IP address of the VM. We will use a new filter in Ansible, called ansible.utils.reduce_on_network. It is basically the alternative to grepcidr which we used in the terminal. Let's add the following variables to the vars section of the playbook:
Because I'm super careful, I also filter to the first item in the list, so even if for some reason there are multiple lines, I will deal with only one in the next steps. This filter requires the following line in the requirements.txt.
netaddr==0.9.0 # for using ansible.utils.reduce_on_network()
Don't forget to run pip install -r requirements.txt to install the new library.
We had to delegate it to localhost again. We used the good old copy module to save the content defined in the task to the destination file. There is one variable missing, and that is the name of the file, which will also be the host in the client config. Let's add the following variable to the vars section of the playbook:
Note that the default suffix is the inventory hostname of the remote server. This is the name which you use in the inventory file under the hosts section. For me, it is the same as the hostname of the remote server, but it could be different. If you want to override the generated name, you can set config_lxd_docker_vm_inventory_hostname in the inventory file.
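To see what ends up in the generated file, the content written by the copy task can be reproduced by hand. The sketch below uses a hypothetical shell function in place of the Ansible template, and the IP address passed to it is a placeholder.

```shell
#!/usr/bin/env bash
# Print an SSH client config entry that reaches a VM through a jump host.
render_ssh_client_config() {
  local host_alias="$1" ip="$2" user="$3" identity_file="$4" jump_host="$5"
  cat <<EOF
Host ${host_alias}
  Hostname ${ip}
  User ${user}
  IdentityFile ${identity_file}
  ProxyJump ${jump_host}
EOF
}

# Placeholder values; the playbook fills these in from its variables.
render_ssh_client_config docker.lxd.ta-lxlt 10.17.181.12 manager '~/.ssh/ansible' ta-lxlt
```

The Host line carries the generated name and Hostname carries the detected IP address, which is what lets you keep using the same name after the address changes and the playbook is run again.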
As a final step, we have to include the generated config file in the main config:
    # region task: Include the new SSH client config in the main config
    - name: Include the new SSH client config in the main config
      delegate_to: localhost
      ansible.builtin.lineinfile:
        state: present
        create: true
        path: "{{ lookup('env', 'HOME') }}/.ssh/config"
        line: "Include config.d/{{ vm_inventory_hostname }}"
    # endregion
We use the lineinfile module to add a new line to the main config. Since it is also possible that you don't have that main config either (although at this point it is unlikely), create: true makes sure the file is created if necessary. We also set the filepath in path and the line in the line parameter.
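The important property here is that running the task twice must not add the line twice. A rough shell equivalent of that behaviour, written against a temporary file so it cannot touch your real SSH config, could look like this. The function name is hypothetical, and the real module handles more cases than this sketch.

```shell
#!/usr/bin/env bash
# Append a line to a file only if the exact line is not already present.
# The file and its parent directory are created when missing.
ensure_line() {
  local file="$1" line="$2"
  mkdir -p "$(dirname "$file")"
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demonstration on a temporary file instead of ~/.ssh/config.
demo_config="$(mktemp -d)/config"
ensure_line "$demo_config" "Include config.d/docker.lxd.ta-lxlt"
ensure_line "$demo_config" "Include config.d/docker.lxd.ta-lxlt"
cat "$demo_config"
```

grep -x matches whole lines only and -F disables regular expressions, so the dots in the file name are compared literally.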
Run the playbook to generate the SSH client config
Now you can finally run the playbook. In case you couldn't follow the steps, this is the full playbook:
# region play: Create the VM
- name: Create the VM
  hosts: docker_vm_host_machines
  vars:
    vm_name: "{{ config_lxd_docker_vm_name | default('docker', true) }}"
    vm_memory: "{{ config_lxd_docker_vm_memory | default('4GiB', true) }}"
    vm_cpu: "{{ config_lxd_docker_vm_cpus | default(4, true) }}"
    vm_user: "{{ config_lxd_docker_vm_user | default('manager', true) }}"
    vm_pass: "{{ config_lxd_docker_vm_pass | ansible.builtin.password_hash(salt=vm_pass_salt) }}"
    vm_ssh_pub_key: "{{ config_lxd_docker_vm_ssh_pub_key | default(vm_ssh_priv_key + '.pub', true) }}"
    vm_pass_salt: "{{ config_lxd_docker_vm_pass_salt }}"
    vm_ssh_priv_key: "{{ config_lxd_docker_vm_ssh_priv_key }}"
    vm_inventory_hostname: "{{ config_lxd_docker_vm_inventory_hostname | default(vm_name + '.lxd.' + inventory_hostname, true) }}"
    _vm_info: "{{ _vm_info_command.stdout | from_json | first }}"
    _ip_addresses: "{{ | dict2items | map(attribute='value.addresses') | flatten | selectattr('family', 'equalto', 'inet') | selectattr('scope', 'equalto', 'global') | map(attribute='address') }}"
    _network: "{{}}"
    _network_info: "{{ _network_info_command.stdout | from_yaml }}"
    _ip_range: "{{ _network_info.config['ipv4.address'] }}"
    _ip: "{{ _ip_addresses | ansible.utils.reduce_on_network(_ip_range) | first }}"
  pre_tasks:
    - name: Fail if config_lxd_docker_vm_ssh_priv_key is not defined
      when: config_lxd_docker_vm_ssh_priv_key | default('', true) == ''
      ansible.builtin.fail:
        msg: You must set an SSH private key to log in to the VM
    - name: Fail if config_lxd_docker_vm_pass_salt is not defined
      when: config_lxd_docker_vm_pass_salt | default('', true) == ''
      ansible.builtin.fail:
        msg: You must set a password salt for the sudo password to log in to the VM
  tasks:
    # region task: Start Docker VM
    - name: Start Docker VM
      become: true
      community.general.lxd_container:
        url: unix:/var/snap/lxd/common/lxd/unix.socket
        type: virtual-machine
        state: started
        wait_for_ipv4_addresses: true
        name: "{{ vm_name }}"
        source:
          type: image
          server: "22.04"
        config:
          boot.autostart: "false"
          limits.cpu: "{{ vm_cpu }}"
          limits.memory: "{{ vm_memory }}"
          cloud-init.user-data: |
            #cloud-config
            users:
              - name: {{ vm_user }}
                lock_passwd: false
                groups: sudo
                shell: /bin/bash
                passwd: "{{ vm_pass }}"
                ssh_authorized_keys:
                  - {{ lookup('file', vm_ssh_pub_key) }}
            packages:
              - openssh-server
    # endregion
    # region task: Create base dir for the new SSH client config file
    - name: Create base dir for the new SSH client config file
      delegate_to: localhost
      ansible.builtin.file:
        state: directory
        path: "{{ lookup('env', 'HOME') }}/.ssh/config.d/"
    # endregion
    # region task: Get VM info
    - name: Get VM info
      become: true
      changed_when: false
      ansible.builtin.command: lxc list "^{{ vm_name }}$" --format json
      register: _vm_info_command
    # endregion
    # region task: Get network info
    - name: Get network info
      become: true
      changed_when: false
      ansible.builtin.command: lxc network show "{{ _network }}"
      register: _network_info_command
    # endregion
    # region task: Add SSH client config
    - name: Add SSH client config
      delegate_to: localhost
      ansible.builtin.copy:
        dest: "{{ lookup('env', 'HOME') }}/.ssh/config.d/{{ vm_inventory_hostname }}"
        content: |
          Host {{ vm_inventory_hostname }}
            Hostname {{ _ip }}
            User {{ vm_user }}
            IdentityFile {{ vm_ssh_priv_key }}
            ProxyJump {{ inventory_hostname }}
    # endregion
    # region task: Include the new SSH client config in the main config
    - name: Include the new SSH client config in the main config
      delegate_to: localhost
      ansible.builtin.lineinfile:
        state: present
        create: true
        path: "{{ lookup('env', 'HOME') }}/.ssh/config"
        line: "Include config.d/{{ vm_inventory_hostname }}"
    # endregion
# endregion
And the command to run it:
./ playbook-lxd-docker-vm.yml
After that, you can log in to the VM with the following command from the Ansible controller:
And open from a web browser or run the following curl command on the Ansible controller:
My output:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Directory listing for /</title>
</head>
<body>
<h1>Directory listing for /</h1>
<hr>
<ul>
<li><a href=".bash_history">.bash_history</a></li>
<li><a href=".bash_logout">.bash_logout</a></li>
<li><a href=".bashrc">.bashrc</a></li>
<li><a href=".cache/">.cache/</a></li>
<li><a href=".profile">.profile</a></li>
<li><a href=".ssh/">.ssh/</a></li>
</ul>
<hr>
</body>
</html>
Just think about what happened here:
You SSH-d to a remote server
Through that remote server you SSH-d to the virtual machine directly from the Ansible controller
You opened a tunnel for a web application from the Ansible controller to the localhost of the virtual machine.
Yes, you are right, configuring a LAN IP would have been easier, but that is not an option for everyone, and if you ask me, it is less fun as well.
I think it's important to mention again that our current solution is not ideal. We use a dynamically assigned IP address for the virtual machine, so even though the IP address will not change every time you restart the virtual machine, it is not dedicated to this VM. That means that when you can't access the virtual machine from the Ansible controller, you need to run the playbook again, even if the virtual machine is already created. For servers, we usually use static IP addresses, but choosing the right IP address could be another challenge, so for now, we used a dynamically assigned IP. That's how we learn step by step.
The final source code of this episode can be found on GitHub:
Source code to create a home lab. Part of a video tutorial
This project was created to help you build your own home lab where you can test
your applications and configurations without breaking your workstation, so you can
learn on cheap devices without paying for more expensive cloud services.
The project contains code written for the tutorial, but you can also use parts of it
if you refer to this repository.
Note: The inventory.yml file is not shared since that depends on the actual environment
so it will be different for everyone. If you want to learn more about the inventory file
watch the videos on YouTube or read the written version on Links in
the video descriptions on YouTube.
You can also find an example inventory file in the project root. You can copy that and change
the content, so you will use your IP…