Experiment with Vagrant and Ansible - Docker Swarm for Small Self-hosted Projects

xnok

Alexandre Couedelo

Posted on February 24, 2022


Experimenting with Docker Swarm on a single node is a bit sad 😞. Luckily, in my previous tutorial, you learned how to create *A Disposable Local Test Environment using Vagrant and Ansible.* If you followed along, you know a little bit more about Vagrant and Ansible *but nothing worth showing off* 🤯*,* so let's up our game and create a multi-VM Docker Swarm cluster.

This involves using Vagrant to create multiple VMs, then using Ansible to install Docker on each machine, before creating a Docker Swarm cluster with all our nodes. Once this is in place, you have a solid foundation to experiment with Docker.

I want to remind you that the goal of this tutorial series is to document what I consider the bare minimum for a small self-hosted side project. I invite you to visit my repository for more information: https://github.com/xNok/infra-bootstrap-tools. At this point, we are doing the groundwork of setting up a server to host the application we will deploy later as docker containers.

Provisioning Multiple VMs with Vagrant

Like in the previous tutorial, we use Vagrant to create virtual machines. The difference is that this time we are provisioning three VMs, so the file becomes noticeably bigger. I will explain its content in the next section.

Here is the new, updated Vagrantfile. Before running vagrant up there are two more things you need to set up: ansible.cfg and inventory. Both are shown below this big Vagrantfile.

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  # Every Vagrant development environment requires a box. You can search for
  # boxes at https://vagrantcloud.com/search.
  config.vm.box = "generic/ubuntu2004"

  # We are moving to a more complex example so to avoid issues we will limit the RAM of each VM
  config.vm.provider "virtualbox" do |v|
    v.memory       = 1024
    v.cpus         = 1
    v.linked_clone = true
  end

  #########
  # Nodes: host our apps 
  #########

  config.vm.define "node1" do |node|
    node.vm.network "private_network", ip: "172.17.177.21"
  end

  config.vm.define "node2" do |node|
    node.vm.network "private_network", ip: "172.17.177.22"
  end

  #########
  # Controller: host our tools
  #########
  config.vm.define 'controller' do |machine|

    # The Ansible Local provisioner requires that all the Ansible Playbook files are available on the guest machine
    machine.vm.synced_folder ".", "/vagrant",
       owner: "vagrant", group: "vagrant", mount_options: ["dmode=755,fmode=600"]

    # /!\ This is only useful because the tutorial files are under .articles/xyz
    # otherwise Ansible would get the roles from the root folder
    machine.vm.synced_folder "../../roles", "/vagrant/roles",
      owner: "vagrant", group: "vagrant", mount_options: ["dmode=755,fmode=600"]

    machine.vm.network "private_network", ip: "172.17.177.11"

    machine.vm.provision "ansible_local" do |ansible|
      # ansible setup
      ansible.install         = true
      ansible.install_mode    = "pip_args_only"
      # ansible.version = "2.10.7"
      ansible.pip_install_cmd = "sudo apt-get install -y python3-pip python-is-python3 haveged && sudo ln -s -f /usr/bin/pip3 /usr/bin/pip"
      ansible.pip_args        = "ansible==2.10.7"
      # provisioning
      ansible.playbook        = "playbook.yml"
      ansible.verbose         = true
      ansible.limit           = "all" # or only "nodes" group, etc.
      ansible.inventory_path  = "inventory"
    end
  end
end

Now the last two things you need 😅. First, create an ansible.cfg file; here we are fine-tuning the Ansible configuration to work with our setup. You won’t have an interactive shell, so we won’t be able to accept SSH fingerprints. This configuration will also be essential to have Ansible working in your CI/CD, since we face the same constraint there.

[defaults]
host_key_checking = no

[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes

Last, we need to manually define the inventory file. Since we selected the IPs in the private network, this is a simple task. Note that we also take advantage of our synced_folder to obtain the SSH keys required for Ansible to connect to node1 and node2.

node1      ansible_host=172.17.177.21 ansible_ssh_private_key_file=/vagrant/.vagrant/machines/node1/virtualbox/private_key
node2      ansible_host=172.17.177.22 ansible_ssh_private_key_file=/vagrant/.vagrant/machines/node2/virtualbox/private_key
controller ansible_host=172.17.177.11 ansible_connection=local

[nodes]
node[1:2]

[managers]
controller

Now you can provision the infra with Vagrant:

vagrant up
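If you want to double-check that the inventory and SSH keys line up, you can SSH into the controller and run an ad-hoc Ansible ping against every host. Just a sanity check, assuming the provisioning succeeded and Ansible is installed on the controller:

vagrant ssh controller
cd /vagrant
ansible -i inventory all -m ping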

Focus on the Vagrantfile

First, we select the Vagrant box we use as a base. This time I use Ubuntu instead (generic/ubuntu2004); I found it easier for installing the latest version of Ansible on the controller. Notice that I added VirtualBox-specific configurations. Since you are running multiple VMs, it is important to control the size of each VM so as not to starve your PC's resources. I also used the linked_clone option to speed up the process: that way, VirtualBox creates a base VM (that stays turned off) and clones it to create the three VMs.

  # Every Vagrant development environment requires a box. You can search for
  # boxes at https://vagrantcloud.com/search.
  config.vm.box = "generic/ubuntu2004"

  # We are moving to a more complex example so to avoid issues we will limit the RAM of each VM
  config.vm.provider "virtualbox" do |v|
    v.memory       = 1024
    v.cpus         = 1
    v.linked_clone = true
  end

Next, we have the two worker node definitions. This step is straightforward. What is new here is that we set fixed IPs for our VMs, which makes it easier to create a static Ansible inventory.

  #########
  # Nodes: host our apps 
  #########

  config.vm.define "node1" do |node|
    node.vm.network "private_network", ip: "172.17.177.21"
  end

  config.vm.define "node2" do |node|
    node.vm.network "private_network", ip: "172.17.177.22"
  end
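Once the VMs are up, you can quickly confirm that the fixed IPs were assigned; the private network address should show up on the second network interface. A quick check:

vagrant ssh node1 -c "ip -4 addr | grep 172.17.177"
vagrant ssh node2 -c "ip -4 addr | grep 172.17.177"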

Before starting with the controller, I want you to look at the Vagrant documentation and notice that there are two Ansible provisioners: ansible and ansible_local. I used the second one so I don't have to bother installing Ansible on my own machine, and I find this approach closer to the CI/CD setup you will use later in the series. As a result, to create two nodes we provision three machines, one of which is the controller, which has the responsibility of running Ansible and provisioning the other machines.
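For comparison, here is a minimal sketch of the host-based ansible provisioner; it assumes Ansible is installed on your own machine, and we do not use it in this tutorial:

  # Alternative: run Ansible from the host machine instead of a controller VM
  config.vm.provision "ansible" do |ansible|
    ansible.playbook       = "playbook.yml"
    ansible.limit          = "all"
    ansible.inventory_path = "inventory"
  end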

First, we create two synced_folders to give the VM access to our playbook and roles. That way, we can update any Ansible code and use it immediately in the VM. Note that to avoid permission issues, I forced the owner and group, as well as restricting file read/write access to the user only. The reason is that Ansible uses SSH keys stored in this folder (see the inventory file), and those keys must have restrictive permissions.

  #########
  # Controller: host our tools
  #########
  config.vm.define 'controller' do |machine|

    # The Ansible Local provisioner requires that all the Ansible Playbook files are available on the guest machine
    machine.vm.synced_folder ".", "/vagrant",
       owner: "vagrant", group: "vagrant", mount_options: ["dmode=755,fmode=600"]

    # /!\ This is only useful because the tutorial files are under .articles/xyz
    # otherwise Ansible would get the roles from the root folder
    machine.vm.synced_folder "../../roles", "/vagrant/roles",
      owner: "vagrant", group: "vagrant", mount_options: ["dmode=755,fmode=600"]

    machine.vm.network "private_network", ip: "172.17.177.11"

    machine.vm.provision "ansible_local" do |ansible|
      # ansible setup
      ansible.install         = true
      ansible.install_mode    = "pip_args_only"
      # ansible.version = "2.10.7"
      ansible.pip_install_cmd = "sudo apt-get install -y python3-pip python-is-python3 haveged && sudo ln -s -f /usr/bin/pip3 /usr/bin/pip"
      ansible.pip_args        = "ansible==2.10.7"
      # provisioning
      ansible.playbook        = "playbook.yml"
      ansible.verbose         = true
      ansible.limit           = "all" # or only "nodes" group, etc.
      ansible.inventory_path  = "inventory"
    end
  end

The more complicated part is the provision section. I want to use the latest 2.x version of Ansible in order to use the latest version of the docker_swarm and docker_swarm_info modules. The issue is that Ansible made a lot of structural changes between 2.7 and 2.10, so a little bit of hacking is required to install the desired version. I found this method on GitHub and it works like a charm.
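To confirm the pinned version actually landed on the controller, a quick check:

vagrant ssh controller -c "ansible --version"
# should report ansible 2.10.7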

Setting up Docker with Ansible

Our playbook is about to become a little bit more complicated. On top of that, installing Docker is something you may want to reuse in several projects, which is exactly what roles are for. I will assume you are somewhat familiar with Ansible and took the time to play a little bit with the hello-world playbook from the first tutorial.

There are multiple ways to create roles with Ansible, but I want to keep it as simple as possible. You should know that the recommended way to create roles is to use ansible-galaxy init (see the documentation). The downside of that approach is that it creates folders and files you may not use. Let’s keep things simple and create the minimal structure.

Ansible looks for a folder called roles, then a subfolder with the name of the role (here docker). The first thing Ansible does is read main.yml from the meta folder of that role to collect metadata about it.

mkdir -p roles/docker/meta
touch roles/docker/meta/main.yml

The meta/main.yml only requires you to specify dependencies for this role, meaning other roles that you would expect to execute before this one.

dependencies: []
  # List your role dependencies here, one per line. Be sure to remove the '[]' above,
  # if you add dependencies to this list.
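For illustration only: if a role depended on another one being applied first, say on the docker role we are building here, its meta/main.yml would look like this instead:

dependencies:
  - role: docker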

Finally, we need to define some tasks to complete the Docker installation. It is a good practice exercise to look at the official Docker installation documentation and turn it into an Ansible role: https://docs.docker.com/install/linux/docker-ce/debian/. Create the file roles/docker/tasks/main.yml:

mkdir -p roles/docker/tasks
touch roles/docker/tasks/main.yml

The content of main.yml should look something like this:

#################################################
# OR INFRA Role: Docker
# Source: https://docs.docker.com/install/linux/docker-ce/debian/
#################################################
---
###
# GENERAL Setup
###
- name: Install required system packages
  apt: name={{ item }} state=latest update_cache=yes
  loop: [ 'apt-transport-https', 'ca-certificates', 'software-properties-common']

- name: Add Docker GPG apt Key
  apt_key:
    url: https://download.docker.com/linux/{{ ansible_distribution | lower }}/gpg
    state: present

- name: Add Docker Repository
  apt_repository:
    repo: deb [arch=amd64] https://download.docker.com/linux/{{ ansible_distribution | lower }} {{ ansible_distribution_release }} stable
    state: present

- name: Update apt and install docker-ce
  apt: name={{ item }} state=latest update_cache=yes
  loop: ['docker-ce', 'docker-ce-cli', 'docker-compose', 'containerd.io']

- name: Ensure docker users are added to the docker group.
  user:
    name: "{{ item }}"
    groups: docker
    append: true
  with_items: [vagrant, ubuntu]

- name: Start docker
  service:
    name: docker
    state: started
    enabled: yes

########
# Testing Setup
# Pull, start, stop a hello-world container
########
- name: Pull default Docker image for testing
  docker_image:
    name: "hello-world"
    source: pull

- name: Create default containers
  docker_container:
    name: "hello-world"
    image: "hello-world"
    state: present

- name: Stop a container
  docker_container:
    name: "hello-world"
    state: stopped

Update your playbook.yml file to specify that we want to use this role against all our VMs.

- name: This is a hello-world example
  hosts: all
  become: yes # needed: the docker role uses apt, which requires root

  roles:
  - docker

  tasks:
    - name: Create a file called '/tmp/testfile.txt' with the content 'hello world'.
      copy:
        content: hello-world
        dest: /tmp/testfile.txt

Now it is time to run Vagrant

vagrant up

Once the provisioning is complete, you should have three VMs with Docker set up.
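You can verify the installation on any node, for instance:

vagrant ssh node1 -c "docker --version"
vagrant ssh node1 -c "docker ps -a"
# you should see the stopped hello-world container from the test tasks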

Setting up Docker Swarm with Ansible

To complete our setup, we need to create three more roles:

  • docker-swarm-controller installs the required Python packages on the host running Ansible to control and manage the swarm. This notably includes the Python docker package.
  • docker-swarm-manager initializes the swarm and joins all the targeted nodes as managers.
  • docker-swarm-node joins all the targeted nodes as worker nodes.

Here is the final Ansible playbook:

- name: This is the base requirement for all nodes
  hosts: all

  roles:
  - {role: docker, become: yes}

- name: This sets up the Docker Swarm managers
  hosts: managers

  roles:
  - {role: docker-swarm-controller, become: yes} # this role is for the host running Ansible to manage the swarm
  - {role: docker-swarm-manager, become: yes}    # this role is for creating the swarm and adding hosts as managers

- name: This sets up the nodes and joins them to the swarm
  hosts: nodes

  roles:
  - docker-swarm-node # this role is for the host to join the swarm
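One detail to note: the swarm roles below rely on a variable, swarm_managers_inventory_group_name, which must name the inventory group that contains the managers. If it is not defined elsewhere (for instance in role defaults), a minimal way to provide it is a group_vars file next to the playbook; a sketch, assuming group_vars/all.yml:

# group_vars/all.yml
swarm_managers_inventory_group_name: managers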

docker-swarm-controller

This role is straightforward; I don’t think I need to comment on it.

#################################################
# OR INFRA Role: Docker Swarm Controller
# Machines running ansible need some special python package
# Source: 
#    https://github.com/arillso/ansible.traefik
#    https://geek-cookbook.funkypenguin.co.nz/ha-docker-swarm/traefik/
#################################################
---
###
# GENERAL Setup
###
- name: Install required system packages
  apt: name={{ item }} state=latest update_cache=yes
  loop: ['python3-pip', 'virtualenv', 'python3-setuptools']

- name: Install required Python packages
  pip:
      executable: pip3
      name: [jsondiff, passlib, docker]

docker-swarm-manager

You need to be careful here: you can only init a Docker Swarm once. As a convention, the first node of the managers group is used as the founder of the swarm. Notice that this role uses a variable, swarm_managers_inventory_group_name. I like my variables to be verbose 😂. We need to read facts about our nodes, and this variable tells us which inventory group is used for managers.

You may be wondering what hostvars[groups[swarm_managers_inventory_group_name][0]].result.swarm_facts.JoinTokens.Manager does. When Ansible executed Init a new swarm with default parameters, we asked it to register some information with register: result; this expression is simply the path to the join token that the other nodes need in order to join the swarm as managers. Get join-token for manager nodes then effectively persists the join token on each of the managers as a fact. More about Ansible facts and variables here.
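To make that long path less cryptic, here is an abridged sketch of what result contains on the first manager after the init task (token values are illustrative placeholders):

result:
  swarm_facts:
    JoinTokens:
      Manager: SWMTKN-1-xxxx # token to join as a manager
      Worker: SWMTKN-1-yyyy  # token to join as a worker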

#################################################
# OR INFRA Role: Docker Swarm Manager
# Source: 
#    https://github.com/arillso/ansible.traefik
#    https://geek-cookbook.funkypenguin.co.nz/ha-docker-swarm/traefik/
#################################################
---
###
# GENERAL Setup
###

###
# SWARM Setup
###
- name: Init a new swarm with default parameters
  docker_swarm:
    state: present
    advertise_addr: "{{ ansible_host }}"
  register: result
  when: inventory_hostname == groups[swarm_managers_inventory_group_name][0] # only on the first manager

###
# Manager Setup
###
- name: Get join-token for manager nodes
  set_fact:
    join_token_manager: "{{ hostvars[groups[swarm_managers_inventory_group_name][0]].result.swarm_facts.JoinTokens.Manager }}"

- name: Join other managers
  docker_swarm:
    state: join
    join_token: "{{ join_token_manager }}"
    advertise_addr: "{{ ansible_host }}"
    remote_addrs: "{{ groups[swarm_managers_inventory_group_name] | map('extract', hostvars, ['ansible_host']) | join(',') }}"
  when: inventory_hostname != groups[swarm_managers_inventory_group_name][0] # exclude the first manager

docker-swarm-node

This role is very similar to the previous one, except that this time we get the worker join token and join our nodes as workers.

#################################################
# OR INFRA Role: Docker Swarm Node
# Source: 
#    https://github.com/arillso/ansible.traefik
#    https://geek-cookbook.funkypenguin.co.nz/ha-docker-swarm/traefik/
#################################################
---
###
# GENERAL Setup
###
- name: Get join-token for worker nodes
  set_fact:
    join_token_worker: "{{ hostvars[groups[swarm_managers_inventory_group_name][0]].result.swarm_facts.JoinTokens.Worker }}"

###
# Add Nodes
###
- name: Add nodes
  docker_swarm:
    state: join
    advertise_addr: "{{ ansible_host }}"
    join_token: "{{ join_token_worker }}"
    remote_addrs: "{{ groups[swarm_managers_inventory_group_name] | map('extract', hostvars, ['ansible_host']) | join(',') }}"

Testing that the Docker Swarm is working

Let’s see if everything looks ok in our cluster. SSH to the controller node:

vagrant ssh controller

Use the command docker node ls to list the nodes in your cluster:

vagrant@ubuntu2004:~$ docker node ls
ID                            HOSTNAME                 STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
odumha179h5qbtln5jfoql9xc *   ubuntu2004.localdomain   Ready     Active         Leader           20.10.12
opeigd4zdccyzam3yjaakdfzk     ubuntu2004.localdomain   Ready     Active                          20.10.12
yjy282nbmzcr5gx90rvvacla2     ubuntu2004.localdomain   Ready     Active                          20.10.12
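As an optional smoke test, you can schedule a throwaway service from the controller and watch the swarm spread replicas across the nodes (a quick sketch; any small image works):

docker service create --name ping-test --replicas 2 alpine ping 8.8.8.8
docker service ps ping-test # shows which node each replica landed on
docker service rm ping-test # clean up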

Conclusion

Quickly setting up VMs and creating Ansible roles is the fastest way for me to test a simple setup at no cost. This is why Vagrant and Ansible make such a great team to create a Disposable Local Test Environment.

As of now, your Docker Swarm is totally empty. In future tutorials, we will create a simple stack you can reuse for almost all your projects. You can check my GitHub repository https://github.com/xNok/infra-bootstrap-tools to find more tutorials and build the following infrastructure.

Infrastructure for small self-hosted project

Resolving common problems

Sometimes, when provisioning multiple machines, issues occur. You should not restart everything from scratch; instead, use the power of Ansible and Vagrant to resume from where the problem occurred.

When the provisioning fails (Ansible error), you can restart the provisioning with:

vagrant provision controller

It has happened to me that an error occurred on a node (SSH errors or an unreachable node); in that case, reload only the node causing problems.

vagrant reload node1
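If you are not sure which machine is in a bad state, vagrant status gives a quick overview before you decide what to reload; the output looks roughly like this:

vagrant status
# controller   running (virtualbox)
# node1        running (virtualbox)
# node2        running (virtualbox)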

References

https://github.com/geerlingguy/ansible-role-docker

https://github.com/ruanbekker/ansible-docker-swarm

https://github.com/atosatto/ansible-dockerswarm

Docker_swarm module - join_token parameter for ansible not working
