How to implement a Mesh Network on AWS

Some time ago, I was working on a complex greenfield project.

We had to design an infrastructure that was secure in many respects, make sure that all data was encrypted at rest and in transit, and deploy a large number of services in AWS. While the dev teams were building the applications, I focused on those requirements.

The main requirement was to design and implement a flat Mesh Network on AWS with encrypted traffic. Every deployed server should have a point-to-point connection to every other peer in the network. On top of that, some servers hosted on Azure and GCP should be able to join the Mesh. And to add more complexity, external clients such as laptops or mobile phones should be able to securely access specific servers and services in the Mesh.

After gathering all the information and speaking with a number of people to make sure that all requirements were properly documented and added to the backlog, it was time to start working on the PoC (Proof of Concept).

One of the tools that seemed like the right candidate was WireGuard.

What is WireGuard

WireGuard is a secure network tunnel, operating at layer 3, implemented as a kernel virtual network interface for Linux. It utilizes state-of-the-art cryptography and it aims to be faster, simpler, leaner, and more useful than IPsec. WireGuard is currently under heavy development, but already it might be regarded as the most secure, easiest to use, and simplest VPN solution in the industry. That’s why a lot of VPN service providers have started using it.
Source: wireguard.com

After reading the documentation and running some tests, I decided to proceed with it. Setting up WireGuard as a VPN is straightforward: you install it, generate the required keys, create a wg0.conf file for each server and configure the relevant Security Groups on AWS, and your VPN is up and running quickly.

But in our case, we had to build a Mesh consisting of hundreds of servers, most of them part of Auto Scaling groups. As a result, configuring wg0.conf manually every time a server had to join the Mesh was not an option.
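To put numbers on that: in a full mesh every server peers with every other one, so each wg0.conf carries n - 1 peer entries and the whole network has n(n - 1)/2 tunnels. Every time one server joins or leaves, the configuration of all the others has to change. The sketch below is plain shell arithmetic; the function names are my own and not part of any tool.

```shell
# Peer entries each node needs in its wg0.conf for a full mesh of n nodes.
peers_per_node() {
  echo $(( $1 - 1 ))
}

# Total point-to-point tunnels in a full mesh of n nodes: n * (n - 1) / 2.
mesh_tunnels() {
  echo $(( $1 * ($1 - 1) / 2 ))
}

peers_per_node 100   # prints 99
mesh_tunnels 100     # prints 4950
```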

How WireGuard works

WireGuard encrypts the connection using pairs of cryptographic keys. Each server needs its own private and public key, and has to exchange its public key with the rest of the peers.
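On a single server, the key pair can be created with the wg tool that ships with wireguard-tools. This is a minimal sketch, assuming wg is installed; the function name and the /etc/wireguard location in the example are my own choices for illustration.

```shell
# Generate a WireGuard key pair in the given directory.
# The private key never leaves the server; only the public key is shared.
generate_keypair() {
  dir="$1"
  mkdir -p "$dir"
  (
    umask 077   # keep the key files readable by the owner only
    wg genkey | tee "$dir/privatekey" | wg pubkey > "$dir/publickey"
  )
}

# Example (run as root on the server):
#   generate_keypair /etc/wireguard
#   cat /etc/wireguard/publickey    # this is what the other peers need
```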

The wg0.conf file contains all the necessary configuration parameters for the WireGuard interface.

Here are some of the main parameters that can be configured in wg0.conf:

PrivateKey: This parameter defines the private key for the WireGuard interface. It is used to authenticate and encrypt traffic between peers.
ListenPort: This parameter defines the UDP port that WireGuard listens on for incoming connections (51820 is the conventional choice; if the parameter is omitted, a random port is used).
Address: This parameter defines the IP address and subnet mask for the WireGuard interface.
Peer: This section defines the configuration for a peer on the WireGuard network. It includes the public key of the peer, its allowed IPs (the IP ranges that are routed to that peer and accepted from it) and other options such as the endpoint (the IP address and port where the peer can be reached).
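Put together, a wg0.conf for one node with a single peer could look like the output of the sketch below. It follows the wg-quick file layout. The keys, the 203.0.113.10 endpoint, the 10.141.0.x addresses and the /24 mask are all placeholders I picked for illustration, and the function name is hypothetical. In a real Mesh there would be one [Peer] section per other node.

```shell
# Print a minimal wg-quick style configuration to stdout.
# Arguments: own private key, own mesh address (CIDR), peer public key,
#            peer endpoint (host:port), peer mesh address.
render_wg_conf() {
  cat <<EOF
[Interface]
PrivateKey = $1
Address = $2
ListenPort = 51820

[Peer]
PublicKey = $3
Endpoint = $4
AllowedIPs = $5/32
EOF
}

# Placeholder values only; real keys come from `wg genkey` and `wg pubkey`.
render_wg_conf "<private-key-of-this-node>" "10.141.0.1/24" \
  "<public-key-of-peer>" "203.0.113.10:51820" "10.141.0.2"
```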

Mesh Solution Overview

I started working on my spike, and finding the best way to implement such a Mesh topology turned out to be a challenge. First, I used Terraform to deploy a number of EC2 instances in multiple AWS regions. Then I went through my list of options and started building and testing them:

Terraform and Ansible: I successfully created a Mesh, but it was really difficult to manage new peers and automatically update wg0.conf when they joined. I came to the conclusion that this approach is fine for static setups but not for dynamic ones.
Terraform, HashiCorp Vault and a ton of bash scripts: This looked promising. When connecting nodes via WireGuard, each node has to know the public key and endpoint IP of all its peers. In this scenario, nodes that authenticated properly with Vault were allowed to publish their own data and to read the connection data of other peers. They could all read the meeting point data for our Mesh (a data structure containing basic information about the Mesh network), publish their own configuration to Vault, query Vault for other nodes known to the meeting point and add a WireGuard peer for each of them. Although it worked, it was really complex to support and troubleshoot, especially after the handover.
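The last step of that second approach, turning the peer records read from Vault into WireGuard peers, can be sketched roughly as follows. This is not the original script: the record format (public key, endpoint and mesh IP separated by spaces) and the function name are assumptions for illustration, and the function only prints the `wg set` commands instead of running them.

```shell
# Read "public_key endpoint mesh_ip" records from stdin (one peer per line)
# and print a `wg set` command for every peer except ourselves.
# Arguments: interface name, our own public key.
peer_commands() {
  iface="$1"
  own_key="$2"
  while read -r key endpoint mesh_ip; do
    [ -z "$key" ] && continue               # skip blank lines
    [ "$key" = "$own_key" ] && continue     # never add ourselves as a peer
    printf 'wg set %s peer %s endpoint %s allowed-ips %s/32\n' \
      "$iface" "$key" "$endpoint" "$mesh_ip"
  done
}

# Example with made-up records, run from the node that owns KEY_A.
printf '%s\n' \
  'KEY_A 10.0.1.10:51820 10.141.0.1' \
  'KEY_B 10.0.2.20:51820 10.141.0.2' | peer_commands wg0 KEY_A
# prints: wg set wg0 peer KEY_B endpoint 10.0.2.20:51820 allowed-ips 10.141.0.2/32
```

Even in this reduced form, every node has to rerun the step whenever any other node appears or disappears, which is a large part of why the approach was hard to support.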

Then I came across a tool called Netmaker. It was in the early stages of development but looked really promising. (Since then, I have tested all versions, including the current one, 0.18.5, which was released a few days ago with big improvements and fixes.)

What is Netmaker

Netmaker is a platform for creating fast and secure virtual networks with WireGuard. It is a tool for creating and managing virtual overlay networks. If you have at least two machines with internet access that you need to connect with a secure tunnel or thousands of servers spread across multiple locations or cloud providers, Netmaker is the perfect “tool”. It connects machines securely, wherever they are.
Source: Netmaker.org

Now, after this intro, let’s see how we can create a secure Mesh Network on AWS using Netmaker and Wireguard.

How to Install Netmaker

Start by launching a VM running Ubuntu 20.04 or later with a public IP. (Ubuntu is the distribution currently supported.)
Open TCP ports 443 and 80, and UDP ports 51821-51830, on the security group. You can make this range smaller, but keep in mind that you need to have a port for each network you create. (I am going to explain more about networks later.)
Run the following script:



sudo wget -qO /root/nm-quick-interactive.sh https://raw.githubusercontent.com/gravitl/netmaker/master/scripts/nm-quick-interactive.sh && sudo chmod +x /root/nm-quick-interactive.sh && sudo /root/nm-quick-interactive.sh

You need to answer a number of simple questions, and at the end you are presented with the login URL.

After opening the URL, you are asked to create a username and password. Once you log in, you land on the Netmaker dashboard.

Create a network

The first thing we have to do afterwards is to create a Network and enter the IP range that our servers will use for secure cross-communication. (The WireGuard interface, wg0, is going to use an IP address from that range.)

Click the ‘Networks’ tile on the dashboard, or in the left navigation panel click ‘Networks’.

On the Networks screen, click on the ‘Create Network’ button.

Give your network a name, and then enter your preferred CIDR. Alternatively, click on the ‘Autofill’ button and then adjust the name and the CIDR that it generates.

Create the Keys

Then proceed by creating the required keys. When done, we can see that there are multiple ways to add a peer to our Mesh Network.

Create and configure the Nodes

Most of the hard work is done. Now it’s time to launch a few instances in AWS, in multiple regions, and spread them across public and private subnets. In our case almost all instances are in private subnets, with the exception of the Netmaker server and the Azure instance.

I like to use Terraform with GitLab Runners for my test deployments, and for this demo I had about 10 EC2 instances up and running really fast (I was using Spot Instances to minimise costs). Just remember that you need to deploy a standalone On-Demand EC2 instance for Netmaker.

All the Security Groups for the nodes were configured to allow incoming UDP traffic on ports 51820-51830.
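With the AWS CLI, such a rule could be added as sketched below. The security group ID and the source CIDR are placeholders, and the DRY_RUN switch is a convenience I added so that the command can be inspected before it is executed. In a real setup you would probably restrict the source CIDR to the ranges your peers connect from.

```shell
# Run a command, or only print it when DRY_RUN=1 is set.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Allow inbound WireGuard traffic (UDP 51820-51830) on a node security group.
# Arguments: security group ID, source CIDR allowed to connect.
allow_wireguard_udp() {
  run aws ec2 authorize-security-group-ingress \
    --group-id "$1" \
    --protocol udp \
    --port 51820-51830 \
    --cidr "$2"
}

# Placeholder values; with DRY_RUN=1 the command is printed, not sent to AWS.
DRY_RUN=1 allow_wireguard_udp sg-0123456789abcdef0 0.0.0.0/0
```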

With the help of User Data and the commands shown below, we can configure the nodes to join the Mesh during the launch process.
(You need to replace eyJzZXJxxxyxxxxxxxxxxxxxxxxcccccccccccvvvvvvv0000000== with your token.)



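#!/bin/bash
# User Data runs unattended at first boot, so every command must be non-interactive.

# Add the WireGuard repository and install the kernel module and tools.
sudo curl -Lo /etc/yum.repos.d/wireguard.repo https://copr.fedorainfracloud.org/coprs/jdoss/wireguard/repo/epel-7/jdoss-wireguard-epel-7.repo
sudo yum install -y epel-release
sudo amazon-linux-extras install -y epel && sudo yum install -y wireguard-dkms wireguard-tools

# Add the Netmaker package repository and its signing key, then install netclient.
curl -sL 'https://rpm.netmaker.org/gpg.key' | sudo tee /tmp/gpg.key
curl -sL 'https://rpm.netmaker.org/netclient-repo' | sudo tee /etc/yum.repos.d/netclient.repo
sudo rpm --import /tmp/gpg.key
sudo yum check-update
sudo yum install -y netclient

# Join the Mesh with the token generated in Netmaker.
netclient register -t eyJzZXJxxxyxxxxxxxxxxxxxxxxcccccccccccvvvvvvv0000000==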

After a few minutes we have our instances up and running, fully configured with WireGuard and Netclient. (All of them have automatically joined our Mesh network.)

Now let’s launch one more server but this time in… Azure

Time to check our Netmaker GUI and make sure that all nodes have joined. If they don’t show up immediately, there is no need to worry; it can take up to 5 minutes. In our case all nodes are now visible with a Healthy status.

At this point we have successfully deployed and configured a flat Mesh network, not only between AWS instances but also with a server in a different cloud provider. All traffic between them is encrypted in transit, by using Wireguard.

Mesh Graph / Visualisation

Let’s see what our Mesh Network looks like at this stage.

Wireguard Mesh

Netmaker server used as Ingress Node

Test our Mesh Network

How about running some tests to confirm that everything is working as expected?
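One simple check, besides pinging the Mesh IPs of other nodes, is to look at when each peer last completed a handshake. `wg show wg0 latest-handshakes` prints one line per peer with its public key and a Unix timestamp (0 if there has been no handshake yet). The sketch below flags peers whose handshake is older than a threshold; the function name and the 180-second threshold are my own assumptions, and an idle peer without keepalives can legitimately be older than that.

```shell
# Read `wg show <iface> latest-handshakes` output from stdin
# (public key and Unix timestamp per line) and print the peers whose
# last handshake is older than max_age seconds, or that never completed one.
# Arguments: current Unix time, max_age in seconds.
stale_peers() {
  awk -v now="$1" -v max_age="$2" \
    '$2 == 0 || now - $2 > max_age { print $1 }'
}

# On a node (as root):
#   wg show wg0 latest-handshakes | stale_peers "$(date +%s)" 180
# Example with canned input; prints KEY_B and KEY_C, one per line.
printf 'KEY_A\t1700000000\nKEY_B\t1699990000\nKEY_C\t0\n' | stale_peers 1700000060 180
```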

Access Control Lists

By default, Netmaker creates a “full mesh,” meaning every node in our network can talk to every other node. But there is a nice feature that you can use in order to enable/disable any peer-to-peer connection in the network.

The ACL feature can be accessed by either clicking on “ACLs” in the sidebar, or by clicking on a Node in the Node List.

Add External Clients

There are cases where external clients need to access some services running on the nodes. Such a client can be a mobile phone, a laptop or tablet, or an IoT device.

We can achieve that by creating an Ingress. (And once connected to the Ingress, we can reach all servers in the network.)

The next step is to generate the client configs. Clients can then join our Mesh either by scanning a QR code or by importing the WireGuard config. (Please note that a WireGuard client must be installed on the mobile phone, laptop, etc.)

In our case I have downloaded the config to my laptop and connected using the WireGuard client.

For this demo, I have installed Apache on an AWS EC2 instance and on an Azure VM. As you can see, I can access both from my laptop through a secure tunnel, using the 10.141.x.x IPs (the Mesh network CIDR).

Apache running on AWS EC2

Apache running on Azure instance

Conclusion

This is just one use case for Netmaker and WireGuard: creating a secure Mesh Network on AWS. There are more, as you can see below, and we are going to discuss some of them in future posts.

Automate the creation of a large WireGuard-based (Mesh) network
Secure access to a home or office network
Provide remote access to resources like an AWS VPC, or K8S cluster
Create clusters that span environments
Remotely access a cluster from an external source
Remotely access an external source from a cluster
Manage a secure mesh of IoT devices

Hope you found this post useful. Feel free to reach out to me with any questions.

Useful links:
Wireguard
Netmaker
