Clusterception Part 2: Initial Azure setup


Elias Vakkuri

Posted on December 4, 2022


This post is part of a series on running Kafka on Kubernetes on Azure. You can find links to other posts in the series here. All code is available on my GitHub.

In part 2 of the series, I will set up a VNET, Azure Container Registry, and Azure Kubernetes Service (AKS). The end goal is to set up the infrastructure so that we're ready to start deploying Kafka in the next part.


Overview

For AKS, there are a LOT of different settings you can tweak when provisioning your cluster - just look at the length of the resource format definition. Networking in particular offers many options.

To keep this post to a reasonable length, I will focus on AKS, and the following topics:

  • Securing access to AKS's control plane APIs (if there's one thing you secure, make it this one)
  • Integration between AKS and Container Registry
  • Expandability for future services

As I am creating a development cluster with no sensitive data, I don't require production-grade settings. Instead, I'll look at configurations that are simple to implement and use while still increasing the cluster's security.

By the end, my architecture will look like this (diagram created with draw.io and Azure stencil):

Architecture diagram, created with Draw.io and Azure stencil

I have drawn the AKS control plane outside of the VNET, as the control plane's nodes usually live in a separate VNET (and subscription) managed by Microsoft. There is a preview option to deploy the control plane into your own VNET as well, but I don't see a reason for that here.

Using Bicep

I will set up the infrastructure in the Bicep language. I mainly use Terraform in projects, but Bicep looks like a good alternative for Azure-only setups. Compared to the older ARM templates, Bicep has a number of benefits:

  • Easier to read and write: First and foremost, Bicep is much easier to read and write than ARM templates. I foresee much less wasted time figuring out where a curly brace is missing when validation won't pass. Also, the autocomplete and suggestions in VS Code work great.
  • Decorators: You can use parameter decorators as one easy way to document your solution and add validations. In the full templates, I use decorators to add a description and to set a list of allowed values.
  • Named subresources: We create the subnets as part of the VNET declaration, but then we immediately retrieve the subnet resources using the existing keyword. This way, we can refer to the subnets directly, as we'll see when creating the AKS cluster - see the sketch after this list.
  • Simpler modules: When using ARM templates, you need to deploy the linked templates somewhere Azure can reach them. With Bicep, Azure CLI combines all the modules into a single ARM template, so it's enough to have all the templates available locally.
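As a rough sketch of the named-subresource pattern (resource names, API versions, and address ranges here are illustrative, not necessarily the exact ones in my templates):

resource vnet 'Microsoft.Network/virtualNetworks@2022-05-01' = {
  name: vnetName
  location: location
  properties: {
    addressSpace: {
      addressPrefixes: [ '10.0.0.0/16' ]
    }
    subnets: [
      {
        name: 'aks-nodes'
        properties: { addressPrefix: '10.0.0.0/22' }
      }
      {
        name: 'aks-pods'
        properties: { addressPrefix: '10.0.4.0/22' }
      }
    ]
  }
}

// Retrieve the subnets as standalone resources so that, for example,
// aksNodeSubnet.id can be referenced directly elsewhere in the template.
resource aksNodeSubnet 'Microsoft.Network/virtualNetworks/subnets@2022-05-01' existing = {
  parent: vnet
  name: 'aks-nodes'
}

resource aksPodSubnet 'Microsoft.Network/virtualNetworks/subnets@2022-05-01' existing = {
  parent: vnet
  name: 'aks-pods'
}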

One big drawback of Bicep compared to Terraform is that you can only manage Azure resources with Bicep. You need to manage Azure AD objects like users, groups, or service principals in another way, for example directly with Azure CLI. Separating Azure and Azure AD feels like a weird division between services, but I guess Microsoft has its reasons. 🙂

Code

I will not post the full templates in this post, as you can find them on my GitHub. Instead, I'll post smaller snippets relevant to my topics below.

I'll deploy AKS and closely related resources via a separate template. I then call the AKS template from my main template as a module. I also output the AKS cluster's identities from the module and use them to assign the relevant access to our Container Registry. I'll explain this more closely further down.

Let's look at the most relevant pieces of the AKS template.

Linking the Container Registry

The container images I will run in AKS need to come from somewhere. Public images might come from Docker Hub, for example, but private images are usually stored in Azure Container Registry. This connection will need to be set up for AKS to work.

AKS pods and deployments use a kubelet identity to authenticate to container registries. When creating an AKS cluster via Azure CLI, there is the --attach-acr option - this deploys a user-assigned managed identity, assigns it as the kubelet identity in AKS, and gives it the AcrPull role in the Container Registry. The managed identity is created in the cluster's resource group, so users might not have access to the actual resource.

With Bicep, I'll need to manually create and assign the kubelet identity. In addition, the cluster's control plane identity needs specific RBAC rights to manage the kubelet identity. Therefore, I'll use user-assigned managed identities for both. I set the cluster identity in identity and the kubelet identity in properties.identityProfile.

Finally, I'll output the identities so I can use them for role assignments in the main template.
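Condensed, the identity part of the AKS template looks roughly like this (the identity resource names and the output name are my own placeholders, and the Managed Identity Operator role assignment for the cluster identity is omitted):

resource aksClusterIdentity 'Microsoft.ManagedIdentity/userAssignedIdentities@2018-11-30' = {
  name: '${clusterName}-cluster-identity'
  location: location
}

resource aksKubeletIdentity 'Microsoft.ManagedIdentity/userAssignedIdentities@2018-11-30' = {
  name: '${clusterName}-kubelet-identity'
  location: location
}

resource aks 'Microsoft.ContainerService/managedClusters@2022-09-01' = {
  name: clusterName
  location: location
  identity: {
    type: 'UserAssigned'
    userAssignedIdentities: {
      '${aksClusterIdentity.id}': {}
    }
  }
  properties: {
    // other properties omitted
    identityProfile: {
      kubeletidentity: {
        resourceId: aksKubeletIdentity.id
        clientId: aksKubeletIdentity.properties.clientId
        objectId: aksKubeletIdentity.properties.principalId
      }
    }
  }
}

// Exposed for the role assignment in the main template.
output kubeletIdentityObjectId string = aksKubeletIdentity.properties.principalId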

Networking for services

AKS has two main networking modes: Kubenet and Azure CNI. On a very high level, with Kubenet, only nodes get IPs from your subnet, and for pods, some address translation magic happens. With Azure CNI, both nodes and pods get assigned IPs. There are benefits to pods having IPs, like more transparent networking, but it also means that you will burn through your IP ranges much faster and need to plan your networks more carefully.

Previously, only Azure CNI supported using a custom VNET, and Kubenet was suggested only for development and testing environments. Nowadays, Kubenet also supports custom VNETs, and based on the documentation, both are fine for production deployments.

Which to choose? It depends on your specific circumstances. I'm not an expert in the area, so I'll not go into too much detail now. I don't have to worry too much about IP ranges for this blog, so I'll go with Azure CNI.

The only thing to make sure of is that the AKS internal IP ranges don't overlap with the IP ranges in our VNET. The actual values don't matter in our case; I just picked some I saw in the documentation.

We'll use dynamic allocation of IPs for pods. With this option, Azure deploys nodes and pods to separate subnets, and pod IPs are assigned dynamically from their subnet. This has several benefits, as outlined in the documentation. In Bicep, I only need to create a separate subnet and assign it to pods in the agentPoolProfiles section.
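Inside the AKS resource's properties, the networking pieces look roughly like this (node pool settings and CIDRs are example values; aksNodeSubnet and aksPodSubnet are the subnet resources retrieved with the existing keyword):

// inside the managedClusters resource's properties block
agentPoolProfiles: [
  {
    name: 'systempool'
    mode: 'System'
    count: 2
    vmSize: 'Standard_D2s_v3'
    vnetSubnetID: aksNodeSubnet.id // nodes get IPs from this subnet
    podSubnetID: aksPodSubnet.id   // pods get IPs dynamically from this subnet
  }
]
networkProfile: {
  networkPlugin: 'azure'       // Azure CNI
  serviceCidr: '10.100.0.0/16' // must not overlap with the VNET ranges
  dnsServiceIP: '10.100.0.10'
}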

Finally, I'll set publicNetworkAccess to Enabled, as I want to reach the cluster and its services from my laptop. As a side note, there are also separate settings to create a private cluster. I'm not entirely sure how these settings relate to one another - I might investigate this more in a future post.

Authentication: aadProfile and disableLocalAccounts

As mentioned earlier, we want to limit access to AKS's control plane. AKS has a pretty nice-looking integration with Azure AD nowadays, so I'll use that for both authentication and authorization for admin operations.

I enable Azure AD for control plane operations in aadProfile.enableAzureRBAC and disable local Kubernetes admin accounts with disableLocalAccounts. This way, control plane operations are authorized via Azure AD, simplifying maintenance.

Setting aadProfile.managed relates to how AKS links with Azure AD internally. In terms of using the cluster, it shouldn't matter. However, managed is the newer setting, so I'll turn it on.

Finally, in aadProfile.adminGroupObjectIDs, we assign an admin group for the cluster. We provide the object ID of the group as a parameter. You can achieve the same result by assigning the "Azure Kubernetes Service RBAC Cluster Admin" role to any Azure AD identity you wish.
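Put together, the authentication settings look roughly like this (aadAdminGroupObjectId being my placeholder name for the admin group parameter):

// inside the managedClusters resource's properties block
disableLocalAccounts: true
aadProfile: {
  managed: true
  enableAzureRBAC: true
  adminGroupObjectIDs: [
    aadAdminGroupObjectId
  ]
}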

apiServerAccessProfile

I'll set some IP range restrictions for API server access so that kubectl commands can only be run from these IP ranges. I provide the allowed ranges as parameters to the main template and then pass them on to the AKS module.
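As a sketch, with authorizedIpRanges as a placeholder name for the string array parameter:

// inside the managedClusters resource's properties block
apiServerAccessProfile: {
  authorizedIPRanges: authorizedIpRanges
}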

I could also disable access to the control plane from public networks altogether, which would be preferable for production deployments. However, for this series, we're not dealing with anything sensitive, plus I don't want the hassle of creating jump machines or VPN connections. As such, I consider Azure AD authentication plus IP range restriction a good combination.

autoUpgradeProfile

Here, I set automatic upgrades for the Kubernetes version. AKS supports specifying the desired level of automatic version upgrades - I'll go with the suggested "patch" level.
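In the template this boils down to a couple of lines:

// inside the managedClusters resource's properties block
autoUpgradeProfile: {
  upgradeChannel: 'patch'
}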

Deploying AKS as a module

Finally, as mentioned previously, I deploy AKS and related resources as a submodule and assign access rights to the Container Registry for the kubelet identity.
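A rough sketch of the main template side (the module file name, parameter list, and output name are placeholders; the GUID is the built-in AcrPull role definition ID):

resource acr 'Microsoft.ContainerRegistry/registries@2021-09-01' existing = {
  name: acrName
}

module aksModule 'aks.bicep' = {
  name: 'aks-deployment'
  params: {
    // cluster name, subnet IDs, admin group object ID, allowed IP ranges, ...
  }
}

// Built-in AcrPull role, so the kubelet identity can pull images from the registry.
var acrPullRoleDefinitionId = subscriptionResourceId('Microsoft.Authorization/roleDefinitions', '7f951dda-4ed3-4680-a7ca-43fe172d538d')

resource kubeletAcrPull 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  name: guid(acr.id, aksModule.outputs.kubeletIdentityObjectId, acrPullRoleDefinitionId)
  scope: acr
  properties: {
    roleDefinitionId: acrPullRoleDefinitionId
    principalId: aksModule.outputs.kubeletIdentityObjectId
    principalType: 'ServicePrincipal'
  }
}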

What I didn't do

There's a lot more that you could do by tweaking the properties. One thing that might be a good addition would be to enable SSH access to the cluster nodes for maintenance scenarios. However, this also requires securing network access properly, which is outside this post's focus. We'll revisit the topic later on if needed.

Also, additional security options are available, like Microsoft Defender for Cloud. This sounds like a good idea for production deployments, but I don't see it as necessary for this post.


Testing

Let's deploy a service to our cluster to verify that everything is running as expected. We'll follow the Microsoft tutorial.

First, let's log in to the cluster API server with kubelogin and check connectivity:

az aks get-credentials \
  --resource-group clusterception \
  --name clusterceptionaks

KUBECONFIG=<path to kubeconfig> kubelogin convert-kubeconfig

kubectl get services

With that working correctly, let's log in to our Container Registry and push the tutorial frontend image there. This way, we can test connectivity between the cluster and the registry.

az acr login --name clusterceptionacr

docker pull mcr.microsoft.com/azuredocs/azure-vote-front:v1

docker tag \
  mcr.microsoft.com/azuredocs/azure-vote-front:v1 \
  clusterceptionacr.azurecr.io/azure-vote-front:v1

docker push clusterceptionacr.azurecr.io/azure-vote-front:v1

Finally, let's apply the YAML with the Deployment and Service definitions, using the image in our Container Registry as explained in the tutorial instructions.

kubectl apply -f sample-app/azure-vote.yaml

You can get the external IP of the frontend service by running the following:

kubectl get service azure-vote-front

If this command does not return an IP, wait for a few minutes, then try again. Once you have the IP, navigate there, and the voting app front should greet you. Great stuff!


Closing words

What a long article! This really goes to show the depth of configuration options in AKS. Microsoft has done much work to simplify the setup, but for long-term operation and production deployments, you often need a dedicated team that understands all the knobs and levers. This gap in the required level of knowledge is one of the reasons why I usually prefer PaaS services.

In any case, I hope you got something out of this post! Please join me for the next part when we deploy Kafka to our AKS cluster.
