Autoprovisioning NFS volumes in EKS with CDK


Magnus Markling

Posted on July 17, 2022


Table of contents

  1. Introduction
  2. Setting up CDK
  3. Setting up an EKS cluster
  4. Binding a PVC to EBS
  5. Setting up EFS autoprovision
  6. Binding a PVC to EFS
  7. Cleaning up
  8. Summary

Introduction

I recently set up an EKS cluster for a client. From the cluster we wanted to autoprovision EFS volumes based on persistent volume claims. We wanted to do all of this using CDK.

I assume you're already somewhat familiar with CDK, Kubernetes, and EKS, and that you have your ~/.aws/credentials configured.

Setting up CDK

Let's get started!

Create an empty working directory:

mkdir eks-efs && cd eks-efs

Then create a basic CDK TypeScript app:

npx cdk@^2.32 init app --language typescript

If this is the first time you're using CDK on this AWS account, you need to bootstrap it (which creates some S3 buckets etc.) by running:

npx cdk bootstrap

Next, let's add a variable to hold the naming prefix used for all our resources. (If you already have similarly named resources, change it to something else.) Add this to the top of the file, after the imports:

// lib/eks-efs-stack.ts
...
const prefix = "my";
...

Setting up an EKS cluster

Add this (and all code snippets following this one) to the end of the constructor:

// lib/eks-efs-stack.ts
...
    const vpc = new aws_ec2.Vpc(this, `${prefix}-vpc`, {
      vpcName: `${prefix}-vpc`,
    });

    const cluster = new aws_eks.Cluster(this, `${prefix}-eks`, {
      clusterName: `${prefix}-eks`,
      version: aws_eks.KubernetesVersion.V1_21,
      vpc,
    });
...

The explicit VPC isn't strictly needed, but we will use it later.

Now deploy the cluster by running:

npx cdk@^2.32 deploy

Creating all the required resources usually takes around 20-25 minutes.

The result should look something like this (I have removed sensitive information, such as my AWS account number):

 ✅  my-stack

✨  Deployment time: 1373.08s

Outputs:
my-stack.myeksConfigCommand76382EC1 = aws eks update-kubeconfig --name my-eks --region eu-west-1 --role-arn arn:aws:iam::...:role/my-stack-myeksMastersRole...-...
my-stack.myeksGetTokenCommand0DD2F5A8 = aws eks get-token --cluster-name my-eks --region eu-west-1 --role-arn arn:aws:iam::...:role/my-stack-myeksMastersRole...-...
Stack ARN:
arn:aws:cloudformation:eu-west-1:...:stack/my-stack/bbb03590-05cf-11ed-808a-02bd3ce8aca3

Configuring kubectl

Now that we have our cluster up and running, we can administer it in the usual way using kubectl. First we need to add our cluster to our kubeconfig by running the ConfigCommand output value from the previous step. I also recommend adding an --alias flag, as the context name will otherwise be the not-so-memorable cluster ARN.

In my case, I run:

aws eks update-kubeconfig --name my-eks --region eu-west-1 --role-arn arn:aws:iam::...:role/my-stack-myeksMastersRoleD2A59038-47WE3AI9OHS3 --alias my-eks

Followed by some simple verification, e.g.:

kubectl get pod -A

And get:

NAMESPACE     NAME                       READY   STATUS    RESTARTS   AGE
kube-system   aws-node-nmwzs             1/1     Running   0          36m
kube-system   aws-node-r2hjk             1/1     Running   0          36m
kube-system   coredns-7cc879f8db-8ntfk   1/1     Running   0          42m
kube-system   coredns-7cc879f8db-dsrs9   1/1     Running   0          42m
kube-system   kube-proxy-247kc           1/1     Running   0          36m
kube-system   kube-proxy-6g45p           1/1     Running   0          36m

Supported Kubernetes versions

You might have noticed that I chose Kubernetes 1.21 here, which is pretty old. At the time of writing, this is unfortunately the newest version of Kubernetes available with CDK. (When using the console, you can go up to 1.22.) Even though 1.21 is no longer supported by the Kubernetes project, I recently learned that it's still supported by AWS themselves, meaning they backport security fixes.
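If you want to target a version that doesn't yet have a named constant in your CDK release, aws_eks.KubernetesVersion.of() accepts an arbitrary version string. Here's a minimal sketch, assuming EKS and the CDK cluster handler actually support the version you pass in:

// lib/eks-efs-stack.ts (alternative sketch, not used in the rest of this post)
...
    const cluster = new aws_eks.Cluster(this, `${prefix}-eks`, {
      clusterName: `${prefix}-eks`,
      // KubernetesVersion.of() takes any version string; whether it works is up to EKS/CDK.
      version: aws_eks.KubernetesVersion.of("1.22"),
      vpc,
    });
...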

Binding a PVC to EBS

EBS autoprovisioning support is already built into EKS, via a storage class called gp2.

If we create a persistent volume claim (PVC):

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ebs-rwo
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2
  resources:
    requests:
      storage: 1Gi
EOF

And a pod that uses the PVC:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: ebs-rwo
spec:
  containers:
    - name: nginx
      image: nginx
      volumeMounts:
        # Mount the claimed volume into the container (path chosen for illustration)
        - name: persistent-storage
          mountPath: /data
  volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: ebs-rwo
EOF

Wait a few seconds, then check the status of the new PVC:

kubectl get pvc

NAME      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS
ebs-rwo   Bound    pvc-71606839-b2d9-4f36-8e1c-942b8d7e38f1   1Gi        RWO            gp2

And the associated newly created persistent volume (PV):

kubectl get pv

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM             STORAGECLASS
pvc-71606839-b2d9-4f36-8e1c-942b8d7e38f1   1Gi        RWO            Delete           Bound    default/ebs-rwo   gp2

However, there are some reasons why we might not want to use EBS for persistent storage. One of them is that EBS doesn't support the ReadWriteMany access mode. EFS to the rescue!

Setting up EFS autoprovision

EFS, however, is not supported out of the box the way EBS is, and needs some setup to work as smoothly. There are complete instructions on the AWS website, but they use imperative CLI commands that don't fit very well with a modern Infrastructure-as-Code (IaC) approach.

Adding an EFS filesystem

First we need to create a network security group:

// lib/eks-efs-stack.ts
...
    const efsInboundSecurityGroup = new aws_ec2.SecurityGroup(
      this,
      `${prefix}-efs-inbound-security-group`,
      {
        securityGroupName: `${prefix}-efs-inbound`,
        vpc,
      },
    );
...

With an ingress rule that allows for incoming EFS traffic:

// lib/eks-efs-stack.ts
...
    new aws_ec2.CfnSecurityGroupIngress(
      this,
      `${prefix}-efs-inbound-security-group-ingress`,
      {
        ipProtocol: "tcp",
        cidrIp: vpc.vpcCidrBlock,
        groupId: efsInboundSecurityGroup.securityGroupId,
        description: "Inbound EFS",
        fromPort: 2049,
        toPort: 2049,
      },
    );
...

Port 2049 is the standard NFS port, which is what EFS uses. (fromPort and toPort simply define the port range as 2049 to 2049; they are not to be confused with source and destination ports in the TCP sense.)

Then we create the EFS file system:

// lib/eks-efs-stack.ts
...
    const efs_fs = new aws_efs.FileSystem(this, `${prefix}-efs`, {
      fileSystemName: `${prefix}`,
      vpc,
      securityGroup: efsInboundSecurityGroup,
      removalPolicy: RemovalPolicy.DESTROY,
    });
...

We specify RemovalPolicy.DESTROY to make sure the filesystem is deleted together with the rest of the CDK stack.
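As an aside, the explicit security group and ingress rule aren't the only way to do this. If you leave out the securityGroup prop, the FileSystem construct creates its own security group, and its connections helper (whose default port is 2049) can open up NFS traffic. A minimal sketch of that alternative, meant to replace the security group code above rather than sit alongside it:

// lib/eks-efs-stack.ts (alternative sketch)
...
    const efs_fs = new aws_efs.FileSystem(this, `${prefix}-efs`, {
      fileSystemName: `${prefix}`,
      vpc,
      removalPolicy: RemovalPolicy.DESTROY,
    });
    // The construct's connections default to port 2049 (NFS), so this opens
    // inbound EFS traffic from anywhere inside the VPC.
    efs_fs.connections.allowDefaultPortFrom(
      aws_ec2.Peer.ipv4(vpc.vpcCidrBlock),
      "Inbound EFS",
    );
...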

Adding an IAM role

First we create a policy that grants the permissions required to view, create, and delete EFS access points:

// lib/eks-efs-stack.ts
...
    const efsAllowPolicyDocument = aws_iam.PolicyDocument.fromJson({
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "elasticfilesystem:DescribeAccessPoints",
            "elasticfilesystem:DescribeFileSystems",
          ],
          "Resource": "*",
        },
        {
          "Effect": "Allow",
          "Action": [
            "elasticfilesystem:CreateAccessPoint",
          ],
          "Resource": "*",
          "Condition": {
            "StringLike": {
              "aws:RequestTag/efs.csi.aws.com/cluster": "true",
            },
          },
        },
        {
          "Effect": "Allow",
          "Action": "elasticfilesystem:DeleteAccessPoint",
          "Resource": "*",
          "Condition": {
            "StringEquals": {
              "aws:ResourceTag/efs.csi.aws.com/cluster": "true",
            },
          },
        },
      ],
    });
...

Next we specify who can assume this role (what's called Trust relationships in the AWS console):

// lib/eks-efs-stack.ts
...
    const efsAssumeRolePolicyDocument = aws_iam.PolicyDocument.fromJson({
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Federated": cluster.openIdConnectProvider.openIdConnectProviderArn,
          },
          "Action": "sts:AssumeRoleWithWebIdentity",
          "Condition": {
            "StringEquals": new CfnJson(
              this,
              `${prefix}-assume-role-policy-document-string-equals-value`,
              {
                value: {
                  [`${cluster.openIdConnectProvider.openIdConnectProviderIssuer}:aud`]:
                    "sts.amazonaws.com",
                  [`${cluster.openIdConnectProvider.openIdConnectProviderIssuer}:sub`]:
                    "system:serviceaccount:kube-system:efs-csi-controller-sa",
                },
              },
            ),
          },
        },
      ],
    });
...

We reference the OIDC provider created together with the cluster. We also use the special CfnJson construct to be able to use dynamic keys in the JSON, since CloudFormation can't otherwise resolve tokens that appear inside object keys. (This took quite a while to figure out.)

Finally we tie them all together by creating the IAM role:

// lib/eks-efs-stack.ts
...
    const efsRole = new aws_iam.CfnRole(
      this,
      `${prefix}-efs-csi-controller-sa-role`,
      {
        roleName: `${prefix}-efs-csi-controller-sa-role`,
        assumeRolePolicyDocument: efsAssumeRolePolicyDocument,
        policies: [
          {
            policyName: `${prefix}-efs-csi-controller-sa-policy`,
            policyDocument: efsAllowPolicyDocument,
          },
        ],
      },
    );
...

Installing the EFS CSI driver

Now we need to install the EFS CSI driver. This is a Kubernetes controller that watches for unbound PVCs and creates the corresponding PVs and EFS access points. The easiest way to install it is via its Helm chart:

// lib/eks-efs-stack.ts
...
    cluster.addHelmChart(`${prefix}-aws-efs-csi-driver`, {
      repository: "https://kubernetes-sigs.github.io/aws-efs-csi-driver/",
      chart: "aws-efs-csi-driver",
      version: "2.2.7",
      release: "aws-efs-csi-driver",
      namespace: "kube-system",
      values: {
        controller: {
          serviceAccount: {
            annotations: {
              "eks.amazonaws.com/role-arn": efsRole.attrArn,
            },
          },
        },
      },
    });
...

eks.amazonaws.com/role-arn is a special annotation recognized by EKS that associates the service account with the specified IAM role. This means all pods/containers running under that service account can call AWS APIs as that role. (Pretty much the same way you would associate an IAM role with an EC2 instance.)
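Side note: CDK also offers a higher-level helper, cluster.addServiceAccount(), which creates both the Kubernetes service account and an IAM role with the matching trust policy, so you could skip the hand-written assume-role document. A rough sketch of how that might look; the chart values for disabling the chart-managed service account (controller.serviceAccount.create/name) are assumptions here, so check the chart's values.yaml before relying on them:

// lib/eks-efs-stack.ts (alternative sketch)
...
    // Creates the in-cluster service account plus an IAM role whose trust
    // policy is tied to the cluster's OIDC provider.
    const efsCsiSa = cluster.addServiceAccount(`${prefix}-efs-csi-sa`, {
      name: "efs-csi-controller-sa",
      namespace: "kube-system",
    });
    // Conditions from the policy document above omitted for brevity.
    efsCsiSa.addToPrincipalPolicy(
      new aws_iam.PolicyStatement({
        actions: [
          "elasticfilesystem:DescribeAccessPoints",
          "elasticfilesystem:DescribeFileSystems",
          "elasticfilesystem:CreateAccessPoint",
          "elasticfilesystem:DeleteAccessPoint",
        ],
        resources: ["*"],
      }),
    );
    // The Helm chart would then need to reuse this service account instead of
    // creating its own (value keys assumed): controller.serviceAccount.create
    // = false, controller.serviceAccount.name = efsCsiSa.serviceAccountName.
...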

Adding the storage class

The last thing we need to do is add a Kubernetes storage class (SC) for our PVCs to use:

// lib/eks-efs-stack.ts
...
    cluster.addManifest(`${prefix}-k8s-efs-storageclass`, {
      apiVersion: "storage.k8s.io/v1",
      kind: "StorageClass",
      metadata: {
        name: "efs",
      },
      provisioner: "efs.csi.aws.com",
      parameters: {
        directoryPerms: "700",
        provisioningMode: "efs-ap",
        fileSystemId: efs_fs.fileSystemId,
        uid: "1001",
        gid: "1001",
      },
    });
...

Notice how the SC references our newly created EFS filesystem and specifies some Linux-related parameters (the directory permissions and the uid/gid that new volumes are created with).

Binding a PVC to EFS

Let's add a PVC which uses the SC:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-rwx
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs
  resources:
    requests:
      storage: 1Gi
EOF

And a pod that uses the PVC:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: efs-rwx
spec:
  containers:
    - name: nginx
      image: nginx
      volumeMounts:
        # Mount the claimed volume into the container (path chosen for illustration)
        - name: persistent-storage
          mountPath: /data
  volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: efs-rwx
EOF

Now check the status of the new PVC:

kubectl get pvc

NAME      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS
ebs-rwo   Bound    pvc-71606839-b2d9-4f36-8e1c-942b8d7e38f1   1Gi        RWO            gp2
efs-rwx   Bound    pvc-5743b62a-1e1a-4b0b-b930-921465fd9d9b   1Gi        RWX            efs

And the associated newly created PV:

kubectl get pv

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM             STORAGECLASS
pvc-71606839-b2d9-4f36-8e1c-942b8d7e38f1   1Gi        RWO            Delete           Bound    default/ebs-rwo   gp2
pvc-5743b62a-1e1a-4b0b-b930-921465fd9d9b   1Gi        RWX            Delete           Bound    default/efs-rwx   efs

Notice that the new one is bound, uses SC efs and has access mode RWX (ReadWriteMany).

We can also view the EFS access point that was created:

aws efs describe-access-points

AccessPoints:
- AccessPointArn: arn:aws:elasticfilesystem:eu-west-1:...:access-point/fsap-082fd6323a789739d
  AccessPointId: fsap-082fd6323a789739d
  ClientToken: pvc-5743b62a-1e1a-4b0b-b930-921465fd9d9b
  FileSystemId: fs-0a724fe641cbd9da7
  LifeCycleState: available
  OwnerId: '...'
  PosixUser:
    Gid: 1001
    Uid: 1001
  RootDirectory:
    CreationInfo:
      OwnerGid: 1001
      OwnerUid: 1001
      Permissions: '700'
    Path: /pvc-5743b62a-1e1a-4b0b-b930-921465fd9d9b
  Tags:
  - Key: efs.csi.aws.com/cluster
    Value: 'true'

Cleaning up

In order to avoid further costs, let's clean up the resources we just created. This is as simple as:

npx cdk destroy

Again, this will take around 20-25 minutes.

Summary

That's it! We set up autoprovisioning of NFS volumes in EKS using CDK. Full working code will soon be available on GitHub.

Any comments or questions are welcome!
