Autoprovisioning NFS volumes in EKS with CDK
Magnus Markling
Posted on July 17, 2022
Table of contents
- Introduction
- Setting up CDK
- Setting up an EKS cluster
- Binding a PVC to EBS
- Setting up EFS autoprovision
- Binding a PVC to EFS
- Cleaning up
- Summary
Introduction
I recently set up an EKS cluster for a client. From the cluster we wanted to autoprovision EFS volumes based on persistent volume claims. We wanted to do all of this using CDK.
I assume you're already somewhat familiar with CDK, Kubernetes, and EKS, and that you have your ~/.aws/credentials configured.
Setting up CDK
Let's get started!
Create an empty working directory:
mkdir eks-efs && cd eks-efs
Then create a basic CDK TypeScript app:
npx cdk@^2.32 init app --language typescript
If this is the first time you're using CDK on this AWS account, you need to bootstrap it (which creates some S3 buckets etc.) by running:
npx cdk bootstrap
Next, let's add a variable to hold the naming prefix used for all our resources. (If you already have similarly named resources, change it to something else.) Add this to the top of lib/eks-efs-stack.ts, after the imports:
// lib/eks-efs-stack.ts
...
const prefix = "my";
...
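For reference, the snippets in this post assume roughly the following imports from aws-cdk-lib at the top of the same file. Your generated stack file already contains some of them, so treat this as a sketch and adjust as needed:

// lib/eks-efs-stack.ts
import {
  CfnJson,
  RemovalPolicy,
  Stack,
  StackProps,
  aws_ec2,
  aws_efs,
  aws_eks,
  aws_iam,
} from "aws-cdk-lib";
import { Construct } from "constructs";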
Setting up an EKS cluster
Add this (and all code snippets following this one) to the end of the constructor:
// lib/eks-efs-stack.ts
...
const vpc = new aws_ec2.Vpc(this, `${prefix}-vpc`, {
vpcName: `${prefix}-vpc`,
});
const cluster = new aws_eks.Cluster(this, `${prefix}-eks`, {
clusterName: `${prefix}-eks`,
version: aws_eks.KubernetesVersion.V1_21,
vpc,
});
...
The explicit VPC is not strictly needed, but we will use it later.
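If you would rather reuse an existing VPC than create a new one, a lookup along these lines should also work. This is just a sketch: the VPC name is hypothetical, and Vpc.fromLookup requires the stack to have an explicit account/region env.

// lib/eks-efs-stack.ts (alternative sketch: import an existing VPC)
const vpc = aws_ec2.Vpc.fromLookup(this, `${prefix}-vpc`, {
  vpcName: "my-existing-vpc", // hypothetical name; matched against the VPC's Name tag
});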
Now deploy the cluster by running:
npx cdk@^2.32 deploy
Creating all the required resources usually takes around 20-25 minutes.
The result should look something like this (I have removed sensitive information, such as my AWS account number):
✅  my-stack

✨  Deployment time: 1373.08s
Outputs:
my-stack.myeksConfigCommand76382EC1 = aws eks update-kubeconfig --name my-eks --region eu-west-1 --role-arn arn:aws:iam::...:role/my-stack-myeksMastersRole...-...
my-stack.myeksGetTokenCommand0DD2F5A8 = aws eks get-token --cluster-name my-eks --region eu-west-1 --role-arn arn:aws:iam::...:role/my-stack-myeksMastersRole...-...
Stack ARN:
arn:aws:cloudformation:eu-west-1:...:stack/my-stack/bbb03590-05cf-11ed-808a-02bd3ce8aca3
Configuring kubectl
Now that we have our cluster up and running, we can administer it in the usual way using kubectl. First we need to add our cluster to our kubeconfig by running the ConfigCommand output value from the previous step. I also recommend adding an --alias flag, as the context name will otherwise be the not-so-memorable cluster ARN.
In my case, I run:
aws eks update-kubeconfig --name my-eks --region eu-west-1 --role-arn arn:aws:iam::...:role/my-stack-myeksMastersRoleD2A59038-47WE3AI9OHS3 --alias my-eks
Followed by some simple verification, e.g.:
kubectl get pod -A
And get:
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system aws-node-nmwzs 1/1 Running 0 36m
kube-system aws-node-r2hjk 1/1 Running 0 36m
kube-system coredns-7cc879f8db-8ntfk 1/1 Running 0 42m
kube-system coredns-7cc879f8db-dsrs9 1/1 Running 0 42m
kube-system kube-proxy-247kc 1/1 Running 0 36m
kube-system kube-proxy-6g45p 1/1 Running 0 36m
Supported Kubernetes versions
You might have noticed that I chose Kubernetes 1.21 here, which is pretty old. At the time of writing, this is unfortunately the newest version available as a constant in CDK. (When using the console, you can go up to 1.22.) Even though 1.21 is no longer supported by the Kubernetes project, I recently learned that it's still supported by AWS, meaning they backport security fixes.
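As an aside, if you ever need a version that CDK doesn't expose as a constant yet, the aws_eks.KubernetesVersion.of() escape hatch should work in place of the cluster definition above, provided EKS itself supports that version. A sketch:

// lib/eks-efs-stack.ts (sketch: pin a version CDK has no constant for)
const cluster = new aws_eks.Cluster(this, `${prefix}-eks`, {
  clusterName: `${prefix}-eks`,
  version: aws_eks.KubernetesVersion.of("1.22"), // must be a version EKS actually supports
  vpc,
});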
Binding a PVC to EBS
EBS autoprovisioning support is already built into EKS, via a storage class called gp2.
If we create a persistent volume claim (PVC):
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ebs-rwo
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2
  resources:
    requests:
      storage: 1Gi
EOF
And a pod that uses the PVC:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: ebs-rwo
spec:
  containers:
    - name: nginx
      image: nginx
  volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: ebs-rwo
EOF
Wait a few seconds, then check the status of the new PVC:
kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS
ebs-rwo Bound pvc-71606839-b2d9-4f36-8e1c-942b8d7e38f1 1Gi RWO gp2
And the associated newly created persistent volume (PV):
kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS
pvc-71606839-b2d9-4f36-8e1c-942b8d7e38f1 1Gi RWO Delete Bound default/ebs-rwo gp2
However, there are some reasons why we might not want to use EBS for persistent storage. One of them is that EBS doesn't support the ReadWriteMany access mode. EFS to the rescue!
Setting up EFS autoprovision
However, EFS is not supported out of the box the way EBS is, and needs some setup to work as smoothly. There are complete instructions on the AWS website, but they use imperative CLI commands that don't fit very well with a modern Infrastructure-as-Code (IaC) approach.
Adding an EFS filesystem
First we need to create a network security group:
// lib/eks-efs-stack.ts
...
const efsInboundSecurityGroup = new aws_ec2.SecurityGroup(
this,
`${prefix}-efs-inbound-security-group`,
{
securityGroupName: `${prefix}-efs-inbound`,
vpc,
},
);
...
With an ingress rule that allows incoming EFS traffic:
// lib/eks-efs-stack.ts
...
new aws_ec2.CfnSecurityGroupIngress(
this,
`${prefix}-efs-inbound-security-group-ingress`,
{
ipProtocol: "tcp",
cidrIp: vpc.vpcCidrBlock,
groupId: efsInboundSecurityGroup.securityGroupId,
description: "Inbound EFS",
fromPort: 2049,
toPort: 2049,
},
);
...
Port 2049 is the standard NFS port, which EFS uses. (fromPort and toPort simply define the port range as 2049 to 2049; they are not to be confused with source and destination ports in the world of TCP.)
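As a side note, the same rule can be expressed with the higher-level addIngressRule helper instead of the CfnSecurityGroupIngress escape hatch; something like this sketch should be equivalent:

// lib/eks-efs-stack.ts (equivalent sketch using the L2 construct)
efsInboundSecurityGroup.addIngressRule(
  aws_ec2.Peer.ipv4(vpc.vpcCidrBlock), // same CIDR as above
  aws_ec2.Port.tcp(2049),              // NFS/EFS port
  "Inbound EFS",
);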
Then we create the EFS file system:
// lib/eks-efs-stack.ts
...
const efs_fs = new aws_efs.FileSystem(this, `${prefix}-efs`, {
fileSystemName: `${prefix}`,
vpc,
securityGroup: efsInboundSecurityGroup,
removalPolicy: RemovalPolicy.DESTROY,
});
...
We specify RemovalPolicy.DESTROY to make sure the filesystem is deleted together with the rest of the CDK stack.
Adding an IAM role
First we create a policy with the permissions required to view, create, and delete EFS access points:
// lib/eks-efs-stack.ts
...
const efsAllowPolicyDocument = aws_iam.PolicyDocument.fromJson({
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"elasticfilesystem:DescribeAccessPoints",
"elasticfilesystem:DescribeFileSystems",
],
"Resource": "*",
},
{
"Effect": "Allow",
"Action": [
"elasticfilesystem:CreateAccessPoint",
],
"Resource": "*",
"Condition": {
"StringLike": {
"aws:RequestTag/efs.csi.aws.com/cluster": "true",
},
},
},
{
"Effect": "Allow",
"Action": "elasticfilesystem:DeleteAccessPoint",
"Resource": "*",
"Condition": {
"StringEquals": {
"aws:ResourceTag/efs.csi.aws.com/cluster": "true",
},
},
},
],
});
...
Next we specify who can assume the role we're about to create (what's called Trust relationships in the AWS console):
// lib/eks-efs-stack.ts
...
const efsAssumeRolePolicyDocument = aws_iam.PolicyDocument.fromJson({
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": cluster.openIdConnectProvider.openIdConnectProviderArn,
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": new CfnJson(
this,
`${prefix}-assume-role-policy-document-string-equals-value`,
{
value: {
[`${cluster.openIdConnectProvider.openIdConnectProviderIssuer}:aud`]:
"sts.amazonaws.com",
[`${cluster.openIdConnectProvider.openIdConnectProviderIssuer}:sub`]:
"system:serviceaccount:kube-system:efs-csi-controller-sa",
},
},
),
},
},
],
});
...
We reference the OIDC provider created together with the cluster. We also use the special CfnJson construct to be able to use dynamic keys in the JSON. (This took quite a while to figure out.)
Finally we tie them all together by creating the IAM role:
// lib/eks-efs-stack.ts
...
const efsRole = new aws_iam.CfnRole(
this,
`${prefix}-efs-csi-controller-sa-role`,
{
roleName: `${prefix}-efs-csi-controller-sa-role`,
assumeRolePolicyDocument: efsAssumeRolePolicyDocument,
policies: [
{
policyName: `${prefix}-efs-csi-controller-sa-policy`,
policyDocument: efsAllowPolicyDocument,
},
],
},
);
...
Installing the EFS CSI driver
Now we need to install the EFS CSI driver. This is a Kubernetes controller that watches for unbound PVCs and creates PVs and EFS access points. The easiest way to install it is via its Helm chart:
// lib/eks-efs-stack.ts
...
cluster.addHelmChart(`${prefix}-aws-efs-csi-driver`, {
repository: "https://kubernetes-sigs.github.io/aws-efs-csi-driver/",
chart: "aws-efs-csi-driver",
version: "2.2.7",
release: "aws-efs-csi-driver",
namespace: "kube-system",
values: {
controller: {
serviceAccount: {
annotations: {
"eks.amazonaws.com/role-arn": efsRole.attrArn,
},
},
},
},
});
...
eks.amazonaws.com/role-arn is a special annotation recognized by EKS that associates the service account with the specified IAM role. This means all pods/containers running under that service account can call AWS APIs as that role. (Pretty much the same way you would associate an IAM role with an EC2 instance.)
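As an alternative sketch (not what this post does), CDK can also create the service account and its IAM role for you via cluster.addServiceAccount, which hides the CfnJson/OIDC wiring shown earlier; you would then tell the Helm chart not to create its own service account:

// lib/eks-efs-stack.ts (alternative sketch, shown for comparison only)
const efsCsiSa = cluster.addServiceAccount(`${prefix}-efs-csi-sa`, {
  name: "efs-csi-controller-sa",
  namespace: "kube-system",
});
efsCsiSa.role.attachInlinePolicy(
  new aws_iam.Policy(this, `${prefix}-efs-csi-sa-policy`, {
    document: efsAllowPolicyDocument,
  }),
);
// In the chart values you would then disable the chart-managed service account
// (for this chart that should be controller.serviceAccount.create: false and
// controller.serviceAccount.name: "efs-csi-controller-sa") instead of setting
// the role-arn annotation above; check the chart's values for the exact keys.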
Adding the storage class
The last thing we need to do is add a Kubernetes storage class (SC) for our PVCs to use:
// lib/eks-efs-stack.ts
...
cluster.addManifest(`${prefix}-k8s-efs-storageclass`, {
apiVersion: "storage.k8s.io/v1",
kind: "StorageClass",
metadata: {
name: "efs",
},
provisioner: "efs.csi.aws.com",
parameters: {
directoryPerms: "700",
provisioningMode: "efs-ap",
fileSystemId: efs_fs.fileSystemId,
uid: "1001",
gid: "1001",
},
});
...
Notice how the SC references our newly created EFS file system and specifies some Linux-related parameters (directory permissions, uid, and gid).
Binding a PVC to EFS
Let's add a PVC which uses the SC:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-rwx
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs
  resources:
    requests:
      storage: 1Gi
EOF
And a pod that uses the PVC:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: efs-rwx
spec:
  containers:
    - name: nginx
      image: nginx
  volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: efs-rwx
EOF
Now check the status of the new PVC:
kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS
ebs-rwo Bound pvc-71606839-b2d9-4f36-8e1c-942b8d7e38f1 1Gi RWO gp2
efs-rwx Bound pvc-5743b62a-1e1a-4b0b-b930-921465fd9d9b 1Gi RWX efs
And the associated newly created PV:
kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS
pvc-71606839-b2d9-4f36-8e1c-942b8d7e38f1 1Gi RWO Delete Bound default/ebs-rwo gp2
pvc-5743b62a-1e1a-4b0b-b930-921465fd9d9b 1Gi RWX Delete Bound default/efs-rwx efs
Notice that the new one is bound, uses the efs SC, and has access mode RWX (ReadWriteMany).
We can also view the EFS access point that was created:
aws efs describe-access-points
AccessPoints:
- AccessPointArn: arn:aws:elasticfilesystem:eu-west-1:...:access-point/fsap-082fd6323a789739d
AccessPointId: fsap-082fd6323a789739d
ClientToken: pvc-5743b62a-1e1a-4b0b-b930-921465fd9d9b
FileSystemId: fs-0a724fe641cbd9da7
LifeCycleState: available
OwnerId: '...'
PosixUser:
Gid: 1001
Uid: 1001
RootDirectory:
CreationInfo:
OwnerGid: 1001
OwnerUid: 1001
Permissions: '700'
Path: /pvc-5743b62a-1e1a-4b0b-b930-921465fd9d9b
Tags:
- Key: efs.csi.aws.com/cluster
Value: 'true'
Cleaning up
In order to avoid further costs, let's clean up the resources we just created. This is as simple as:
npx cdk destroy
Again, this will take around 20-25 minutes.
Summary
That's it! We set up autoprovisioning of NFS volumes in EKS using CDK. Full working code will soon be available on GitHub.
Any comments or questions are welcome!