Mount S3 Objects to Kubernetes Pods

antweiss

Ant(on) Weiss

Posted on January 31, 2022

Mount S3 Objects to Kubernetes Pods

This post describes how to mount an S3 bucket to all the nodes in an EKS cluster and make it available to pods as a hostPath volume. Yes, we're aware of the security implications of hostPath volumes, but in this case it's less of an issue - because the actual access is granted to the S3 bucket (not the host filesystem) and access permissions are provided per serviceAccount.

Goofys

We're using goofys as the mounting utility. It's a "high-performance, POSIX-ish Amazon S3 file system written in Go" based on FUSE (file system in user space) technology.

Daemonset

In order to provide the mount transparently we need to run a daemonset - so the mount is created on all nodes in the cluster.

The Dockerfile and the Helm Chart

We've built our own goofys Docker image based on Alpine Linux and a Helm chart that installs the DaemonSet.

The image is found on our Docker hub repo here: https://hub.docker.com/r/otomato/goofys

The Dockerfile and the Helm chart can be found here: https://github.com/otomato-gh/s3-mounter

S3 Access per ServiceAccount

The Helm chart currently assumes that S3 access is provided by using an IAM Role attached to a kubernetes serviceAccount. We may add API access keys support in the future if needed.

HowTo:

Here's how to set it all up:

1. OIDC Provider for EKS

Make sure you have an IAM OIDC identity provider for your cluster. If not - you can use the following commands (you'll need eksctl installed):

aws eks describe-cluster --name cluster_name --query "cluster.identity.oidc.issuer" --output text`
Enter fullscreen mode Exit fullscreen mode

Example output:

https://oidc.eks.region-code.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E

List the IAM OIDC providers in your account. Replace EXAMPLED539D4633E53DE1B716D3041E with the value returned from the previous command.

aws iam list-open-id-connect-providers | grep EXAMPLED539D4633E53DE1B716D3041E
Enter fullscreen mode Exit fullscreen mode

Example output

"Arn": "arn:aws:iam::111122223333:oidc-provider/oidc.eks.region-code.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E"

If output is returned from the previous command, then you already have a provider for your cluster. If no output is returned, then you must create an IAM OIDC provider with the following command. Replace cluster_name with your own value.

eksctl utils associate-iam-oidc-provider --cluster cluster_name --approve
Enter fullscreen mode Exit fullscreen mode

2. Create a Managed Policy for Bucket Access

Create json file named policy.json with the appropriate policy definition. For example - the following code snippet creates a json file that allows full access to a bucket named my-kubernetes-bucket:

read -r -d '' MY_POLICY <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:*",
            ],
            "Resource": [
                "arn:aws:s3:::my-kubernetes-bucket"
            ]
        }
    ]
}
EOF
echo "${MY_POLICY}" > policy.json
Enter fullscreen mode Exit fullscreen mode

Create the managed policy by running:

aws iam create-policy --policy-name kubernetes-s3-access --policy-document file://policy.json
Enter fullscreen mode Exit fullscreen mode

Example output:

{
"Policy": {
"PolicyName": "kubernetes-s3-access",
"PolicyId": "ANPAS3DOMWSIX73USJOHK",
"Arn": "arn:aws:iam::04968064045764:policy/kubernetes-s3-access",

Note the policy ARN for the next step.

3. Create a Role for S3 Access

Set your AWS account ID to an environment variable with the following command:

ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)
Enter fullscreen mode Exit fullscreen mode

Set your OIDC identity provider to an environment variable with the following command. Replace the example values with your own values:

OIDC_PROVIDER=$(aws eks describe-cluster --name cluster-name --query "cluster.identity.oidc.issuer" --output text | sed -e "s/^https:\/\///")
Enter fullscreen mode Exit fullscreen mode

Copy the following code block to your computer and replace the example values with your own values.

read -r -d '' TRUST_RELATIONSHIP <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:my-namespace:my-service-account"
        }
      }
    }
  ]
}
EOF
echo "${TRUST_RELATIONSHIP}" > trust.json
Enter fullscreen mode Exit fullscreen mode

Run the modified code block from the previous step to create a file named trust.json.

Run the following AWS CLI command to create the role:

aws iam create-role --role-name eks-otomounter-role --assume-role-policy-document file://trust.json --description "Mount s3 bucket to EKS"
Enter fullscreen mode Exit fullscreen mode

Run the following command to attach the IAM policy using the ARN created in the previous section to your role:

aws iam attach-role-policy --role-name eks-otomounter-role --policy-arn=IAM_POLICY_ARN
Enter fullscreen mode Exit fullscreen mode

4. Finally - Install the S3 Mounter!

  • Add the helm repo to your repo list:
helm repo add otomount https://otomato-gh.github.io/s3-mounter
Enter fullscreen mode Exit fullscreen mode
  • Inspect its arguments in values.yaml
helm show values otomount/s3-otomount
Enter fullscreen mode Exit fullscreen mode

The values you want to set are in the end:

bucketName: my-bucket
iamRoleARN: my-role
mountPath: /var/s3
hostPath: /mnt/s3data
Enter fullscreen mode Exit fullscreen mode
  • Install the chart by providing your own values:
helm upgrade --install s3-mounter otomount/s3-otomount   \ 
--namespace otomount --set bucketName=<your-bucket-name> \
--set iamRoleARN=<your-role-arn> --create-namespace
Enter fullscreen mode Exit fullscreen mode

This will use the default hostPath for the mount - i.e /mnt/s3data

5. Use the mounted S3 bucket in your Deployments.

Here's an example pod definition that provides its container the access to the mounted bucket:

apiVersion: v1
kind: Pod
metadata:
  name: sleeper
spec:
  containers:
  - command:
    - sleep
    - infinity
    image: ubuntu
    name: ubuntu
    volumeMounts:
    - mountPath: /mydata:shared
      name: s3data
  volumes:
  - hostPath:
      path: /mnt/s3data
    name: s3data
Enter fullscreen mode Exit fullscreen mode

Note the :shared - it's a mount propagation modifier in the mountPath field that allows this volume to be shared by multiple pods/containers on the same node.

And that's it! You can now access your bucket. If you've created the pod from our example - you can exec to verify:

kubectl exec sleeper -- ls /mydata
Enter fullscreen mode Exit fullscreen mode

Note: running this on your cluster will cost you a few additional $ for S3 API calls that goofys performs to maintain the mount. So remember to monitor your cloud costs. But you should do that anyway, right?

Happy delivering!

💖 💪 🙅 🚩
antweiss
Ant(on) Weiss

Posted on January 31, 2022

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related