MongoDB Backups with Kubernetes Jobs
Joseph D. Marhee
Posted on December 30, 2018
I recently began running RocketChat on Kubernetes, and a key component of this deployment was a MongoDB replica set. Running MongoDB reliably calls for both replication and regular backups, and Kubernetes provides accessible interfaces for both.
In my case, I wanted both regular and one-off backup capability, and the Kubernetes Jobs resource gave me a quick way to do this. I wanted my job pod to write its dump files to a persistent data store, so I first set up a PersistentVolume and an accompanying Claim to that volume:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: mongo-dump-pv-volume
  labels:
    type: local
    app: mongo-dump
spec:
  storageClassName: manual
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: "/mnt/kube-data/mongo-dumps"
    type: DirectoryOrCreate
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: mongo-dump-pv-claim
  labels:
    app: mongo-dump
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 50Gi
If you use a cloud provider like AWS or Azure, a PersistentVolume can be backed by provisioned block storage, so if your provider makes durability guarantees, your dump data inherits them. The example above just uses a hostPath volume, so the data lives on one node's filesystem and persists only as long as that host does.
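For the cloud-provider case, a claim can ask the cluster to provision storage dynamically instead of binding to a hand-made hostPath volume. The following is a minimal sketch, not a tested configuration: the StorageClass name `standard` is an assumption (list the real ones with `kubectl get storageclass`), and the access mode is ReadWriteOnce because block storage usually attaches to one node at a time.

```yaml
# Sketch: a dynamically provisioned claim that would take the place of the
# hostPath PersistentVolume/Claim pair above. No PersistentVolume is declared;
# the StorageClass provisioner creates one when the claim is bound.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: mongo-dump-pv-claim
  labels:
    app: mongo-dump
spec:
  storageClassName: standard   # assumption: replace with a class your cluster offers
  accessModes:
    - ReadWriteOnce            # typical for block storage; one node mounts it at a time
  resources:
    requests:
      storage: 50Gi
```

Keeping the claim name the same means a Job that mounts `mongo-dump-pv-claim` needs no changes.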
Your Job itself will have a pod spec, much like other Kubernetes resources such as Deployments, where you make requests for resources. It'll look like this:
apiVersion: batch/v1
kind: Job
metadata:
  name: mongodb-backup
  labels:
    app: mongo-dump
spec:
  backoffLimit: 5
  activeDeadlineSeconds: 100
  template:
    spec:
      containers:
      - name: mongodump
        image: mongo
        # --out makes the dump target explicit instead of relying on the working directory
        command: ["mongodump","--host","mongo-service:27017","--db","your_db","--out","/dump"]
        volumeMounts:
        - mountPath: /dump
          name: mongo-dumps
      volumes:
      - name: mongo-dumps
        persistentVolumeClaim:
          claimName: mongo-dump-pv-claim
      restartPolicy: OnFailure
You'll see we're attaching the volume as we might normally, and then running the mongodump command, which writes its output to the mount path, /dump. By default, mongodump writes to a dump directory under its working directory.
If you have authentication enabled on your Mongo service, or use a SaaS like MongoDB Atlas, you can use Secrets as you normally would to pass in credentials that are stored securely rather than in the Job spec itself:
apiVersion: batch/v1
kind: Job
metadata:
  name: mongodb-backup
  labels:
    app: mongo-dump
spec:
  backoffLimit: 5
  activeDeadlineSeconds: 100
  template:
    spec:
      containers:
      - name: mongodump
        image: mongo
        env:
        - name: MONGO_CONN_STRING
          valueFrom:
            secretKeyRef:
              name: mongo-auth
              key: connstring
        - name: MONGO_DB
          value: "my_db"
        # Kubernetes expands $(VAR) in command/args; a bare $VAR is passed through literally
        command: ["mongodump","--host","$(MONGO_CONN_STRING)","--db","$(MONGO_DB)","--out","/dump"]
        volumeMounts:
        - mountPath: /dump
          name: mongo-dumps
      volumes:
      - name: mongo-dumps
        persistentVolumeClaim:
          claimName: mongo-dump-pv-claim
      restartPolicy: Never
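The Job above expects a Secret named mongo-auth with a key called connstring. A minimal sketch of that Secret follows. The value shown is only a placeholder, and in practice you would more likely create the Secret out of band (for example with `kubectl create secret generic`) than commit it to a manifest.

```yaml
# Sketch: the Secret referenced by secretKeyRef in the Job spec.
apiVersion: v1
kind: Secret
metadata:
  name: mongo-auth
type: Opaque
stringData:
  # stringData takes plain text; the API server stores it base64-encoded under data.
  # Placeholder value: substitute your own host or connection string.
  connstring: "mongo-service:27017"
```

If the value you store is a full `mongodb://` connection URI, which is what a hosted service would normally give you, mongodump takes it through its `--uri` option rather than `--host`.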
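After you apply this Job spec, you can monitor its progress. The labels in the Job's metadata apply to the Job object, not to its pods, but the Job controller labels each pod it creates with job-name, so you can select on that:
kubectl get pods -l job-name=mongodb-backup
then follow the logs for that pod name:
kubectl logs $POD_NAME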
If you see your job failing, you can use the restartPolicy to define behavior in this area. In the declaration above (Never), a failed container is not restarted in place; the Job controller starts a replacement pod instead, up to backoffLimit attempts. You can also use OnFailure to retry the container inside the same pod, and use the other options available to define retries, backoff timing, etc.
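To make the retry-related fields easier to see side by side, here is a sketch of the same Job with each one annotated. The Job name and the specific values are assumptions chosen for illustration, not recommendations.

```yaml
# Sketch: retry and timeout knobs on a Job (name and values are illustrative).
apiVersion: batch/v1
kind: Job
metadata:
  name: mongodb-backup-retry-demo   # hypothetical name
spec:
  backoffLimit: 3              # stop retrying after 3 failures and mark the Job failed
  activeDeadlineSeconds: 600   # hard cap on total runtime; takes precedence over backoffLimit
  template:
    spec:
      restartPolicy: OnFailure # restart the failed container inside the same pod
      containers:
      - name: mongodump
        image: mongo
        command: ["mongodump","--host","mongo-service:27017","--db","your_db","--out","/dump"]
        volumeMounts:
        - mountPath: /dump
          name: mongo-dumps
      volumes:
      - name: mongo-dumps
        persistentVolumeClaim:
          claimName: mongo-dump-pv-claim
```

Retries are spaced with an exponential back-off delay, so a short activeDeadlineSeconds can end the Job before backoffLimit is ever reached.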
CronJobs are another type of Job supported in Kubernetes presently (since 1.7). As the Kubernetes CronJob documentation shows, the spec is similar, but includes the typical cron syntax for defining when you'd like the job to run, along with related behavior.
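As a rough sketch of what that looks like, the backup Job can be wrapped in a CronJob by nesting its spec under jobTemplate. The resource name, schedule, and history limits below are assumptions. The batch/v1 API version applies to Kubernetes 1.21 and later; older clusters served CronJob from batch/v1beta1.

```yaml
# Sketch: run the mongodump Job every night at 02:00 (schedule is illustrative).
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mongodb-backup-nightly   # hypothetical name
spec:
  schedule: "0 2 * * *"          # standard cron syntax: minute hour day-of-month month day-of-week
  concurrencyPolicy: Forbid      # skip a run if the previous one is still going
  successfulJobsHistoryLimit: 3  # keep the last 3 successful Jobs for inspection
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      backoffLimit: 5
      template:
        spec:
          containers:
          - name: mongodump
            image: mongo
            command: ["mongodump","--host","mongo-service:27017","--db","your_db","--out","/dump"]
            volumeMounts:
            - mountPath: /dump
              name: mongo-dumps
          volumes:
          - name: mongo-dumps
            persistentVolumeClaim:
              claimName: mongo-dump-pv-claim
          restartPolicy: OnFailure
```

Each run here writes to the same output directory, so later dumps overwrite earlier files. If you need to keep history, consider writing to a per-run subdirectory.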