Automatic Firestore Backups
Denis Anisimov
Posted on March 9, 2019
Cloud Firestore has export/import functionality, available through the CLI and REST APIs, that allows you to make simple backups to a Cloud Storage bucket. It basically looks like this:
gcloud beta firestore export gs://BUCKET_NAME --project PROJECT_ID
If you are as clumsy as me and prone to messing up your database, you'd probably want to have it backed up fairly often. But unfortunately, even with Firestore in GA, there is no managed automatic backup option yet. Here is what my solution looked like for a while:
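Roughly speaking, the export command above, run by hand whenever I remembered:

# run by hand every now and then
gcloud beta firestore export gs://BUCKET_NAME --project PROJECT_ID
# ...a few days later...
gcloud beta firestore export gs://BUCKET_NAME --project PROJECT_ID
# ...and again...
gcloud beta firestore export gs://BUCKET_NAME --project PROJECT_ID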
Not very DRY, is it? Here is a better way using only Google Cloud Platform services and a tiny bit of coding.
Overall idea
- Cloud Scheduler publishes a message to a Cloud PubSub topic periodically, based on a cron schedule.
- A Cloud Function is triggered by the message and calls the Firestore REST API.
- The Firestore export API starts a long-running operation that saves a backup to a specified bucket.
- Cloud Storage stores the backups, organized by timestamps.
Prerequisites
This post assumes that you have your Firebase project set up, have the gcloud command-line tool installed, and know your way around the Cloud Console.
Cloud Scheduler
Cloud Scheduler is a very simple GCP service, but nevertheless very useful and much awaited by the serverless crowd. With it, it's finally possible to define simple cron-like jobs that trigger an HTTP function or emit a PubSub message. For use with Cloud Functions the PubSub route is preferred, as HTTP jobs don't have any authentication.
The following command creates a PubSub job scheduled at midnight every day. The message body is not used in our case, so it can be arbitrary.
gcloud scheduler jobs create pubsub firestore-backup \
  --schedule "0 0 * * *" \
  --topic "firestore-backup-daily" \
  --message-body "scheduled"
Once you execute this command, your job will run at the specified time.
Cloud Function
The Cloud Function is triggered by the PubSub message and makes a request to the Firestore REST API. To authenticate the request properly we need an OAuth access token. Luckily for us, Cloud Functions have a default service account that can easily be used to generate a token.
import * as functions from 'firebase-functions';
import * as firebase from 'firebase-admin';
import * as request from 'request-promise';

firebase.initializeApp();

export const backupOnPubSub = functions.pubsub.topic('firestore-backup-daily').onPublish(async () => {
  // Get bucket name from cloud functions config
  // Should be in the format 'gs://BUCKET_NAME'
  //
  // Can be set with the following command:
  // firebase functions:config:set backup.bucket="gs://BUCKET_NAME"
  const bucket = functions.config().backup.bucket;

  // Firebase/GCP project ID is available as an env variable
  const projectId = process.env.GCLOUD_PROJECT;

  console.info(`Exporting firestore database in project ${projectId} to bucket ${bucket}`);

  // Use default service account to request OAuth access token to authenticate with REST API
  // Default service account must have an appropriate role assigned or
  // the request authentication will fail
  const { access_token: accessToken } = await firebase.credential.applicationDefault().getAccessToken();

  const uri = `https://firestore.googleapis.com/v1/projects/${projectId}/databases/(default):exportDocuments`;
  const result = await request({
    method: 'POST',
    uri,
    auth: { bearer: accessToken },
    body: {
      outputUriPrefix: bucket,
    },
    json: true,
  });

  // The returned operation name can be used to track the result of the long-running operation
  // gcloud beta firestore operations describe "OPERATION_NAME"
  const { name } = result;
  console.info(`Export operation started ${name}`);
});
Don't forget to set the config and deploy:
firebase functions:config:set backup.bucket="gs://BUCKET_NAME"
firebase deploy --only functions:backupOnPubSub
Firestore REST API permissions
The import/export REST API requires the caller to have appropriate permissions. You can grant them by adding the Cloud Datastore Import Export Admin role to the default service account in your project.
gcloud projects add-iam-policy-binding PROJECT_ID \
--member serviceAccount:PROJECT_ID@appspot.gserviceaccount.com \
--role roles/datastore.importExportAdmin
Cloud Storage
In case you don't have a bucket yet, you should create one. I've found that Firestore import/export requires at least the Regional storage class; Coldline or Nearline won't work.
gsutil mb gs://BUCKET_NAME/
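If you want to be explicit about the storage class and location (the region below is just an example), gsutil accepts flags for both:

gsutil mb -c regional -l us-central1 gs://BUCKET_NAME/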
Testing everything together
Once everything is set up, you can trigger the whole process from the Cloud Scheduler console by clicking the "Run now" button.
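If you prefer the command line, the same job can be kicked off manually (depending on your gcloud version this may still require the beta component):

gcloud scheduler jobs run firestore-backup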
Check the logs in the Firebase Console or Stackdriver.
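The function's logs can also be tailed from the command line with the Firebase CLI:

firebase functions:log --only backupOnPubSub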
Check the export operation status:
gcloud beta firestore operations describe "projects/PROJECT_ID/databases/(default)/operations/ASA3NDEwOTg0NjExChp0bHVhZmVkBxJsYXJ0bmVjc3Utc2Jvai1uaW1kYRQKLRI"
done: true
metadata:
  '@type': type.googleapis.com/google.firestore.admin.v1.ExportDocumentsMetadata
  endTime: '2019-03-09T21:04:39.263534Z'
  operationState: SUCCESSFUL
  outputUriPrefix: gs://PROJECT_ID.appspot.com/2019-03-09T21:04:32_85602
  progressBytes:
    completedWork: '5360'
    estimatedWork: '4160'
  progressDocuments:
    completedWork: '40'
    estimatedWork: '40'
  startTime: '2019-03-09T21:04:32.862729Z'
And finally, check the backup in the bucket.
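A quick way to do that from the command line is to list the bucket; each export shows up under a timestamped prefix:

gsutil ls gs://BUCKET_NAME/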
Next steps
Automatic backup is only useful if it's working reliably. It is a good idea to have monitoring and alerting for your backup operations.
One way to do this is to save the last export operation to Firestore and schedule a job some time later to check the result of the long-running operation. If the result is not successful, it can send an email or log an error that will trigger an alert through Stackdriver Error Reporting.
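A minimal sketch of that idea, assuming the export function above is extended to record each started operation in a backups collection (the collection name and the second topic, firestore-backup-check, are made up for this example and the topic would need its own Scheduler job):

// Hypothetical companion function, using the same imports and initializeApp() as above.
// Assumes backupOnPubSub also does something like:
//   await firebase.firestore().collection('backups').add({ name, startedAt: new Date() });
export const checkBackupOnPubSub = functions.pubsub.topic('firestore-backup-check').onPublish(async () => {
  // Find the most recently started export operation
  const snapshot = await firebase.firestore()
    .collection('backups')
    .orderBy('startedAt', 'desc')
    .limit(1)
    .get();
  if (snapshot.empty) {
    console.error(new Error('No backup operations have been recorded'));
    return;
  }
  const { name } = snapshot.docs[0].data();

  const { access_token: accessToken } = await firebase.credential.applicationDefault().getAccessToken();

  // Long-running operations can be read back with a GET request on their name
  const operation = await request({
    method: 'GET',
    uri: `https://firestore.googleapis.com/v1/${name}`,
    auth: { bearer: accessToken },
    json: true,
  });

  if (!operation.done || operation.error) {
    // Logging an Error object surfaces it in Stackdriver Error Reporting,
    // where an alert can be configured
    console.error(new Error(`Export operation ${name} did not complete successfully`));
  }
});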
Another improvement is enabling automatic deletion of old backups after a certain time. This can be achieved with Cloud Storage lifecycle management.
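As a sketch, a lifecycle rule like the following (the 30-day age is just an example) deletes objects once they are older than 30 days. Note that it applies to everything in the bucket, so a dedicated backup bucket is a good idea:

{
  "rule": [
    { "action": { "type": "Delete" }, "condition": { "age": 30 } }
  ]
}

Save that as lifecycle.json (any file name works) and apply it to the bucket:

gsutil lifecycle set lifecycle.json gs://BUCKET_NAME/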
Closing remarks
Would love to hear your thoughts on this approach and how it could be improved.
Happy coding! Make backups :)