AWS GitHub & S3 Backup
pablosalas81
Posted on March 10, 2024
Introduction:
Data backup and disaster recovery services are critical for protecting a business's most valuable asset: its data.
Losing this data can result in severe consequences, including financial loss, reputational damage, and operational disruptions. Therefore, it’s essential to understand the importance of data backup and disaster recovery planning and implement effective strategies to safeguard your business’s assets.
AWS S3 backup storage:
Always consider both backup folders in the bucket: the application file backups and the database backups.
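Once the jobs below have run, you can inspect the resulting layout with the AWS CLI. This is only a sketch: the files/ prefix and the date/namespace structure come from the script further down, and the bucket name and example date are placeholders.

# List the daily date folders created by the backup job
aws s3 ls "s3://<bucket name>/files/"

# List everything stored for a single day
aws s3 ls "s3://<bucket name>/files/2024-03-10/" --recursive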
Backup procedures:
GitHub:
Dockerfile:
# Minimal Debian image with the AWS CLI and the backup script
FROM debian:12-slim

ENV DEBIAN_FRONTEND=noninteractive

# Install the AWS CLI and clean up the apt cache to keep the image small
RUN apt-get update && \
    apt-get install -y --no-install-recommends awscli && \
    rm -rf /var/lib/apt/lists/*

# Copy the backup script and make it executable
COPY script.sh /usr/local/bin/script.sh
RUN chmod +x /usr/local/bin/script.sh

CMD ["/usr/local/bin/script.sh"]
Script:
#!/bin/bash

# Directory paths
SOURCE_DIR="/app/storage"
DEST_DIR="/app/backup"
TIMESTAMP=$(date "+%Y-%m-%d")

# Ensure the destination directory exists
mkdir -p "$DEST_DIR"

# Count total directories, excluding lost+found
TOTAL_DIRS=$(find "$SOURCE_DIR" -mindepth 1 -maxdepth 1 -type d ! -name 'lost+found' | wc -l)

# Counter for processed directories
PROCESSED_DIRS=0
RETENTION_PERIOD=13

# Archive and upload every directory in the source directory
for dir in "$SOURCE_DIR"/*; do
    if [ -d "$dir" ]; then
        # Get the directory name
        dir_name=$(basename "$dir")

        # Skip the lost+found directory
        if [ "$dir_name" = "lost+found" ]; then
            continue
        fi

        echo "Processing directory: $dir"

        # Copy the directory to the destination
        cp -r "$dir" "$DEST_DIR/$dir_name"

        # Compress the copied directory
        tar -czf "$DEST_DIR/$dir_name.tar.gz" -C "$DEST_DIR" "$dir_name"

        # Upload to AWS S3
        aws s3 cp "$DEST_DIR/${dir_name}.tar.gz" "s3://${BUCKET}/files/${TIMESTAMP}/${NAMESPACE}/${dir_name}/${dir_name}.tar.gz"

        # Clean up the local copy and archive
        rm -rf "$DEST_DIR/$dir_name" "$DEST_DIR/$dir_name.tar.gz"

        # Increment the processed-directories counter
        PROCESSED_DIRS=$((PROCESSED_DIRS + 1))

        # Calculate and display progress
        PROGRESS=$(( (PROCESSED_DIRS * 100) / TOTAL_DIRS ))
        echo "Progress: $PROGRESS% ($PROCESSED_DIRS/$TOTAL_DIRS directories processed)"
    fi
done

# Delete date folders older than the retention period (run once, after all uploads)
RETENTION_DATE=$(date -d "${TIMESTAMP} -${RETENTION_PERIOD} days" "+%Y-%m-%d")
RETENTION_TIMESTAMP=$(date -d "${RETENTION_DATE}" "+%s")

# List all date folders under files/
FOLDER_LIST=$(aws s3 ls "s3://${BUCKET}/files/" | awk '$0 ~ /PRE/ {print $2}' | grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}/' | sed 's/\/$//')

# Loop through each folder and delete it if it is older than the retention period
for folder in $FOLDER_LIST; do
    FOLDER_TIMESTAMP=$(date -d "${folder}" "+%s")
    if [ "$FOLDER_TIMESTAMP" -lt "$RETENTION_TIMESTAMP" ]; then
        aws s3 rm "s3://${BUCKET}/files/${folder}/" --recursive --quiet
    fi
done
The backup retention period is 14 daily backups (today plus the previous 13 days), so you can restore data from any day within that window.
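Restoring is the reverse of the upload: download the archive for the day you need and unpack it. A sketch only; the date, the "backup" namespace, and the "uploads" directory are placeholders that follow the layout produced by the script above.

# Download the archive for a given day, namespace, and directory
aws s3 cp "s3://${BUCKET}/files/2024-03-10/backup/uploads/uploads.tar.gz" /tmp/

# Unpack it back into the application storage path
tar -xzf /tmp/uploads.tar.gz -C /app/storage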
YAML file:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-files
  namespace: backup
spec:
  schedule: "0 3 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup-container
              image: ghcr.io/backup/backup-files:latest
              env:
                - name: NAMESPACE
                  value: "backup"
                - name: BUCKET
                  value: "<bucket name>"
                - name: AWS_ACCESS_KEY_ID
                  valueFrom:
                    secretKeyRef:
                      name: aws-secret
                      key: AWS_ACCESS_KEY_ID
                - name: AWS_SECRET_ACCESS_KEY
                  valueFrom:
                    secretKeyRef:
                      name: aws-secret
                      key: AWS_SECRET_ACCESS_KEY
              volumeMounts:
                - name: my-pvc
                  mountPath: /app/storage
          restartPolicy: OnFailure
          imagePullSecrets:
            - name: registry-credentials-back
          volumes:
            - name: my-pvc
              persistentVolumeClaim:
                claimName: backend-upload-storage-pvc
      backoffLimit: 4
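The CronJob expects two secrets in the backup namespace: aws-secret with the S3 credentials and registry-credentials-back for pulling the image from GHCR. A minimal sketch of creating them with kubectl (all values are placeholders), followed by a one-off run to test the job without waiting for the schedule:

# AWS credentials consumed via secretKeyRef in the CronJob
kubectl create secret generic aws-secret \
  --namespace backup \
  --from-literal=AWS_ACCESS_KEY_ID=<access-key-id> \
  --from-literal=AWS_SECRET_ACCESS_KEY=<secret-access-key>

# Pull secret for the ghcr.io image
kubectl create secret docker-registry registry-credentials-back \
  --namespace backup \
  --docker-server=ghcr.io \
  --docker-username=<github-user> \
  --docker-password=<github-token>

# Trigger a manual run of the CronJob
kubectl create job backup-files-manual --from=cronjob/backup-files --namespace backup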
Postgres YAML file:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-postgres
  namespace: backup
spec:
  schedule: "0 3 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup-container
              image: ghcr.io/backup/backup-files:latest
              env:
                - name: NAMESPACE
                  value: "backup"
                - name: PG_HOST
                  valueFrom:
                    secretKeyRef:
                      name: postgres-pguser-backup
                      key: host
                - name: PG_PORT
                  valueFrom:
                    secretKeyRef:
                      name: postgres-pguser-backup
                      key: port
                - name: PG_USER
                  valueFrom:
                    secretKeyRef:
                      name: postgres-pguser-backup
                      key: user
                - name: PG_PASS
                  valueFrom:
                    secretKeyRef:
                      name: postgres-pguser-backup
                      key: password
                - name: BUCKET
                  value: "<bucket name>"
                - name: AWS_ACCESS_KEY_ID
                  valueFrom:
                    secretKeyRef:
                      name: aws-secret
                      key: AWS_ACCESS_KEY_ID
                - name: AWS_SECRET_ACCESS_KEY
                  valueFrom:
                    secretKeyRef:
                      name: aws-secret
                      key: AWS_SECRET_ACCESS_KEY
          restartPolicy: OnFailure
          imagePullSecrets:
            - name: registry-credentials-back
      backoffLimit: 4
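The post does not show the database backup script itself, so here is a hedged sketch of what the container could run with the PG_HOST, PG_PORT, PG_USER, PG_PASS, BUCKET, and NAMESPACE variables above: dump each database with pg_dump, compress it, and upload it next to the file backups. The databases/ prefix and the assumption that pg_dump and psql are installed in the image are mine, not from the original post.

#!/bin/bash
# Hypothetical Postgres backup script (assumes pg_dump/psql are in the image)
TIMESTAMP=$(date "+%Y-%m-%d")
export PGPASSWORD="$PG_PASS"

# List the non-template databases on the server
DATABASES=$(psql -h "$PG_HOST" -p "$PG_PORT" -U "$PG_USER" -At -c "SELECT datname FROM pg_database WHERE NOT datistemplate;")

for db in $DATABASES; do
    # Dump, compress, and upload each database under an assumed databases/ prefix
    pg_dump -h "$PG_HOST" -p "$PG_PORT" -U "$PG_USER" "$db" | gzip > "/tmp/${db}.sql.gz"
    aws s3 cp "/tmp/${db}.sql.gz" "s3://${BUCKET}/databases/${TIMESTAMP}/${NAMESPACE}/${db}.sql.gz"
    rm -f "/tmp/${db}.sql.gz"
done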
Thank you for your time.