Production Backup Strategy Overview

Overview

This production K3s cluster uses a four-layer backup strategy, with each layer protecting data at a different level:

K3s etcd Snapshots - Control plane database backups
Longhorn Volume Backups - Persistent volume backups
Velero Cluster Backups - Application-aware cluster backups
CloudNativePG Backups - PostgreSQL database-consistent backups with point-in-time recovery

All backups are stored in Cloudflare R2, providing off-site redundancy and disaster recovery capabilities.

Layer	Schedule	Retention	Destination
K3s etcd	Daily at 1:00 AM	5 days	Cloudflare R2
Longhorn Volumes	Daily at 2:00 AM	7 days	Cloudflare R2
Velero Cluster	Daily at 3:00 AM	14 days	Cloudflare R2
CloudNativePG	Every 6 hours (4x/day)	30 days	Cloudflare R2
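
The schedule and retention columns above can be expressed in each tool's own terms roughly as follows. This is a minimal sketch, not this cluster's actual configuration: the mapping of each row to a setting is an assumption, and `daily-cluster-backup` is a hypothetical schedule name. Note that K3s and Longhorn retention settings count retained copies rather than days, so with one backup per day the two are equivalent.

```shell
#!/usr/bin/env bash
# Sketch: the schedule table expressed as cron strings and retention values.
# Which setting each value belongs to is an assumption about how the tools
# are configured, not something read from this cluster's manifests.

ETCD_CRON="0 1 * * *"       # K3s: --etcd-snapshot-schedule-cron (retention: 5 snapshots)
LONGHORN_CRON="0 2 * * *"   # Longhorn RecurringJob: spec.cron (retain: 7)
VELERO_CRON="0 3 * * *"     # Velero: velero schedule create --schedule
CNPG_CRON="0 0 */6 * * *"   # CloudNativePG ScheduledBackup: six fields, seconds first

# Convert a retention period in days to the hour-based duration that
# Velero's --ttl flag expects (for example 14 -> 336h0m0s).
days_to_ttl() {
  printf '%dh0m0s\n' "$(( $1 * 24 ))"
}

# Print (do not run) a Velero schedule command for the 3:00 AM / 14-day row.
# "daily-cluster-backup" is a hypothetical schedule name.
velero_schedule_cmd() {
  printf 'velero schedule create daily-cluster-backup --schedule="%s" --ttl %s\n' \
    "$VELERO_CRON" "$(days_to_ttl 14)"
}

velero_schedule_cmd
```

The command is printed rather than executed so it can be reviewed before being applied to the cluster.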

Before configuring backups, you need a Cloudflare R2 bucket and API credentials:

Create an R2 Bucket:
- In your Cloudflare dashboard, go to R2 and click Create bucket
- Give it a unique name (e.g., k3s-backup-repository)
- Note your S3 Endpoint URL from the bucket's main page: https://<ACCOUNT_ID>.r2.cloudflarestorage.com

Create R2 API Credentials:
- On the main R2 page, click Manage R2 API Tokens
- Click Create API Token
- Give it a name (e.g., k3s-backup-token) and grant it Object Read & Write permissions
- Securely copy the Access Key ID and Secret Access Key

You'll need these credentials for all four backup layers.
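
Before wiring the credentials into any backup layer, it can help to confirm that the token can actually reach the bucket. The following is a minimal sketch that assumes the AWS CLI is installed and that the R2 key pair is exported in the standard `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` variables. The function names are illustrative.

```shell
#!/usr/bin/env bash
# Build the R2 S3 endpoint URL from a Cloudflare account ID.
r2_endpoint() {
  printf 'https://%s.r2.cloudflarestorage.com\n' "$1"
}

# List the bucket through the S3-compatible API to confirm the token works.
# Assumes the AWS CLI is installed and that AWS_ACCESS_KEY_ID and
# AWS_SECRET_ACCESS_KEY hold the R2 token's key pair.
check_r2_access() {
  local account_id="$1" bucket="$2"
  aws s3 ls "s3://${bucket}" \
    --endpoint-url "$(r2_endpoint "$account_id")" \
    --region auto
}

# Example (placeholders, not real values):
#   export AWS_ACCESS_KEY_ID=<ACCESS_KEY_ID>
#   export AWS_SECRET_ACCESS_KEY=<SECRET_ACCESS_KEY>
#   check_r2_access <ACCOUNT_ID> k3s-backup-repository
```

An empty listing with a zero exit status means the credentials and endpoint are valid and the bucket is simply empty.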

Each backup layer serves a specific purpose:

etcd Snapshots: Protect the Kubernetes control plane state (API objects, cluster configuration)
Longhorn Backups: Protect persistent volume data independently of cluster state
Velero Backups: Provide application-aware backups that capture both resources and volumes together
CloudNativePG Backups: Provide PostgreSQL-consistent backups with point-in-time recovery capabilities

This multi-layer approach ensures you can recover from different types of failures. To see the current backups and schedules for each layer, use the following commands:

etcd Snapshots:

sudo k3s etcd-snapshot list

Longhorn Backups:

kubectl get recurringjobs -n longhorn-system
kubectl get jobs -n longhorn-system

Velero Backups:

kubectl get schedules -n velero
velero backup get

CloudNativePG Backups:

kubectl get backups -n <postgres-namespace>
kubectl get scheduledbackups -n <postgres-namespace>

Regularly verify that backups are completing successfully:

Review Velero backup logs:

kubectl logs -n velero deployment/velero

Check Longhorn backup jobs:

kubectl get jobs -n longhorn-system -l app=longhorn-manager
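
The manual checks above can be turned into a small script that flags any Velero backup whose phase is not `Completed`. This is a sketch under assumptions: it expects Velero in the `velero` namespace, as in the commands above, and the function names are hypothetical. A backup that is still running (phase `InProgress`) is also reported, so it is best run well after the 3:00 AM schedule.

```shell
#!/usr/bin/env bash
# Read "name<TAB>phase" lines on stdin and print those whose phase is not
# "Completed". Exits non-zero if at least one such line was found.
unhealthy_backups() {
  awk -F '\t' '$2 != "Completed" { print; bad = 1 } END { exit bad + 0 }'
}

# Emit one "name<TAB>phase" line per Velero Backup object.
# Assumes Velero is installed in the "velero" namespace.
velero_backup_phases() {
  kubectl get backups.velero.io -n velero \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}'
}

# Usage:
#   velero_backup_phases | unhealthy_backups || echo "some backups need attention"
```

Keeping the filter separate from the `kubectl` call means the same function can be reused for any source that produces name and phase pairs.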