Backing up and restoring the LGTM stack protects the configuration and data paths that make Grafana, Loki, Tempo, and Mimir usable after a cluster failure or migration. A recovery plan should cover Helm values, secret references, Grafana dashboards and data sources, and backend object storage.
Object storage usually holds the durable log, trace, and metric blocks, while Kubernetes holds release state, secret references, and service configuration. Grafana may also use a database or persistent volume that needs a separate backup path.
Do not call a backup complete until a clean namespace or staging cluster can restore the stack and query retained telemetry. A restore drill should prove that Grafana comes back with the expected data sources and that each backend can return at least one known signal.
Related: How to deploy a production LGTM stack
Related: How to upgrade and roll back the LGTM stack
Related: How to configure object storage for Mimir
Related: How to configure object storage for Loki
Related: How to configure object storage for Tempo
$ mkdir -p ~/lgtm-backup/values ~/lgtm-backup/grafana
$ chmod 700 ~/lgtm-backup
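$ helm get values grafana --namespace monitoring --all --output yaml \
    > ~/lgtm-backup/values/grafana.yaml
$ helm get values loki --namespace monitoring --all --output yaml \
    > ~/lgtm-backup/values/loki.yaml
$ helm get values tempo --namespace monitoring --all --output yaml \
    > ~/lgtm-backup/values/tempo.yaml
$ helm get values mimir --namespace monitoring --all --output yaml \
    > ~/lgtm-backup/values/mimir.yaml
The --output yaml flag keeps Helm's "COMPUTED VALUES:" header line out of the saved files, so they can be passed back to --values unchanged.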
Review exported values before storing them. Some charts can expose secret references or inline credentials depending on how the release was installed.
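One way to make that review repeatable is a quick pattern scan over the exported files. The sketch below is only a first pass: the key names in the pattern (password, secret, token, and so on) are illustrative assumptions, not a list taken from any chart, and scan_values_for_secrets is a hypothetical helper name. A clean result does not prove the files are safe to store.

```shell
# Hypothetical helper: print lines in exported values files whose YAML key
# name looks credential-related. The word list is an illustrative guess.
scan_values_for_secrets() {
  dir="$1"
  # -r recurse, -n show line numbers, -i ignore case, -E extended regex.
  # Matches keys such as "adminPassword:" or "secret_access_key:".
  grep -rniE '^[[:space:]]*[A-Za-z0-9_.-]*(password|passwd|secret|token|access_?key|api_?key)[A-Za-z0-9_.-]*:' "$dir"
}
```

The function exits non-zero when nothing matches, so it can gate a later upload step:
$ scan_values_for_secrets ~/lgtm-backup/values
Keys with empty values also match, so treat each hit as a prompt to look rather than a confirmed leak.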
$ helm list --namespace monitoring > ~/lgtm-backup/helm-list.txt
$ helm history grafana --namespace monitoring > ~/lgtm-backup/grafana-history.txt
$ curl --silent --user admin:<password> \
    https://grafana.example.com/api/datasources \
    > ~/lgtm-backup/grafana/datasources.json
$ curl --silent --user admin:<password> \
    https://grafana.example.com/api/search \
    > ~/lgtm-backup/grafana/dashboard-index.json
Use the organization's existing dashboard backup tool if one is already in place. The API index alone is not a full dashboard backup.
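Where no such tool exists, the gap between the index and a full backup can be closed by fetching each dashboard's JSON model by UID. The following is a minimal sketch, not a complete backup tool. GRAFANA_URL and GRAFANA_AUTH are hypothetical variables, the function names are invented for illustration, and the grep/sed parse assumes the compact JSON that Grafana's search API normally returns. Search results are paged, so a large installation may need the limit and page query parameters as well.

```shell
# Sketch only. Assumed variables (not from the original steps):
#   GRAFANA_URL   e.g. https://grafana.example.com
#   GRAFANA_AUTH  e.g. admin:<password>

# Print one dashboard UID per line from a compact /api/search response on stdin.
# A JSON-aware tool would be more robust than this text match.
dashboard_uids() {
  grep -o '"uid":"[^"]*"' | sed 's/^"uid":"//; s/"$//' | sort -u
}

# Save the full JSON model of every dashboard into the given directory.
export_dashboards() {
  out_dir="$1"
  mkdir -p "$out_dir"
  # type=dash-db restricts the search to dashboards and leaves out folders.
  curl --silent --fail --user "$GRAFANA_AUTH" \
    "$GRAFANA_URL/api/search?type=dash-db" |
    dashboard_uids |
    while read -r uid; do
      curl --silent --fail --user "$GRAFANA_AUTH" \
        "$GRAFANA_URL/api/dashboards/uid/$uid" > "$out_dir/$uid.json" ||
        echo "failed to export dashboard $uid" >&2
    done
}
```

With the two variables set, a run might look like this:
$ export_dashboards ~/lgtm-backup/grafana/dashboards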
$ aws s3 ls s3://lgtm-loki-prod/ --recursive
2026-06-21 08:10:00      12000 chunks/tenant-a/...
$ aws s3 ls s3://lgtm-tempo-prod/ --recursive
2026-06-21 08:10:01      18000 traces/...
$ aws s3 ls s3://lgtm-mimir-prod/blocks/ --recursive
2026-06-21 08:10:02      22000 blocks/...
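A listing is easier to compare before and after a restore when it is reduced to totals. The sketch below assumes each object line has the form "date time size key", as in the sample output above. summarize_listing is a hypothetical helper name, and the inventory file path in the usage line is likewise an assumption. The AWS CLI's own --summarize flag on aws s3 ls reports similar totals and may be enough on its own.

```shell
# Sketch: total the object count and byte size of `aws s3 ls --recursive`
# output read on stdin. Lines with fewer than four fields are ignored.
summarize_listing() {
  awk 'NF >= 4 { count++; bytes += $3 }
       END { printf "objects=%d bytes=%.0f\n", count, bytes }'
}
```

Storing the totals next to the other backup artifacts gives the restore drill a number to check against:
$ aws s3 ls s3://lgtm-loki-prod/ --recursive | summarize_listing > ~/lgtm-backup/loki-inventory.txt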
$ kubectl create namespace monitoring-restore
namespace/monitoring-restore created
$ helm upgrade --install loki grafana/loki \
    --namespace monitoring-restore \
    --values ~/lgtm-backup/values/loki.yaml \
    --wait
$ helm upgrade --install tempo grafana/tempo-distributed \
    --namespace monitoring-restore \
    --values ~/lgtm-backup/values/tempo.yaml \
    --wait
$ helm upgrade --install mimir grafana/mimir-distributed \
    --namespace monitoring-restore \
    --values ~/lgtm-backup/values/mimir.yaml \
    --wait
$ helm upgrade --install grafana grafana/grafana \
    --namespace monitoring-restore \
    --values ~/lgtm-backup/values/grafana.yaml \
    --wait
$ helm list --namespace monitoring-restore
NAME     NAMESPACE           STATUS
grafana  monitoring-restore  deployed
loki     monitoring-restore  deployed
tempo    monitoring-restore  deployed
mimir    monitoring-restore  deployed
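For a scripted drill, the same check can be made without reading the table by eye. This sketch assumes the four release names used in this guide and reads deployed release names from stdin, one per line. missing_releases is a hypothetical helper name.

```shell
# Sketch: report which of the expected releases are absent from a list of
# deployed release names on stdin. Returns non-zero if any are missing.
missing_releases() {
  names=$(cat)
  rc=0
  for release in grafana loki tempo mimir; do
    # -x matches whole lines only, so "loki" does not match "loki-canary".
    if ! printf '%s\n' "$names" | grep -qx "$release"; then
      echo "not deployed: $release" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

helm list can supply the input directly, since --deployed filters by status and --short prints names only:
$ helm list --namespace monitoring-restore --deployed --short | missing_releases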
$ curl --silent https://grafana-restore.example.com/api/health
{"database":"ok","version":"13.0.1"}
$ curl --silent --get https://metrics-restore.example.com/prometheus/api/v1/query \
--data-urlencode 'query=up'
{"status":"success","data":{"resultType":"vector","result":[]}}
An empty vector proves the API responded, not that historical data exists. Use a known retained series, log stream, and trace ID for the final restore drill.
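The difference between "the API answered" and "data came back" can be enforced in a script. The sketch below covers only the metrics case and assumes the compact Prometheus-style JSON shown above. query_has_data is a hypothetical helper name, and the job label and timestamp in the usage line are placeholders for a series known to exist in the retained blocks. Loki and Tempo need equivalent checks against their own query endpoints.

```shell
# Sketch: succeed only when a Prometheus-style query response on stdin
# reports success and its result array is not empty. This is a plain text
# match on compact JSON; a JSON parser would be more robust.
query_has_data() {
  body=$(cat)
  case "$body" in
    *'"status":"success"'*) ;;
    *) return 1 ;;
  esac
  case "$body" in
    *'"result":[]'*) return 1 ;;
  esac
  return 0
}
```

Querying at a fixed past time with the time parameter targets historical blocks rather than freshly scraped samples:
$ curl --silent --get https://metrics-restore.example.com/prometheus/api/v1/query \
    --data-urlencode 'query=up{job="known-job"}' \
    --data-urlencode 'time=2026-06-20T00:00:00Z' | query_has_data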