If you cannot restore, you do not have backups

TL;DR: Backups you never test are stories. Restores are the truth. Adopt the 3-2-1 rule, define your RPO and RTO, and run a quarterly restore drill.


 

Why this matters

People say, “We have backups.” Then an update fails or a server dies, and no one knows how to restore. Time burns. Trust dies. Costs multiply. The only proof is a successful restore that someone else can repeat.

The real problems

  • Backups stored on the same server that failed

  • No off-site copy or encryption

  • No alert when a backup job fails

  • No runbook, so only one person knows the steps

  • No idea how long a restore actually takes

  • Credentials scattered across personal accounts

The fix

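3-2-1: Three copies, two media types, one off-site. Keep the off-site copy with a separate provider.
RPO (recovery point objective): The maximum amount of data you can afford to lose, measured as the time since the last good backup. Set it in hours.
RTO (recovery time objective): The maximum time you can afford to be offline. Set it in minutes.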
Automate: Nightly backups, weekly off-site, integrity checks.
Drill: Quarterly restore to staging. Time it, log issues, fix gaps.
Runbook: Who does what, where credentials live, exact steps with screenshots. Keep the runbook in a shared drive and the credentials it refers to in a password vault.
Access: Store keys in a shared vault. Once a month, test that the people who need them can actually retrieve them.
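
The "integrity checks" and RPO items above can be automated with very little code. What follows is a minimal sketch, not a finished tool. It assumes the backup is a single file, that its SHA-256 checksum was recorded when it was written, and that the file's modification time is a fair stand-in for when the backup was taken. The names sha256_of and check_backup are made up for this example.

```python
import hashlib
import time
from pathlib import Path
from typing import List, Optional


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_backup(path: Path, expected_sha256: str, rpo_hours: float,
                 now: Optional[float] = None) -> List[str]:
    """Return a list of problems; an empty list means the backup passed."""
    if not path.is_file():
        return [f"missing backup: {path}"]
    problems = []
    current = time.time() if now is None else now
    # Assumption: the file's mtime approximates the time the backup was taken.
    age_hours = (current - path.stat().st_mtime) / 3600
    if age_hours > rpo_hours:
        problems.append(f"backup is {age_hours:.1f} h old, RPO is {rpo_hours} h")
    if sha256_of(path) != expected_sha256:
        problems.append("checksum mismatch: the file differs from what was written")
    return problems
```

Run something like this after every backup job and send any non-empty result to whoever is on call. A passing check only shows that the file is recent and unchanged. It does not show that the file restores, which is what the drill is for.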

One-hour drill outline

  • Pick one site and one database.

  • Restore to a fresh staging instance with a distinct URL.

  • Verify login, pages, and a simple transaction or form.

  • Time the steps and note blockers.

  • Update the runbook and assign fixes.
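
To make "time the steps and note blockers" painless, a few lines of Python can do the stopwatch work. This is a rough sketch under simple assumptions: one person runs the drill from a script or an interactive session, and the class name DrillLog and its methods are invented for this example.

```python
import time
from contextlib import contextmanager


class DrillLog:
    """Collects the duration and blockers of each restore step."""

    def __init__(self, rto_minutes: float):
        self.rto_minutes = rto_minutes
        self.steps = []  # list of (name, seconds, blockers)

    @contextmanager
    def step(self, name: str):
        blockers = []
        start = time.monotonic()
        try:
            yield blockers  # the caller appends notes about anything that slowed them down
        finally:
            self.steps.append((name, time.monotonic() - start, blockers))

    def total_minutes(self) -> float:
        return sum(seconds for _, seconds, _ in self.steps) / 60

    def within_rto(self) -> bool:
        return self.total_minutes() <= self.rto_minutes

    def report(self) -> str:
        lines = []
        for name, seconds, blockers in self.steps:
            line = f"{name}: {seconds / 60:.1f} min"
            if blockers:
                line += " (blockers: " + "; ".join(blockers) + ")"
            lines.append(line)
        verdict = "within" if self.within_rto() else "over"
        lines.append(f"total: {self.total_minutes():.1f} min, {verdict} RTO of {self.rto_minutes} min")
        return "\n".join(lines)


if __name__ == "__main__":
    log = DrillLog(rto_minutes=60)
    with log.step("restore database to staging") as blockers:
        blockers.append("vault password was out of date")  # example note
    print(log.report())
```

Paste the report into the runbook after each drill. The blockers become the list of fixes to assign.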

Objections and answers

  • “We cannot spare the time.” You will spend ten times more during an outage.

  • “Hosting does backups.” Good. Assume they fail and keep your own.

  • “We have never had a problem.” That is not a plan.

60-second test

Restore one file or a small database to staging right now. How long did it take and who signed off?

Metrics to watch

RPO achieved, RTO achieved, date of the last successful drill, number of restore steps that are still manual, number of people who can run the process.
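
If you keep a record of each drill, these numbers fall out of a short script. The sketch below is only one possible shape for that record. The field names, the DrillRecord and RestoreStep classes, and the summarize function are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RestoreStep:
    name: str
    manual: bool  # True if a person has to do this step by hand


@dataclass
class DrillRecord:
    drill_date: date
    succeeded: bool
    data_loss_hours: float  # age of the restored backup at the simulated failure
    restore_minutes: float  # time from start of the drill to a verified site
    steps: list             # list of RestoreStep
    operators: set          # names of the people who ran this drill


def summarize(records, rpo_hours, rto_minutes, today):
    """Compute the metrics from the most recent successful drill."""
    successes = [r for r in records if r.succeeded]
    if not successes:
        return {"last_successful_drill": None}
    latest = max(successes, key=lambda r: r.drill_date)
    people = set()
    for record in successes:
        people |= set(record.operators)
    return {
        "last_successful_drill": latest.drill_date,
        "days_since_drill": (today - latest.drill_date).days,
        "rpo_achieved": latest.data_loss_hours <= rpo_hours,
        "rto_achieved": latest.restore_minutes <= rto_minutes,
        "manual_steps": sum(1 for s in latest.steps if s.manual),
        "people_able_to_restore": len(people),
    }
```

Counting only people who have completed a successful drill is a deliberate choice here: someone who has only read the runbook does not count yet.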

One-page checklist

  • 3-2-1 in place

  • RPO and RTO defined

  • Nightly automated jobs

  • Off-site encrypted copy

  • Alerts on backup failure

  • Quarterly restore drill

  • Runbook stored and known

  • Credential vault verified

  • Two people able to run a restore
