Disaster Recovery: Defining RTO & RPO for Executives
There is a dangerous conversation that happens in every company.
The CEO asks the CTO: "Are our backups working?"
The CTO says: "Yes."
They both leave the room happy. But they are talking about two completely different things.
- The CTO means: "We have a cron job that dumps the database to S3 every night at midnight."
- The CEO means: "If the server explodes at 4:00 PM, I assume we lose zero data and are back online instantly."
When the inevitable crash happens, this misalignment turns into a career-ending event.
As a technical leader, your job is not just to "do backups." It is to negotiate the price of downtime. You must define two acronyms that translate technical risk into dollars: RTO and RPO.
1. The Definitions: Time vs. Data
You must strip away the jargon. Explain it to the Board like this:
RPO (Recovery Point Objective) = "How much data can we afford to lose?"
If we crash at 4:00 PM, and our last backup was at 12:00 PM, we have lost 4 hours of data.
- Question to CEO: "Are you willing to re-enter 4 hours of orders manually? Or do you need the data to be live up to the last second?"
RTO (Recovery Time Objective) = "How long can we be dead?"
From the moment the server crashes, how many minutes/hours can pass before the "Buy" button works again?
- Question to CEO: "Does a 4-hour outage kill the company, or is it just an annoyance?"
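Both numbers are just subtraction on a timeline, which is worth showing once so nobody argues about the definition later. The sketch below is a minimal illustration; the function names and the dates are made up for the example, and only the 4:00 PM crash, the 12:00 PM backup, and the midnight cron job come from the scenarios above.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, crash: datetime) -> timedelta:
    """Data lost in the crash: everything written since the last good backup (compare to RPO)."""
    return crash - last_backup


def downtime(crash: datetime, service_restored: datetime) -> timedelta:
    """Time the service was unavailable (compare to RTO)."""
    return service_restored - crash


# Hypothetical date; the times of day match the examples in the text.
crash = datetime(2024, 1, 15, 16, 0)           # server dies at 4:00 PM
noon_backup = datetime(2024, 1, 15, 12, 0)     # last backup at 12:00 PM
midnight_backup = datetime(2024, 1, 15, 0, 0)  # the CTO's nightly cron job

print(data_loss_window(noon_backup, crash))      # 4:00:00
print(data_loss_window(midnight_backup, crash))  # 16:00:00
```

Note that the nightly cron job from the opening conversation loses 16 hours of data in a 4:00 PM crash, not 4.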

2. The Cost Curve: The Price of Zero
The CEO's natural instinct is to say: "I want zero data loss and zero downtime."
Your answer is: "We can do that. It costs $50,000 a month."
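This is the Asymptotic Cost of Availability: every step closer to zero data loss and zero downtime costs far more than the step before it. The dollar figures below are illustrative orders of magnitude, not quotes.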
- 99% Availability (RPO 24h / RTO 24h): Cheap. A nightly script. Cost: $100/mo.
- 99.9% Availability (RPO 1h / RTO 4h): Moderate. Database replication. Cost: $1,000/mo.
- 99.999% Availability (RPO ~0s / RTO ~0s): Expensive. Multi-region Active-Active clusters with real-time sync. Cost: $50,000/mo.
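It helps to translate the percentages into a yearly downtime budget, because "99.9%" means nothing to a Board but "under nine hours a year" does. This is a minimal sketch of that arithmetic; the function name is my own, and it assumes a 365-day year.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def allowed_downtime_hours(availability_pct: float) -> float:
    """Yearly downtime budget implied by an availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


for pct in (99.0, 99.9, 99.999):
    hours = allowed_downtime_hours(pct)
    print(f"{pct}% availability -> {hours:.2f} hours/year ({hours * 60:.1f} minutes)")
```

That works out to roughly 87.6 hours a year at 99%, 8.76 hours at 99.9%, and a little over 5 minutes at 99.999%. A single 24-hour outage fits inside the 99% budget; a single 4-hour outage uses almost half of the 99.9% budget.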
You must present Disaster Recovery as a Menu, not a binary switch.
3. The Strategy: The Tiered Menu
Don't treat all data equally. A "One Size Fits All" DR strategy is either too risky (for payments) or too expensive (for logs).
Present this table to your Executive Team:
Table 1: The Disaster Recovery Service Levels
| Tier | Workload | RPO (Data Loss) | RTO (Downtime) | Architecture | Cost |
|---|---|---|---|---|---|
| Platinum | Payments / Orders | ~0 Seconds | < 5 Mins | Multi-AZ, Auto-failover RDS, Hot Standby. | $$$$ |
| Gold | User Profiles / Inventory | 15 Mins | 1 Hour | Read Replicas, frequent snapshots. | $$ |
| Silver | Analytics / Reporting | 24 Hours | 48 Hours | Nightly S3 Dumps. Restore on demand. | $ |
| Bronze | Dev / Staging | Best Effort | 1 Week | Infrastructure as Code (Rebuild from scratch). | $0 |
When you frame it this way, the CEO will quickly decide that the "Marketing Blog" does not need Platinum-tier protection. You just saved the company money while clarifying the risk.
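Once the table is agreed, it is worth keeping it somewhere machine-readable so that incidents can be checked against it. The sketch below is one possible encoding, not a standard: the `Tier` class and `within_policy` function are hypothetical names, "~0 Seconds" is modeled as exactly zero, "< 5 Mins" is treated as a 5-minute ceiling, and "Best Effort" is modeled as no RPO promise at all.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    rpo: Optional[timedelta]  # None means "best effort": no data-loss promise
    rto: timedelta


# Values copied from Table 1; the encoding choices are assumptions.
TIERS = {
    "Platinum": Tier("Platinum", timedelta(seconds=0), timedelta(minutes=5)),
    "Gold": Tier("Gold", timedelta(minutes=15), timedelta(hours=1)),
    "Silver": Tier("Silver", timedelta(hours=24), timedelta(hours=48)),
    "Bronze": Tier("Bronze", None, timedelta(weeks=1)),
}


def within_policy(tier: Tier, data_lost: timedelta, downtime: timedelta) -> bool:
    """True if an incident stayed inside the tier's agreed RPO and RTO."""
    rpo_ok = tier.rpo is None or data_lost <= tier.rpo
    rto_ok = downtime <= tier.rto
    return rpo_ok and rto_ok


# The same incident (20 hours of data lost, 24 hours down) under two tiers.
silver_outage = within_policy(TIERS["Silver"], timedelta(hours=20), timedelta(hours=24))
gold_outage = within_policy(TIERS["Gold"], timedelta(hours=20), timedelta(hours=24))
print(silver_outage, gold_outage)  # True False
```

The same outage is a policy breach for a Gold workload and business as usual for a Silver one, which is exactly the conversation you want the table to settle in advance.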
4. The Trap: The "Restore" Test
Having a backup is meaningless. Restoring is the only thing that counts.
Schrödinger’s Backup states: "The condition of any backup is unknown until a restore is attempted."
I have seen companies with "perfect" backups fail during a disaster because:
- The encryption key for the backup was on the server that crashed.
- The backup file was corrupted 6 months ago, and no one checked.
- The restore process took 18 hours to download the file (violating the 4-hour RTO).
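The last failure is the easiest one to catch on paper before a disaster. A rough transfer-time estimate only needs the backup size and the sustained download speed. The numbers below are hypothetical, chosen to show how an 18-hour download can happen; the function name is mine, and the estimate ignores decompression and database import time, which only make things worse.

```python
def estimate_download_hours(size_gb: float, throughput_mbit_s: float) -> float:
    """Lower bound on restore time: hours needed just to transfer the backup file."""
    size_mbit = size_gb * 8 * 1000  # decimal units: 1 GB = 8000 Mbit
    seconds = size_mbit / throughput_mbit_s
    return seconds / 3600


RTO_HOURS = 4  # the RTO from the example above

# Hypothetical: a 2 TB backup pulled at a sustained 250 Mbit/s.
hours = estimate_download_hours(size_gb=2000, throughput_mbit_s=250)
print(f"Download alone: {hours:.1f} h, RTO: {RTO_HOURS} h, feasible: {hours <= RTO_HOURS}")
```

With those assumed numbers the download alone takes just under 18 hours, so the 4-hour RTO is broken before the restore even starts.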
The Executive Protocol:
- Mandate a Quarterly Fire Drill.
- Actually restore the production database to a staging environment.
- Time it. If it takes 6 hours, and your RTO is 4 hours, you are failing compliance. Report this to the Board immediately as a risk to be mitigated.
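The timing step of the drill can be automated so the result is a number rather than an impression. This is a minimal sketch under stated assumptions: `run_restore_drill` is a hypothetical helper, and `fake_restore` is a stand-in for whatever script actually restores your production backup into staging.

```python
import time
from datetime import timedelta
from typing import Callable


def run_restore_drill(restore: Callable[[], None], rto: timedelta) -> dict:
    """Time a restore procedure and compare the elapsed time to the RTO."""
    start = time.monotonic()  # monotonic clock: unaffected by system clock changes
    restore()
    elapsed = timedelta(seconds=time.monotonic() - start)
    return {"elapsed": elapsed, "rto": rto, "passed": elapsed <= rto}


def fake_restore() -> None:
    """Placeholder for the real restore-to-staging procedure."""
    time.sleep(0.01)


result = run_restore_drill(fake_restore, rto=timedelta(hours=4))
print(f"Restore took {result['elapsed']}, RTO is {result['rto']}, passed: {result['passed']}")
```

Keep the result of every drill. A trend of restore times creeping toward the RTO as the database grows is a risk worth reporting before it becomes a failure.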
Summary
Disaster Recovery is not a technical problem; it is an Insurance Policy.
Your job is to act as the Insurance Broker.
- Define the terms (RTO/RPO).
- Quote the premiums (Cost of Architecture).
- Let the Business decide the coverage.
If the business chooses "Silver Tier" coverage and the site goes down for 24 hours, you are not incompetent; you are compliant with the agreed policy. That is the difference between being fired and being a strategic partner.