Cloud Exit Strategy: When Repatriation Actually Makes Sense

Repatriation has stopped being a contrarian opinion and started becoming a line item in board decks. The 37signals migration off AWS is now a four-year case study, Dropbox’s Magic Pocket continues to print savings, and a steady drip of mid-market companies is quietly pulling stateful workloads out of hyperscalers. If you are a CTO heading into a 2027 budget cycle, your CFO has already read the headlines. The question is no longer whether repatriation can work. It is whether it works for you, and whether your team can execute it without breaking production.

This is a decision framework, not an argument. Cloud is still the right answer for most workloads at most companies. But the universal default of the 2015 to 2022 era is gone, and pretending otherwise costs real money.

The Cost Model You Are Probably Missing

Most cloud bills look reasonable when you compare on-demand compute to a depreciated server. They stop looking reasonable when you account for the full stack: egress, idle reservation overhead, premium storage tiers, managed service multipliers, support contracts, and the platform engineering team you hired to manage it all. A useful rule of thumb is that the visible compute and storage line items represent fifty to sixty percent of true spend. The rest sits in network, observability, security, and the FinOps overhead required to keep the visible spend from doubling every quarter.
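To make that rule of thumb concrete, here is a minimal sketch that turns a visible compute-and-storage bill into an implied range for true spend. The function name and the sample bill are illustrative assumptions; only the fifty to sixty percent share comes from the paragraph above.

```python
def implied_true_spend(visible_monthly_usd, low_share=0.50, high_share=0.60):
    """Return a (low, high) estimate of true monthly spend.

    Assumes visible compute and storage make up between low_share and
    high_share of the real total, per the rule of thumb in the text.
    """
    if visible_monthly_usd < 0:
        raise ValueError("visible spend cannot be negative")
    if not 0 < low_share <= high_share <= 1:
        raise ValueError("shares must satisfy 0 < low_share <= high_share <= 1")
    # A larger visible share implies a smaller hidden remainder, so the
    # high share produces the low estimate and vice versa.
    return visible_monthly_usd / high_share, visible_monthly_usd / low_share


# Hypothetical bill: 120,000 dollars a month of visible compute and storage.
low, high = implied_true_spend(120_000)
print(f"Implied true spend: {low:,.0f} to {high:,.0f} dollars per month")
```

The point of the range is not precision. It is a prompt to go find the missing forty to fifty percent in the bill before comparing anything against on-prem.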

Egress is the single line item most teams underestimate. AWS charges around nine cents per gigabyte for the first ten terabytes, dropping to roughly five cents at petabyte scale. A media company moving two petabytes of finished video out of S3 every month is paying close to one hundred thousand dollars a month in egress alone, before it touches a single CPU. The same data sitting in Cloudflare R2, which does not charge for egress, or in a Backblaze B2 bucket with Cloudflare’s CDN in front of it, costs close to nothing to serve.
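The two-petabyte figure can be checked with a small tiered-pricing calculation. This is a sketch under stated assumptions: the first and last tier prices come from the paragraph above, the two middle tiers are approximations of typical list pricing, and one terabyte is treated as 1,000 gigabytes. Check the current price list before relying on any of it.

```python
# (tier size in GB, price per GB in USD). The 0.09 and 0.05 rates are from
# the text; the middle tiers and tier boundaries are assumptions.
ASSUMED_EGRESS_TIERS = [
    (10_000, 0.09),        # first 10 TB
    (40_000, 0.085),       # next 40 TB (assumed)
    (100_000, 0.07),       # next 100 TB (assumed)
    (float("inf"), 0.05),  # everything beyond 150 TB
]


def monthly_egress_cost(gb_out, tiers=ASSUMED_EGRESS_TIERS):
    """Price gb_out gigabytes of monthly egress against graduated tiers."""
    if gb_out < 0:
        raise ValueError("egress volume cannot be negative")
    remaining = gb_out
    total = 0.0
    for tier_size_gb, price_per_gb in tiers:
        if remaining <= 0:
            break
        billed = min(remaining, tier_size_gb)
        total += billed * price_per_gb
        remaining -= billed
    return total


two_petabytes_gb = 2_000_000
print(f"{monthly_egress_cost(two_petabytes_gb):,.0f} dollars per month")
```

With these assumed tiers, two petabytes a month comes to roughly 103,800 dollars, which is consistent with the "close to one hundred thousand" figure above. Almost all of it is billed at the lowest tier, so the headline discount rate matters far more than the early tiers.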

Managed service multipliers are the second blind spot. RDS for Postgres typically runs about two and a half times the cost of an equivalent EC2 instance running self-managed Postgres. OpenSearch is closer to three times. Aurora can be four times for write-heavy workloads. These multipliers are often worth it for teams that genuinely cannot run a database. They are wildly expensive for teams that already employ database administrators and have predictable, well-understood workloads.

Workload Categories That Actually Benefit From Repatriation

Not every workload is a repatriation candidate. The ones that consistently come out ahead share three properties: predictable utilization above sixty percent, large data gravity, and limited need for the elastic burst capacity that justified cloud in the first place.

  • Steady-state stateful databases with greater than two terabytes of data and predictable IOPS. The cost gap versus self-managed on commodity NVMe is severe.
  • Bulk object storage serving high-bandwidth content. Egress economics dominate, and CDN-fronted alternatives like R2, B2, and Wasabi are mature.
  • Batch ML training on stable model architectures. Once you know your training cluster size, owning A100 or H100 boxes pays back in twelve to eighteen months versus on-demand GPU pricing.
  • Internal data platforms running Spark, Trino, or Druid where your team already operates the engine and the cluster runs twenty-four seven.
  • CI build farms with predictable peak capacity. GitHub Actions and CodeBuild minutes add up fast at scale.
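The payback claim in the ML training bullet reduces to simple arithmetic, sketched below. Every number in the example is a hypothetical placeholder, not a quoted price, and the function name is ours. Plug in your own hardware quote, on-demand bill, and colocation costs.

```python
def payback_months(hardware_capex, cloud_monthly_cost, owned_monthly_opex):
    """Months until owned hardware has paid for itself.

    Returns None when owning is not cheaper month to month, because in
    that case the purchase never pays back.
    """
    if hardware_capex < 0:
        raise ValueError("capex cannot be negative")
    monthly_saving = cloud_monthly_cost - owned_monthly_opex
    if monthly_saving <= 0:
        return None
    return hardware_capex / monthly_saving


# Hypothetical training cluster: 300k to buy, 25k a month on demand,
# 5k a month to power, house, and maintain once owned.
months = payback_months(300_000, 25_000, 5_000)
print(f"Payback in {months:.1f} months")
```

The calculation only holds if the cluster stays busy. A cluster that sits idle half the time halves the cloud bill you are avoiding and doubles the payback period.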

Hybrid Patterns That Have Stopped Being Theoretical

Stateful On-Prem, Stateless Cloud

This is the dominant pattern for serious mid-market repatriation in 2026. Databases, object stores, and data warehouses move to colocation facilities with Equinix, CoreSite, or Digital Realty. Stateless application tiers, edge functions, and burst capacity remain on hyperscalers. AWS Direct Connect or Azure ExpressRoute provides the backbone, typically at one to ten gigabits with cross-connect fees in the low thousands per month.

Owned Iron For Compute, Cloud For Control Plane

Kubernetes at scale on owned hardware, managed via cloud-hosted control planes such as EKS Anywhere, Google Anthos, or Rancher. Teams keep the operational ergonomics of cloud-managed Kubernetes while paying commodity prices for the actual nodes. This works particularly well with a Talos or Bottlerocket operating system base and a Cilium data plane.

Sovereign Region Plus Public Cloud Burst

Driven by data residency more than cost. EU customer data lives in a sovereign region or on-prem facility within jurisdiction. Compute-only workloads burst to the nearest hyperscaler region for elasticity. The architectural cost is real, but for regulated industries the alternative is being shut out of the market entirely.

Our Recommendation

Run the analysis on a per-workload basis, never on the cloud account as a whole. Build a true cost-per-unit model for each major service: cost per query for your warehouse, cost per gigabyte served for your storage, cost per inference for your ML serving stack. Compare against a fully loaded on-prem alternative that includes hardware amortization over four years, colocation rent, network transit, hands-and-eyes contracts, and the engineering headcount required to operate it.
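A fully loaded on-prem cost per unit might be modeled along these lines. This is a minimal sketch: the class and field names are hypothetical, the example figures are placeholders, and only the cost categories and the four-year amortization come from the paragraph above.

```python
from dataclasses import dataclass


@dataclass
class OnPremCosts:
    hardware_capex: float          # total purchase price of the hardware
    colo_rent_monthly: float       # colocation rent
    transit_monthly: float         # network transit
    remote_hands_monthly: float    # hands-and-eyes contract
    engineering_annual: float      # headcount cost attributed to this workload
    amortization_years: int = 4    # per the text

    def monthly_total(self):
        """Fully loaded monthly cost with straight-line amortization."""
        amortized_hardware = self.hardware_capex / (self.amortization_years * 12)
        return (
            amortized_hardware
            + self.colo_rent_monthly
            + self.transit_monthly
            + self.remote_hands_monthly
            + self.engineering_annual / 12
        )

    def cost_per_unit(self, units_per_month):
        """Cost per query, per gigabyte served, per inference, and so on."""
        if units_per_month <= 0:
            raise ValueError("units_per_month must be positive")
        return self.monthly_total() / units_per_month


# Hypothetical warehouse serving 31 million queries a month.
warehouse = OnPremCosts(
    hardware_capex=960_000,
    colo_rent_monthly=8_000,
    transit_monthly=3_000,
    remote_hands_monthly=1_000,
    engineering_annual=360_000,
)
print(f"{warehouse.monthly_total():,.0f} dollars per month")
print(f"{warehouse.cost_per_unit(31_000_000):.4f} dollars per query")
```

In this made-up example the engineering headcount is the largest single line, larger than the amortized hardware. That is common, and it is the line most often left out of optimistic comparisons.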

If a workload shows a three-times or greater cost advantage on owned infrastructure and represents more than five percent of your total cloud spend, it is a candidate. Anything below that threshold is not worth the operational complexity. Start with one workload, prove the operational model, then expand. Companies that try to repatriate everything at once almost always fail.
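The threshold rule above can be written down as a short check. The function name and example figures are illustrative assumptions; the three-times advantage and five percent share are the thresholds from the text.

```python
def is_repatriation_candidate(
    workload_cloud_monthly,
    workload_onprem_monthly,
    total_cloud_monthly,
    min_advantage=3.0,
    min_share=0.05,
):
    """Apply the two-part rule: big enough advantage and big enough share."""
    if workload_onprem_monthly <= 0 or total_cloud_monthly <= 0:
        raise ValueError("on-prem cost and total cloud spend must be positive")
    advantage = workload_cloud_monthly / workload_onprem_monthly
    share = workload_cloud_monthly / total_cloud_monthly
    return advantage >= min_advantage and share > min_share


# Hypothetical account spending 1,000,000 dollars a month in total.
print(is_repatriation_candidate(90_000, 25_000, 1_000_000))  # 3.6x, 9% of spend
print(is_repatriation_candidate(40_000, 20_000, 1_000_000))  # only 2x
print(is_repatriation_candidate(30_000, 5_000, 1_000_000))   # 6x, but 3% of spend
```

Only the first workload passes. The third has the best ratio of the three and still fails, because a large multiple on a small line item does not pay for the operational complexity.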

Repatriation is an operating model decision, not a procurement decision. If your team has never run physical infrastructure, the cost savings will be eaten by incident response and capacity planning mistakes for the first eighteen months.

When Repatriation Is The Wrong Move

Cargo-cult repatriation is real and expensive. The 37signals story is convincing, but 37signals had three things most companies do not: a stable workload profile, deep operational expertise from running their own infrastructure for two decades, and a CEO willing to absorb the political risk of being wrong in public. Without all three, you are buying their headline without their substrate.

Skip repatriation if your workload utilization swings more than three to one between peak and trough. The unused capacity will erase any unit-cost advantage. Skip it if you depend heavily on managed services that do not have credible self-hosted equivalents, such as DynamoDB at petabyte scale, Lambda for event fan-out, or Bedrock for rapid model swapping. Skip it if your engineering team is under fifty people, because the operational overhead will swallow your roadmap. Skip it if you are pre-product-market-fit, because optimization at that stage is malpractice.
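Those four skip conditions work well as a pre-flight checklist. The sketch below encodes them directly; the class, its field names, and the sample organization are hypothetical, while the thresholds are the ones stated above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OrgProfile:
    peak_to_trough_ratio: float            # e.g. 4.0 means peak is 4x trough
    managed_services_without_equivalent: List[str]
    engineering_headcount: int
    pre_product_market_fit: bool


def skip_reasons(org):
    """Return every reason from the checklist that applies; empty means proceed."""
    reasons = []
    if org.peak_to_trough_ratio > 3.0:
        reasons.append("utilization swings more than three to one")
    if org.managed_services_without_equivalent:
        services = ", ".join(org.managed_services_without_equivalent)
        reasons.append(f"depends on managed services with no credible self-hosted equivalent: {services}")
    if org.engineering_headcount < 50:
        reasons.append("engineering team is under fifty people")
    if org.pre_product_market_fit:
        reasons.append("company is pre-product-market-fit")
    return reasons


# Hypothetical organization that trips two of the four checks.
org = OrgProfile(
    peak_to_trough_ratio=4.5,
    managed_services_without_equivalent=["DynamoDB"],
    engineering_headcount=120,
    pre_product_market_fit=False,
)
for reason in skip_reasons(org):
    print("Skip:", reason)
```

Returning the full list rather than a single yes or no is deliberate. One tripped check may apply to a single workload rather than the whole company, which fits the per-workload analysis recommended earlier.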

The honest middle position in 2026 is this: most companies should stay on cloud for most workloads, aggressively negotiate enterprise discount programs, and run a hard FinOps practice. A subset of companies with the right workload mix and operational maturity should repatriate two to four specific workloads and capture meaningful savings. A small number of companies should go fully off-cloud. Knowing which group you are in is the entire decision.

The Operational Reality of Owning Hardware Again

The procurement timeline alone is a culture shock for teams that have only known cloud. Lead times for high-density GPU servers in 2026 still range from twelve to twenty weeks for H100 and B200 configurations. Standard compute nodes from Dell, Supermicro, or HPE deliver in eight to twelve weeks. You will need to sign multi-year colocation contracts, often with capacity commitments that look more like real estate than IT. Cross-connects, IP transit, and remote-hands contracts each carry monthly minimums and notice periods. The contractual surface area is meaningful, and underestimating it is the most common cause of repatriation projects that ship six months late.

The skills gap is real. The discipline of capacity planning, the muscle memory of bare-metal provisioning via Tinkerbell or MAAS, the operational rhythm of firmware updates and disk failures, all of these atrophied in the cloud era. Hiring for them in 2026 is harder than it was a decade ago because a generation of engineers has not done this work. The credible path is to partner with a managed colocation provider for the physical layer, retain platform engineering for the orchestration layer, and pay the premium for the years it takes to rebuild the internal capability.

Finally, repatriation is reversible only at significant cost. Once you have signed colocation contracts and bought hardware, the optionality you had in cloud is gone for the duration of the depreciation cycle. If your business plan changes, if you pivot, if you get acquired, the sunk cost of the on-prem footprint becomes friction. Plan accordingly: repatriate workloads whose shape you are highly confident in, not workloads that are still finding their architectural form.

