Cloud vs On-Premises: An Honest Decision Framework
A cloud vs on-premises framework built on real constraints: egress costs, data gravity, compliance, latency, and staffing — and why hybrid is the usual answer.
Executive summary
The cloud versus on-premises decision is a constraint problem, not an ideology problem: egress economics, data gravity, compliance regimes, latency physics, and the staff you can actually hire decide it long before preference does. I spent twelve years running a hosting company on owned hardware, I have operated US ITAR cloud environments, and I have led cloud accounts at enterprise scale. I have watched both all-in positions fail expensively. This article gives the TCO math people skip, the five constraints that dominate the decision, and why a deliberate hybrid posture is the defensible default for most established organizations.
Stop arguing ideology; start pricing constraints
Cloud versus on-premises stopped being an interesting ideological debate years ago. Organizations keep making the decision ideologically anyway — all-in cloud because a slide said so, or back to the datacenter because the last cloud bill hurt.
Cloud is a deployment model, not an objective.
I have lived on every side of this. Twelve years running a hosting company on hardware I owned, powered, and answered for. ITAR-regulated cloud operations. Enterprise accounts where the monthly cloud spend has more digits than most salaries. The pattern across all of it is consistent: the decision is made by constraints, and the constraints are knowable in advance. Price them honestly and the answer usually writes itself — per workload, not per company.
The TCO math people skip
Both sides of the comparison are routinely miscounted, in opposite directions.
The on-prem column undercounts people and time. Hardware is the visible line. The real budget includes power and cooling, facility or colo space, support contracts, the 3–5 year refresh cycle, and — dominant for most organizations — the salaried staff to run it around the clock. A credible on-call rotation is four to six engineers, and that floor exists whether you run two racks or twenty. The same fixed cost is why on-prem improves with scale and utilization. The hosting business taught me that owned hardware run hot, at steady load, over its full life is very hard to beat on cost. It also taught me what that discipline costs in people, which is the half of the equation the hardware enthusiasts skip.
The cloud column undercounts the meter. List price per VM looks competitive; the bill arrives with egress, cross-AZ traffic, NAT gateway processing, IOPS, snapshots, and inter-service calls attached. On the enterprise accounts I have led, the pattern is reliable: compute is the anchor tenant, but data transfer and managed-service line items are where budgets go to die. Add the softer costs — commitment discounts that quietly recreate the capacity planning cloud was supposed to eliminate, and architectures that weld themselves to proprietary services one convenient API at a time.
The honest comparison is total cost per workload over 3–5 years, both columns fully loaded, at your utilization. Not the brochure’s.
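As a minimal sketch of that comparison, the Python below totals both columns for one workload over a fixed horizon. Every field name and every dollar figure is a hypothetical placeholder, not data from a real estate. The only point is that each column carries all of its line items before the two totals are compared.

```python
from dataclasses import dataclass


@dataclass
class OnPremCosts:
    hardware_capex: int          # purchase price, paid again at each refresh
    annual_power_cooling: int
    annual_facility: int         # colo or owned floor space
    annual_support: int          # vendor support contracts
    annual_staff: int            # share of salaried staff attributed to this workload


@dataclass
class CloudCosts:
    annual_compute: int
    annual_storage: int
    annual_egress: int
    annual_cross_az: int
    annual_managed_services: int


def on_prem_tco(c: OnPremCosts, years: int, refresh_years: int) -> int:
    """Fully loaded on-prem cost: hardware per refresh cycle plus recurring costs."""
    purchases = -(-years // refresh_years)  # ceiling division: cycles started in the horizon
    recurring = (c.annual_power_cooling + c.annual_facility
                 + c.annual_support + c.annual_staff)
    return purchases * c.hardware_capex + years * recurring


def cloud_tco(c: CloudCosts, years: int) -> int:
    """Fully loaded cloud cost: compute plus the metered items around it."""
    annual = (c.annual_compute + c.annual_storage + c.annual_egress
              + c.annual_cross_az + c.annual_managed_services)
    return years * annual


# Hypothetical figures for one steady workload over five years.
on_prem = OnPremCosts(400_000, 30_000, 40_000, 25_000, 150_000)
cloud = CloudCosts(220_000, 40_000, 60_000, 15_000, 45_000)

print(on_prem_tco(on_prem, years=5, refresh_years=5))  # 1625000
print(cloud_tco(cloud, years=5))                       # 1900000
```

The sketch does not model utilization directly. The annual figures are meant to be entered at the utilization the workload actually runs at, which is where most of the honesty in the exercise lives.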
Five constraints that decide it
1. Egress and data gravity
Moving data into a cloud is typically free. Moving it out is not; egress commonly sits in the $0.05–0.09/GB band at retail. That asymmetry is not an accident of pricing; it is a strategy, and it creates data gravity: every terabyte you land with a provider raises the price of leaving, and every new application finds it easier to deploy next to the data than to pay to move it. Treat the placement of your primary data stores as the decision with the longest tail, because it is. Workloads that continuously ship large volumes outward (media, backups, telemetry) deserve an explicit egress model before any placement decision. They are also the workloads most often repatriated later, by people who skipped that step.
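An explicit egress model can start very small. The sketch below prices a volume of outbound data at both ends of the retail band quoted above. It assumes a flat per-GB rate and decimal terabytes, and it ignores tiered discounts, free allowances, and negotiated rates. The workload volumes are hypothetical.

```python
RETAIL_EGRESS_LOW = 0.05   # USD per GB, low end of the band quoted above
RETAIL_EGRESS_HIGH = 0.09  # USD per GB, high end of the band quoted above
GB_PER_TB = 1000           # decimal terabytes (assumption)


def egress_cost_range(terabytes: float) -> tuple[float, float]:
    """Return (low, high) USD cost of moving `terabytes` out at a flat per-GB rate."""
    gigabytes = terabytes * GB_PER_TB
    return gigabytes * RETAIL_EGRESS_LOW, gigabytes * RETAIL_EGRESS_HIGH


# Hypothetical workload: ships 50 TB outward every month, holds 500 TB at rest.
monthly_low, monthly_high = egress_cost_range(50)
exit_low, exit_high = egress_cost_range(500)

print(f"monthly egress: ${monthly_low:,.0f} to ${monthly_high:,.0f}")
print(f"one-time exit transfer: ${exit_low:,.0f} to ${exit_high:,.0f}")
```

At these assumed volumes the recurring bill is $2,500 to $4,500 a month, and the data-transfer charge for leaving is $25,000 to $45,000. The second number is data gravity expressed in dollars: it grows with every terabyte landed.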
2. Compliance and jurisdiction
Some regimes make the decision for you. Having run US ITAR cloud operations, I can attest the constraint is broader than “use the government region”: ITAR limits access to controlled technical data to US persons, which reaches into who staffs your support queue, where logs replicate, which vendors hold credentials, and how disaster recovery is architected. Data residency laws and sector rules (healthcare, payments, critical infrastructure) draw similar hard walls. The framework rule: identify regulated data first, place it first, and let everything else orbit that decision. Retrofitting compliance onto an already-placed workload costs multiples of doing it in order.
3. Latency and physics
The speed of light is not negotiable.
Workloads coupled to a physical process — plant floors, trading, real-time control, heavy office traffic to a local file store — belong near the process. I spent years supporting petrochemical plant networks, and “the control system’s dependencies live 600 miles away behind an internet circuit” is not an architecture; it is an incident report waiting for a datestamp. Conversely, a global user base is a latency argument for cloud edges. Either way the WAN carries the whole decision — placement choices are only as good as the network architecture connecting them.
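The physics can be put in numbers. The sketch below computes a lower bound on round-trip time over fiber for a given distance, assuming light in glass travels at roughly two thirds of its vacuum speed and that the fiber runs in a straight line. Real paths are longer, and routing, queuing, and processing add delay on top, so measured latency will be higher than this floor.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
FIBER_VELOCITY_FACTOR = 2 / 3      # approximate slowdown in glass fiber (assumption)
KM_PER_MILE = 1.609344


def min_rtt_ms(distance_miles: float) -> float:
    """Lower bound on round-trip time in milliseconds over straight-line fiber."""
    distance_km = distance_miles * KM_PER_MILE
    one_way_seconds = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_VELOCITY_FACTOR)
    return 2 * one_way_seconds * 1000


# The 600-mile example from the text.
print(f"{min_rtt_ms(600):.1f} ms minimum round trip")
```

For 600 miles the floor is a little under 10 ms per round trip before any equipment or congestion is counted. A control loop that makes several sequential calls across that circuit pays the floor on each one.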
4. Elasticity and workload shape
Steady 24/7 load at high utilization favors owned capacity. Spiky, seasonal, or experimental load favors rented capacity — paying for peak hardware that idles 340 days a year is how on-prem loses. The discipline is being honest about which shape each workload really has, because optimism about utilization is the oldest error in capacity planning. Most estates contain both shapes, which is the quiet argument for hybrid.
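One way to make the workload-shape question concrete is a break-even utilization: the fraction of the month a rented unit must run before it costs as much as an owned one. The sketch below assumes a fixed, fully loaded monthly cost for owned capacity and a pure pay-per-hour price for rented capacity that can be switched off when idle. Both prices are hypothetical.

```python
HOURS_PER_MONTH = 730  # average hours in a month (365 * 24 / 12)


def break_even_utilization(owned_monthly_cost: float, rented_hourly_rate: float) -> float:
    """Fraction of the month a rented unit must run to cost as much as owning."""
    return owned_monthly_cost / (rented_hourly_rate * HOURS_PER_MONTH)


def cheaper_home(utilization: float, owned_monthly_cost: float,
                 rented_hourly_rate: float) -> str:
    """Compare a fixed owned cost with a pay-per-hour rented cost at a given utilization."""
    rented_monthly_cost = utilization * HOURS_PER_MONTH * rented_hourly_rate
    return "own" if owned_monthly_cost < rented_monthly_cost else "rent"


# Hypothetical figures: $1,200/month fully loaded to own, $3.00/hour to rent.
OWNED, RENTED = 1200.0, 3.00

threshold = break_even_utilization(OWNED, RENTED)
print(f"break-even utilization: {threshold:.1%}")

peak_only = 25 / 365   # busy 25 days a year, idle the other 340
print(cheaper_home(peak_only, OWNED, RENTED))  # rent
print(cheaper_home(0.90, OWNED, RENTED))       # own
```

With these assumed prices the threshold sits near 55 percent. The useful habit is computing the threshold first and then checking measured utilization against it, rather than estimating utilization optimistically and reasoning backward.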
5. Staffing and the market you hire in
The rarest resource is not hardware or credits. It is engineers. Running on-prem well requires virtualization, storage, network, and facilities skills that get harder to hire every year. Running cloud well requires FinOps and platform engineering skills that are expensive in a different way. The constraint binds in both directions, and the honest question is which team you can actually build and retain in your market. An architecture your staff cannot operate is wrong no matter how elegant the diagram — because every architecture eventually becomes an operations problem, and the operators decide whether it succeeds.
The framework, compressed
| Constraint | Pushes on-prem / dedicated | Pushes cloud |
|---|---|---|
| Cost shape | Steady, high-utilization, predictable | Spiky, uncertain, short-lived |
| Data | Large, heavy egress, gravity you control | Small, or born-in-cloud |
| Compliance | Access/residency restrictions (ITAR-class) | Certifications you inherit (with care) |
| Latency | Coupled to a site or process | Globally distributed users |
| Staffing | Strong infra team in place | Small team, platform skills |
Score each significant workload against the table. Portfolios almost never land on one side — which is the point.
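Scoring against the table can be sketched in a few lines. The version below gives each constraint a score of +1 (pushes on-prem or dedicated), -1 (pushes cloud), or 0 (neutral) and sums them with equal weight. The equal weighting, the names, and the example workloads are all assumptions for illustration. In practice a hard regulatory requirement would override the sum rather than count as one vote.

```python
from dataclasses import dataclass, field

# The five rows of the table above.
CONSTRAINTS = ("cost_shape", "data", "compliance", "latency", "staffing")


@dataclass
class Workload:
    name: str
    # Per-constraint score: +1 pushes on-prem/dedicated, -1 pushes cloud, 0 is neutral.
    scores: dict = field(default_factory=dict)


def placement(workload: Workload) -> str:
    """Sum the per-constraint scores and report which side the workload leans toward."""
    unknown = set(workload.scores) - set(CONSTRAINTS)
    if unknown:
        raise ValueError(f"unknown constraints: {sorted(unknown)}")
    total = sum(workload.scores.get(c, 0) for c in CONSTRAINTS)
    if total > 0:
        return "on-prem/dedicated"
    if total < 0:
        return "cloud"
    return "needs review"


# Hypothetical portfolio entries.
warehouse = Workload("data-warehouse", {"cost_shape": 1, "data": 1, "staffing": 1})
campaign = Workload("campaign-site", {"cost_shape": -1, "data": -1, "latency": -1})

for w in (warehouse, campaign):
    print(f"{w.name}: {placement(w)}")
```

Run over a whole portfolio, a function like this tends to return a mix of answers, which matches the observation that portfolios rarely land on one side.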
Hybrid as the default, but deliberately
The defensible default for an established organization is hybrid: regulated, steady, data-heavy workloads on owned or dedicated infrastructure; elastic and experimental workloads in cloud; both under one operating model — one identity plane, one observability stack, one infrastructure-as-code discipline spanning both sides. That last clause is what separates deliberate hybrid from accidental hybrid. Accidental hybrid — two estates, two toolchains, two half-staffed teams — is the most expensive architecture in the industry, and it is also the most common.
Deliberate hybrid needs three artifacts. A placement policy anyone can apply without convening a committee. A network and identity fabric designed for two homes from the start — the shape I document in the hybrid cloud landing zone. And an exit-cost line item for every major placement: what it would cost, in dollars and months, to move this workload. You will rarely spend it. Knowing it changes how you negotiate and how you architect.
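The exit-cost line item can be estimated with a short calculation. The sketch below expresses it in dollars (data-transfer charges plus migration labor) and months (the longer of the engineering effort and the raw copy time, assuming the two run in parallel). The field names and all figures are hypothetical, and the model leaves out items such as parallel running costs and contract penalties.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class ExitPlan:
    stored_tb: float              # data that must leave the current home (decimal TB)
    egress_rate_per_gb: float     # USD per GB charged to move it out
    link_gbps: float              # bandwidth available for the migration
    link_efficiency: float        # usable fraction of the link, between 0 and 1
    engineer_months: float        # estimated migration effort
    engineer_monthly_cost: float  # fully loaded cost per engineer-month
    team_size: int                # engineers working in parallel


def exit_dollars(p: ExitPlan) -> float:
    """Data-transfer charges plus migration labor."""
    transfer = p.stored_tb * 1000 * p.egress_rate_per_gb
    labor = p.engineer_months * p.engineer_monthly_cost
    return transfer + labor


def transfer_days(p: ExitPlan) -> float:
    """Raw copy time over the available link."""
    bits = p.stored_tb * 1e12 * 8
    usable_bits_per_second = p.link_gbps * 1e9 * p.link_efficiency
    return bits / usable_bits_per_second / SECONDS_PER_DAY


def exit_months(p: ExitPlan) -> float:
    """Elapsed time: the longer of engineering work and data copy (30-day months)."""
    return max(p.engineer_months / p.team_size, transfer_days(p) / 30)


# Hypothetical placement: 500 TB, retail egress, a 10 Gbps link, three engineers.
plan = ExitPlan(500, 0.09, 10, 0.7, 12, 20_000, 3)

print(f"exit cost: ${exit_dollars(plan):,.0f}")
print(f"copy time: {transfer_days(plan):.1f} days")
print(f"elapsed:   {exit_months(plan):.1f} months")
```

Revisiting the inputs once a year is usually enough to keep the figure usable in a negotiation.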
What to write down
- The fully loaded 3–5 year TCO for your top ten workloads, both columns, with utilization assumptions stated.
- The placement policy: which constraints force which home, signed by someone accountable.
- The regulated-data inventory and where each class is allowed to live.
- Exit cost per major placement, revisited annually.
Neither the cloud bill nor the datacenter is the enemy. The enemy is placement by default. The cure is a framework boring enough that the next workload gets placed by constraint in an afternoon instead of by conviction in a quarter-long steering committee — and boring frameworks, like boring infrastructure, are the ones that quietly keep working.
Frequently asked questions
Is on-premises cheaper than cloud?
For steady, predictable, high-utilization workloads, usually yes, and often substantially, once you count cloud egress and the premium for always-on capacity. For spiky, uncertain, or fast-changing workloads, elasticity typically wins. The honest comparison loads the on-prem column with staff, power, facilities, and refresh cycles, and loads the cloud column with egress, cross-AZ traffic, and the managed services you will quietly grow dependent on.
What is data gravity and why does it matter?
Data gravity is the tendency of applications, analytics, and new projects to accumulate around wherever large data already lives, because moving the data is slow and expensive. Egress fees turn gravity from a technical effect into a commercial mechanism: every terabyte you place with a provider raises the cost of ever leaving. That makes the placement of your primary data stores the most strategic infrastructure decision you own.
When does compliance force on-premises or a special cloud region?
When the regime restricts who may access data or where it may reside. ITAR, for example, limits access to controlled technical data to US persons, which pushes workloads into government or sovereign cloud regions or onto controlled on-premises infrastructure. Having operated under that regime, I can say the constraint reaches past servers into support staffing, logging pipelines, and every vendor holding a login. Data residency laws and some sector rules draw similar hard walls.
Why is hybrid cloud the default answer?
Because real portfolios contain both kinds of workloads: steady, data-heavy, or regulated systems that favor owned or dedicated infrastructure, and elastic, experimental, or globally distributed systems that favor cloud. A deliberate hybrid (each workload placed by constraint, connected by a well-designed network under one operating model) beats both all-in positions. Accidental hybrid, with no placement rules, delivers the costs of both and the benefits of neither.