A shoe company taught me most of what I needed to know about neocloud.
This spring, Allbirds (the sustainable-sneaker brand, the one that makes shoes out of wool) announced it was selling off its footwear business to reinvent itself as "NewBird AI," a GPU-as-a-Service cloud provider. The stock jumped roughly 580% in a single day. The money behind the pivot was about $50 million, which, at thirty to forty thousand dollars per NVIDIA H100, buys you on the order of fifteen hundred GPUs, a rounding error against what a real AI cloud runs on. The market didn't care. You could bolt "AI cloud" onto a wool-shoe company and nearly seven-x the valuation by lunch.
That's the noun. Hold onto it, because the entire discipline here is learning to tell the noun apart from the thing underneath it.
What "neocloud" claims to be
A neocloud is a cloud built from the ground up around GPU-as-a-Service: bare-metal performance, fast interconnect, a deliberately narrow catalog — set against the hyperscalers' sprawling menus. The spelling isn't even settled yet (neocloud, neo-cloud), which is its own small tell: this is a coinage still drying.
The field is real and crowded. CoreWeave is the bellwether; Lambda, Crusoe, Nebius, Nscale, RunPod, and Vast.ai round out the front rank, with something like 190 distinct operators worldwide. It's also still a minority of the market — hyperscalers held north of 80% of global AI compute as of early 2025, and credible GPUaaS revenue estimates cluster around $23–25 billion for 2025. (You'll see far bigger numbers thrown around — a "$250 billion by 2030" figure made the rounds, but it doesn't survive contact with the analyst's own press release, which says closer to $65 billion. When the market-sizing is this inflated, that itself is data.)
But the workload is real
Here's where you have to be honest, because the lazy take is to wave the whole thing off as hype. It isn't. The cost advantage is structural and explainable, not vendor fiction.
The raw numbers: specialist H100 capacity runs roughly $2.49–6.16 per GPU-hour (Lambda near the bottom, CoreWeave near the top) against about $14.24 for a B200 on AWS. The B200 is a newer chip, so read that comparison as directional rather than like-for-like. Still, the gap comes from real engineering. Stripping the hypervisor (true bare metal) recovers the 15–25% tax that virtualization imposes. InfiniBand fabric delivers up to 10x faster training than commodity 10-gig Ethernet, which is the difference between a training step at 4.4 seconds and one at 39.8, a roughly 9x gap. Egress that costs $0.08–0.12 a gigabyte at a hyperscaler comes bundled. A purpose-built bill of materials and network design shave another ~11% off cluster CapEx.
And the clincher, the fact that should end any "it's all hype" argument: Microsoft itself rents on the order of $200 million a month of GPU compute through neoclouds while operating its own datacenters. When one of the largest clouds on earth is a customer of the upstarts, the workload underneath the noun is unmistakably real.
One caveat worth internalizing, because it trips people up: the headline per-hour price is not cost-per-unit-of-work. Newer silicon can win on cost-per-token while losing on sticker price, and utilization (pipeline optimization can cut training cost ~40%) and commitment (reserved contracts, 30–60% off) move the real number as much as the chip does.
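To make that caveat concrete, here is a minimal sketch of the arithmetic, not a benchmark. The two hourly prices are the ones quoted above. The throughput figures, the function name, and its parameters are hypothetical placeholders, chosen only to show how the ranking can flip.

```python
def cost_per_million_tokens(price_per_gpu_hour, tokens_per_second,
                            utilization=1.0, discount=0.0):
    """Effective dollars per million tokens on one GPU.

    utilization: fraction of each paid hour spent on useful work (0 < u <= 1).
    discount: fractional price cut from a reserved contract (0 <= d < 1).
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    effective_price = price_per_gpu_hour * (1 - discount)
    useful_tokens_per_hour = tokens_per_second * 3600 * utilization
    return effective_price / useful_tokens_per_hour * 1_000_000


# Prices come from the text; throughputs are ASSUMED, purely for illustration.
older_chip = cost_per_million_tokens(2.49, tokens_per_second=1_000)
newer_chip = cost_per_million_tokens(14.24, tokens_per_second=8_000)

# Same older chip, but doing useful work only half the time: the sticker
# price is unchanged, while the cost per token doubles.
older_half_idle = cost_per_million_tokens(2.49, tokens_per_second=1_000,
                                          utilization=0.5)

print(f"older chip:           ${older_chip:.3f} per 1M tokens")
print(f"newer chip:           ${newer_chip:.3f} per 1M tokens")
print(f"older chip, 50% idle: ${older_half_idle:.3f} per 1M tokens")
```

With these made-up throughputs, the chip that costs nearly six times as much per hour comes out cheaper per token, and halving utilization doubles the effective cost of the cheaper one. Swap in measured throughput for your own model before drawing any conclusion.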
The edge is already eroding
So the workload is real. But — and this is the part the binary "noun vs. workload" framing misses — real doesn't mean durable.
Once you account for power, labor, and depreciation, bare-metal-as-a-service margins collapse to 14–16%. Price-tracking across two dozen marketplaces already shows commoditization pressure as GPU supply expands. The honest read is that raw GPU access is on its way to becoming a commodity, and the durable value is migrating up — to orchestration, networking, and plain operational reliability. A 512-GPU cluster has a mean time between failures measured in days; "the drivers and scheduler just work, the cluster doesn't fall over" is what lets the good operators hold a premium. The early neoclouds that sold nothing but stacked hardware mostly died. That's the tell that the moat was never the silicon.
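The "measured in days" figure follows from simple reliability arithmetic. The sketch below assumes every GPU fails independently at a constant rate, which is a simplification, and it uses a hypothetical per-GPU MTBF of 50,000 hours. Neither that figure nor the function names come from a vendor datasheet.

```python
import math


def cluster_mtbf_hours(per_gpu_mtbf_hours, gpu_count):
    """MTBF of the whole cluster when any single GPU failure counts as a failure.

    With independent, constant-rate failures the rates add, so the cluster
    MTBF is the per-GPU MTBF divided by the number of GPUs.
    """
    return per_gpu_mtbf_hours / gpu_count


def prob_job_survives(job_hours, per_gpu_mtbf_hours, gpu_count):
    """Probability that a job of the given length sees no GPU failure at all."""
    mtbf = cluster_mtbf_hours(per_gpu_mtbf_hours, gpu_count)
    return math.exp(-job_hours / mtbf)


ASSUMED_GPU_MTBF_HOURS = 50_000  # hypothetical, for illustration only

mtbf_days = cluster_mtbf_hours(ASSUMED_GPU_MTBF_HOURS, 512) / 24
survive_one_day = prob_job_survives(24, ASSUMED_GPU_MTBF_HOURS, 512)

print(f"512-GPU cluster MTBF: {mtbf_days:.1f} days")
print(f"chance a 24-hour job sees no failure: {survive_one_day:.0%}")
```

Under that assumption, a part that individually fails about once in almost six years of continuous use still produces a cluster-level failure roughly every four days. That is why checkpointing, scheduling, and fast node replacement carry the premium rather than the silicon itself.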
The noun has a balance sheet
The other half of the honest picture: the "noun" isn't just marketing. It's leverage.
CoreWeave's debt grew by nearly $3.5 billion in a single quarter, to about $24.86 billion; interest expense alone is already 25.8% of revenue and climbing. The model is debt collateralized by GPUs — an asset that has depreciated 60–75% from peak. Roughly 62% of CoreWeave's 2024 revenue came from a single customer (Microsoft). And the whole sector hums with circular financing: hyperscalers shifting capex into opex via neocloud commitments, the chip vendor extending financing that flows back into orders for its own chips. None of that makes the workload fake. It does mean the business is a bet on AI capex staying hot and interest rates behaving — and it's why a wool-shoe company could raise fifty million and watch its stock detonate upward. The hype has a balance sheet, and you should read it.
Cloud's oldest move
Step back far enough and "neocloud" stops looking new at all. It's the latest rung on a ladder cloud has been climbing for fifteen years: the perpetual re-bundling of compute, storage, and network around whatever workload and silicon dominate the moment.
Watch it on two axes. Billing granularity: EC2 instances became spot instances (AWS quietly killed the real auction market in 2017), became per-second billing, became AWS Lambda functions metered by the millisecond. Silicon: x86 became ARM, became GPU, became TPU, and is now becoming inference-specific ASICs. The ARM step alone shows the pattern: AWS's Graviton line marched from 16 cores to 192, each generation tuned harder at the workload, with Graviton3 landing up to 3x better machine-learning performance than its predecessor. The engine underneath is physics: the end of Dennard scaling and the slowing of Moore's Law mean you stop getting free general-purpose speed, so you specialize. Every step trades generality for efficiency at the dominant workload, gets a fresh name, and gets sold as a revolution.
Neocloud is the GPU rung. I've watched this rerun enough times to recognize the shape before the marketing does.
The moving target
Which raises the only question that really matters for anyone signing a contract: how long is this rung going to hold?
The next one is already visible. Meta is in talks with Google for a multi-billion-dollar TPU deployment. Anthropic has committed to over a million TPUs. TPUs already post 15–30x the performance and 30–80x the performance-per-watt of legacy general-purpose chips at their target workload, and as inference overtakes training, more of the work moves onto cheaper, more specialized silicon. Further out sit photonic and neuromorphic designs with eye-watering efficiency numbers, and quantum, which remains genuinely far off.
The risk this creates for a single-bet, GPU-only neocloud is precise: getting stranded on a silicon generation — which is the same obsolescence that threatens the debt. But don't overcorrect into the dismissive take either: about 70% of Azure's AI workloads still ran on NVIDIA late last year, NVIDIA shipped some six million Blackwell GPUs in a year, and the CUDA ecosystem is a moat unto itself. The swing is real. It is not instant.
The questions that outlast the noun
Strip the branding off any "new cloud" — neocloud today, whatever's next in three years — and the architect's job collapses to five questions:
- What's the real primitive being sold — and is it the raw resource, which commoditizes, or the operational layer around it, which doesn't?
- What's the billing granularity, and does it actually match my workload's duration and utilization profile?
- What's the lock-in half-life — across silicon, ecosystem, and contract — and what's my exit when the pendulum swings?
- What happens when the silicon shifts — am I buying compute, or am I buying a wager on one chip generation?
- Who absorbs this — does the specialist persist, or do the hyperscalers (and a design-services oligopoly quietly consolidating around one vendor) re-absorb the niche? Follow the balance sheet, not the brand.
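The billing-granularity question in particular reduces to rounding. Here is a minimal sketch, assuming a provider rounds usage up to the next whole billing unit (real contracts may add minimum charges or commitments). The function name and the 90-second job are hypothetical, and the $2.49 hourly rate is the low-end H100 price quoted earlier.

```python
def billed_cost(duration_ms, granularity_ms, rate_per_hour):
    """Cost of one job when usage is rounded UP to whole billing units.

    Durations are integer milliseconds so the ceiling division is exact.
    """
    units = -(-duration_ms // granularity_ms)  # integer ceiling division
    billed_hours = units * granularity_ms / 3_600_000
    return billed_hours * rate_per_hour


RATE = 2.49      # dollars per GPU-hour
JOB_MS = 90_000  # a hypothetical 90-second job

per_hour = billed_cost(JOB_MS, 3_600_000, RATE)
per_second = billed_cost(JOB_MS, 1_000, RATE)
per_millisecond = billed_cost(JOB_MS, 1, RATE)

print(f"hourly billing:          ${per_hour:.5f}")
print(f"per-second billing:      ${per_second:.5f}")
print(f"per-millisecond billing: ${per_millisecond:.5f}")
```

For a job this short, hourly billing charges forty times what per-second billing does, while moving from seconds to milliseconds changes nothing. For a training run that lasts days, all three converge, which is why granularity only matters relative to the workload's duration.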
So: is neocloud real?
Yes, and it doesn't matter — because that was never the right question. Neocloud is the GPU rung of cloud's permanent re-bundling ladder: a real workload shift, wrapped in a financing story, whose raw-compute advantage is already commoditizing toward the operational layer above the silicon. The architect ignores the noun, dates the workload's half-life, and watches for the next swing.
These are the patterns that outlast the next round of vendor announcements. "Neocloud" is a vendor announcement. Buy the pattern, not the noun.
