Supply Capacity Model: Findings
Author: automated analysis pipeline
Date: 2026-05-03
Status: supply capacity model complete (sprint 1 skeleton + sprint 2 sourced inputs). Allocation-layer handoff parameters are at the bottom.
1. Summary
Under sourced base-case assumptions, global usable AI compute capacity grows ~46%/yr from 2024 through 2040, reaching ~1.7e+31 FLOP/year by 2040, about 400× (roughly 2.6 orders of magnitude) above 2024. The binding constraint is capex through 2036, then chips for the remainder. None of the four scenarios reproduces the historical Rule A 2018+ rate of 5.97×/yr (~497% CAGR); compounded at that rate, the gap to modeled supply reaches roughly 10 orders of magnitude by 2040.
Three honest readings of the historical-vs-supply gap:
- The historical compute trend is single-frontier-run growth, not aggregate supply growth. The two diverge if (and only if) frontier runs use a growing share of total compute, which they almost certainly did 2018–2024. The allocation layer is where this is quantified.
- The historical 6×/yr rate is unsustainable on supply fundamentals, regardless of share dynamics. Even capex-rich (50%/yr, i.e. 1.5×/yr) grows about 4× slower each year than the historical baseline (5.97×/yr), a shortfall of roughly 0.6 OOM per year.
- Sourced inputs increased supply output growth by ~8 percentage points vs sprint 1's placeholders (38% → 46% base case CAGR), but did not close the gap. The conclusion holds.
| Scenario | 2024 (FLOP/yr) | 2040 (FLOP/yr) | CAGR | Binding 2030 |
|---|---|---|---|---|
| Baseline continuation | 3.97e+28 | 1.65e+31 | 45.7%/yr | capex |
| Capex-rich | 4.37e+28 | 2.89e+31 | 50.1%/yr | capex |
| Chip-constrained | 3.83e+28 | 6.54e+30 | 37.9%/yr | chip |
| Power/DC-constrained | 3.50e+28 | 6.64e+30 | 38.8%/yr | datacenter |
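
The CAGR column and the order-of-magnitude gap quoted in the summary can be rechecked from the table's endpoints. The following is a minimal sketch, not part of the pipeline: the function and variable names are illustrative, and the historical extrapolation assumes the 5.97×/yr multiple is simply compounded over the same 16 years from the same starting level.

```python
import math

# (2024, 2040) usable compute in FLOP/yr, copied from the scenario table above.
SCENARIOS = {
    "baseline": (3.97e28, 1.65e31),
    "capex_rich": (4.37e28, 2.89e31),
    "chip_constrained": (3.83e28, 6.54e30),
    "power_dc_constrained": (3.50e28, 6.64e30),
}
YEARS = 2040 - 2024          # 16 compounding periods
HISTORICAL_MULTIPLE = 5.97   # per-year growth multiple quoted in the summary


def cagr(start, end, years):
    """Compound annual growth rate as a fraction (0.46 means 46%/yr)."""
    return (end / start) ** (1.0 / years) - 1.0


def orders_of_magnitude(multiple):
    """Base-10 logarithm of a growth multiple."""
    return math.log10(multiple)


for name, (start, end) in SCENARIOS.items():
    print(f"{name:22s} CAGR = {cagr(start, end, YEARS):6.1%}")

base_start, base_end = SCENARIOS["baseline"]
supply_ooms = orders_of_magnitude(base_end / base_start)
historical_ooms = YEARS * orders_of_magnitude(HISTORICAL_MULTIPLE)
gap_ooms = historical_ooms - supply_ooms
print(f"supply: {supply_ooms:.1f} OOM, historical: {historical_ooms:.1f} OOM, gap: {gap_ooms:.1f} OOM")
```

Because the table endpoints are rounded to three significant figures, the recomputed rates can differ from the CAGR column in the last digit.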
2. Data sources (sprint 2)
Source codes used in data/assumptions/supply_input_assumptions.yaml:
| Code | Source | Date |
|---|---|---|
| epoch_2024_scaling | Epoch AI, "Can AI scaling continue through 2030?" | 2024-08-20 |
| iea_energy_ai | IEA, "Energy and AI" report | 2025-04-10 |
| nvidia_h100_spec | NVIDIA H100 datasheet | n/a |
| nvidia_ir | NVIDIA Investor Relations / 10-K filings | various |
| hyperscaler_10k | Microsoft / Alphabet / Meta / Amazon 10-K capex disclosures | various |
| stargate_announce | OpenAI / MSFT / Oracle / SoftBank Stargate announcement | 2025 |
| industry_typical | Conservative industry-typical figures (PUE, server overhead, MFU) | n/a |
| author_synthesis | Multi-source synthesis with cited anchors | 2026-05 |
Confidence flags per row:
- high: directly cited from named source
- medium: derived/synthesized, anchored to cited figures
- low: round-number placeholder retained from sprint 1 (≈18% of rows, mostly long-horizon 2040 values)
Most heavily used anchors:
- 2024 global DC electricity: 415 TWh (IEA, ~1.5% of world total)
- 2030 base-case DC electricity: 945 TWh (IEA)
- 2024 NVIDIA H100/H200 shipments: 1.5–2M units (Epoch median)
- 2030 H100-eq stock for training: 20M–400M units, median 100M (Epoch)
- US AI-DC 2023: ~3 GW, total US DC 2030: ~90 GW (Epoch)
- Power efficiency 2024→2030: 24× vs Llama 3.1 405B (Epoch)
3. Inclusion criteria (assumption table)
- Long-format CSV: (parameter, scenario, year, value, unit, source, confidence, notes).
- 173 rows covering 15 parameters × 4 scenarios at milestone years (2024, 2025, 2026, 2030, 2040), linearly interpolated between milestones.
- Constants stored as a single milestone (held constant across years).
- Values supplied to 2 significant figures except where the source uses 3.
- Comments (lines starting with #) used liberally in the CSV for source notes.
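
The milestone convention above (linear interpolation between milestone years, a single milestone for constants) can be sketched as follows. This is a hedged illustration rather than the engine's code: the `interpolate` helper is hypothetical, clamping outside the milestone range is an assumption, and the example series reuses the AI-dedicated power capacity values from the handoff block in section 9.

```python
from bisect import bisect_right


def interpolate(milestones, year):
    """Linearly interpolate a parameter between milestone years.

    `milestones` maps year -> value. A single-milestone series is treated
    as a constant. Years outside the milestone range are clamped to the
    nearest milestone (an assumption of this sketch).
    """
    years = sorted(milestones)
    if len(years) == 1 or year <= years[0]:
        return milestones[years[0]]
    if year >= years[-1]:
        return milestones[years[-1]]
    hi = bisect_right(years, year)   # index of the first milestone after `year`
    y0, y1 = years[hi - 1], years[hi]
    v0, v1 = milestones[y0], milestones[y1]
    return v0 + (v1 - v0) * (year - y0) / (y1 - y0)


# Base-case AI-dedicated power capacity in MW (values from section 9).
ai_dc_mw = {2024: 12000.0, 2025: 22000.0, 2026: 38000.0, 2030: 80000.0, 2040: 300000.0}

# Hypothetical constant stored as a single milestone.
example_constant = {2024: 1.2}

print(interpolate(ai_dc_mw, 2028))          # halfway between the 2026 and 2030 milestones
print(interpolate(example_constant, 2035))  # held constant across years
```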
4. Model architecture
shipments_t (H100-eq)
↓ + linear retirement (5–6y depending on scenario)
installed_stock_h100e_chip_limited_t
┐ ┐ ┐
power_lim dc_lim capex_lim_t (each computed independently)
└─┬─┘─┬─┴─┬─┘
min: available_stock_h100e_t
↓ × peak_flops × seconds/year
available_compute_flop_year_pre_util
↓ × cluster_utilization
usable_compute_flop_year_t
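
The last three stages of the diagram (take the minimum of the independent limits, convert stock to FLOP/year, apply utilization) can be sketched as below. This is a minimal illustration under stated assumptions, not the contents of model/supply_engine.py: the peak throughput and utilization constants are assumed values that this memo does not quote, and all limits except the capex figure are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # about 3.156e7

# Assumed values for illustration only; neither is quoted in this memo.
ASSUMED_PEAK_FLOPS = 9.89e14   # approx. dense BF16 peak of one H100 SXM, FLOP/s
ASSUMED_UTILIZATION = 0.40     # hypothetical cluster_utilization


def available_stock(limits):
    """Return the binding (smallest) H100-eq limit and the name of its constraint."""
    binding = min(limits, key=limits.get)
    return limits[binding], binding


def usable_compute(stock_h100e, peak_flops, utilization):
    """FLOP/year delivered by a stock of H100-equivalents."""
    pre_util = stock_h100e * peak_flops * SECONDS_PER_YEAR
    return pre_util * utilization


# Hypothetical limits in H100-eq units; only the capex figure matches the
# 2024 base-case stock reported in this memo.
limits_2024 = {"chip": 4.0e6, "power": 9.0e6, "datacenter": 9.0e6, "capex": 3.18e6}

stock, binding = available_stock(limits_2024)
flop_per_year = usable_compute(stock, ASSUMED_PEAK_FLOPS, ASSUMED_UTILIZATION)
print(f"binding={binding}, stock={stock:.3g} H100e, usable={flop_per_year:.3g} FLOP/yr")
```

With these assumed constants, the 3.18M H100e base-case stock lands near the 3.97e+28 FLOP/yr reported for 2024. That is a consistency check on the assumptions, not a confirmation of the model's actual inputs.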
Power vs DC differentiation (new in sprint 2):
power_limited_h100e = (ai_dc_mw × ai_share × 1000) / (chip_kw × server_oh × pue)
dc_limited_h100e = (ai_dc_mw × dc_packing_efficiency × 1000) / (chip_kw × server_oh × pue)
dc_packing_efficiency ∈ [0, 1] captures cooling, slot, transformer, and permitting drag: the fraction of nominal AI-DC MW that can actually host frontier accelerators. It is 1.0 in base_input_case, capex_rich, and chip_bottleneck, and falls to 0.65 by 2040 in power_datacenter_bottleneck.
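
The two formulas above translate directly into code. The sketch below is illustrative: the 12,000 MW figure is the 2024 anchor from this memo, while ai_share, chip_kw, server_oh, and pue are assumed example values, not the sourced inputs.

```python
def power_limited_h100e(ai_dc_mw, ai_share, chip_kw, server_oh, pue):
    """H100-eq units supportable by available power (MW converted to kW)."""
    return (ai_dc_mw * ai_share * 1000) / (chip_kw * server_oh * pue)


def dc_limited_h100e(ai_dc_mw, dc_packing_efficiency, chip_kw, server_oh, pue):
    """H100-eq units that the physical data-center buildout can host."""
    if not 0.0 <= dc_packing_efficiency <= 1.0:
        raise ValueError("dc_packing_efficiency must lie in [0, 1]")
    return (ai_dc_mw * dc_packing_efficiency * 1000) / (chip_kw * server_oh * pue)


# Assumed example inputs (hypothetical except ai_dc_mw).
example = dict(ai_dc_mw=12000.0, chip_kw=0.7, server_oh=1.5, pue=1.2)

power_limit = power_limited_h100e(ai_share=1.0, **example)
dc_limit_full = dc_limited_h100e(dc_packing_efficiency=1.0, **example)
dc_limit_drag = dc_limited_h100e(dc_packing_efficiency=0.65, **example)

print(f"power limit:             {power_limit:.3g} H100e")
print(f"DC limit (packing 1.0):  {dc_limit_full:.3g} H100e")
print(f"DC limit (packing 0.65): {dc_limit_drag:.3g} H100e")
```

When ai_share is 1.0 and dc_packing_efficiency is 1.0 the two limits coincide, so the datacenter constraint only separates from the power constraint once packing efficiency drops below the AI share.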
5. Headline results by scenario
Baseline continuation
- CAGR 2024→2040: 45.7%/yr. Reaches 1.65e+31 FLOP/yr by 2040.
- Binding constraint: capex through 2036, then chips 2037+.
- Stock: 3.18M H100e (2024) → 959M H100e (2040), ~300× growth.
- Capex required ramps from $0.2T/yr (2024) to ~$3T/yr (2040); coverage stays > 70% throughout (capex shortfall, not surplus).
Capex-rich acceleration
- CAGR 2024→2040: 50.1%/yr. Reaches 2.89e+31 FLOP/yr by 2040.
- Binding constraint: chips (2024–2027), then capex (2028–2034), then chips again (2035+). The mid-period capex bind reflects cumulative capex falling behind cumulative chip-stock value.
- Stock: 3.5M (2024) → 1.68B (2040), roughly 1.75× the base case by 2040.
Chip-constrained
- CAGR 2024→2040: 37.9%/yr. Reaches 6.54e+30 FLOP/yr by 2040 (~40% of base).
- Binding constraint: capex 2024–2026, chips 2027–2040.
- Stock: 2.73M → 349M.
- Driven entirely by Epoch's 20M-H100e lower-bound 2030 stock and a 6-year (vs 5-year) refresh cycle.
Power / data-center constrained
- CAGR 2024→2040: 38.8%/yr. Reaches 6.64e+30 FLOP/yr by 2040.
- Binding capex 2024–2026, datacenter 2027+ (sprint-2 differentiation activated; the new dc_packing_efficiency parameter does the work).
- Stock: 2.8M → 387M.
- Power capacity itself never binds in any scenario under our 24× efficiency improvement — the binding is on physical buildout density, not raw MW.
6. Cost variant trajectories (historical-baseline carryover)
Per the historical baseline's most important finding (cost-variant divergence is bigger than rule-choice divergence), the supply capacity model carries three cost-per-H100e-year variants forward.
Base scenario, 2024 → 2040:
| Variant | 2024 USD/H100e/yr | 2040 USD/H100e/yr | Ratio |
|---|---|---|---|
| Upfront-amortized (chip × cluster_mult / lifetime) | 13,200 | 4,550 | 0.34× |
| Cloud-rental | 15,000 | 5,000 | 0.33× |
| Blended 50/50 | 14,100 | 4,775 | 0.34× |
Cloud runs ~14% above upfront in 2024, narrowing to ~10% by 2040. Under chip-bottleneck the cloud rate runs ~50–80% higher than under base case, peaking at the supply-tightest moments. Chart: outputs/charts/supply_cost_per_h100e_by_variant.png.
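
The three variants relate to each other through simple arithmetic, which the sketch below reproduces from the table values. The helper names are illustrative, and the chip price, cluster multiplier, and lifetime passed to `upfront_amortized` are hypothetical inputs chosen only to show the formula's shape; they are not the model's sourced values.

```python
def upfront_amortized(chip_price_usd, cluster_mult, lifetime_years):
    """Upfront cost per H100e-year: chip × cluster_mult / lifetime."""
    return chip_price_usd * cluster_mult / lifetime_years


def blended(upfront, cloud, upfront_weight=0.5):
    """Weighted blend of the upfront-amortized and cloud-rental rates."""
    return upfront_weight * upfront + (1.0 - upfront_weight) * cloud


def cloud_premium(upfront, cloud):
    """Fractional premium of cloud rental over upfront-amortized cost."""
    return cloud / upfront - 1.0


# USD per H100e-year, copied from the base-scenario table above.
TABLE = {
    2024: {"upfront": 13200.0, "cloud": 15000.0},
    2040: {"upfront": 4550.0, "cloud": 5000.0},
}

for year, row in TABLE.items():
    mix = blended(row["upfront"], row["cloud"])
    premium = cloud_premium(row["upfront"], row["cloud"])
    print(f"{year}: blended={mix:,.0f} USD, cloud premium={premium:.1%}")

# Hypothetical inputs, for illustration of the formula only.
example_upfront = upfront_amortized(chip_price_usd=30000.0, cluster_mult=2.2, lifetime_years=5.0)
print(f"example upfront-amortized: {example_upfront:,.0f} USD per H100e-year")
```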
7. Sensitivity (one-parameter perturbations of base case)
Outputs: outputs/charts/supply_sensitivity_bands.png and outputs/tables/supply_sensitivity_analysis.csv. Each parameter is scaled by {0.5, 0.75, 1.0, 1.5, 2.0} for all years, holding everything else at base.
| Perturbation | 2040 usable compute multiplier vs base |
|---|---|
| Shipments × 0.5 | ~0.55× (chip becomes binding earlier) |
| Shipments × 2.0 | ~1.5× (capex binds harder, less than 2×) |
| AI-DC MW × 0.5 | ~0.85× (slack at base; halving it makes the power/DC limit bind by 2040) |
| AI-DC MW × 2.0 | ~1.0× (no effect; power was slack) |
| Capex × 0.5 | ~0.55× (capex was already binding) |
| Capex × 2.0 | ~1.4× (chip starts binding earlier) |
Implication for the allocation layer: capex and shipments are the two highest-leverage parameters for the base case. AI-DC MW is slack under base assumptions and only matters when another change pushes it into binding.
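
The asymmetric responses in the table (halving a binding input roughly halves output, but doubling it yields well under 2×) follow from the min-of-limits structure. The toy sketch below shows the mechanism for a single year with normalized, hypothetical limits; it is not the sensitivity pipeline and will not reproduce the table's multi-year multipliers.

```python
MULTIPLIERS = (0.5, 0.75, 1.0, 1.5, 2.0)


def binding_stock(limits):
    """Available stock is the smallest of the independent limits."""
    return min(limits.values())


def sweep(base_limits, parameter):
    """Scale one limit by each multiplier; return output relative to base."""
    base = binding_stock(base_limits)
    results = {}
    for m in MULTIPLIERS:
        perturbed = dict(base_limits)
        perturbed[parameter] *= m
        results[m] = binding_stock(perturbed) / base
    return results


# Hypothetical single-year limits, normalized so the binding one (capex) is 1.0.
toy_limits = {"chip": 1.2, "capex": 1.0, "power": 3.0}

for name in toy_limits:
    print(name, sweep(toy_limits, name))
```

In this toy case, doubling capex only lifts output to the next limit (chips at 1.2×), and scaling power has no effect at any multiplier because it stays slack throughout.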
8. Key uncertainties
- Single-run vs aggregate compute confusion (most important). The historical baseline measures single training-run FLOP for a frontier model. The supply capacity model measures total annual usable global AI compute. Comparing them directly is not apples-to-apples. The allocation layer must explicitly model the share of total annual compute consumed by the largest single run.
- Hyperscaler capex AI-share bias. We treat ~75% of hyperscaler capex as AI infrastructure for 2025+. The actual share is debated; if it is 60% the base CAGR drops by ~3 percentage points.
- H100-equivalent unit definition. We've used FP16/BF16 dense peak FLOP/s as the conversion. For training-relevant workloads the memory-bandwidth-corrected H100-eq is ~10–15% lower per chip; for MoE inference it is higher. We use the dense definition throughout.
- 2030 stock estimate width. Epoch's 2030 H100-eq for training range is 20M–400M (20× spread). We anchor on the 100M median, and the model is approximately linear in this input.
- AI-DC MW coverage gap. IEA gives total DC TWh; Epoch gives US AI-DC GW. Implied global AI-DC for 2024 is ~10–15 GW with wide uncertainty bands. We anchor at 12 GW.
- Lifetime is scenario-dependent. A swing from 5 to 6 years reduces the shipments required to maintain the same stock by ~17%; we keep this scenario-keyed.
- No geographic split. Global aggregate only; US, China, EU, RoW are intermixed. Power and capex are highly geographically concentrated and a regional split would tighten constraint identification materially.
- No quarterly grain. 2024 and 2025 are now fully observed and a quarterly calibration would tighten the modern-window levels.
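
The ~17% figure in the lifetime bullet follows from steady-state replacement arithmetic: holding stock constant, annual replacement shipments equal stock divided by lifetime. The sketch below assumes that steady-state simplification, which may differ from the model's linear-retirement schedule during ramp-up; the function names are illustrative.

```python
def steady_state_shipments(stock, lifetime_years):
    """Annual shipments needed to hold `stock` constant under uniform retirement."""
    return stock / lifetime_years


def shipment_reduction(old_lifetime, new_lifetime, stock=1.0):
    """Fractional drop in required shipments when lifetime lengthens."""
    old = steady_state_shipments(stock, old_lifetime)
    new = steady_state_shipments(stock, new_lifetime)
    return 1.0 - new / old


reduction = shipment_reduction(5.0, 6.0)
print(f"5y -> 6y lifetime cuts replacement shipments by {reduction:.1%}")
```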
9. Recommended allocation-layer handoff parameters
These are the explicit handoff parameters for the allocation layer (allocation across training, inference, AI R&D, post-training, reserves).
=== Annual usable compute capacity envelope (FLOP / year) ===
Year Base Capex-rich Chip-bot Power/DC-bot
2024 3.97e+28 4.37e+28 3.83e+28 3.50e+28
2025 1.05e+29 1.18e+29 8.86e+28 8.92e+28
2026 2.08e+29 2.69e+29 1.67e+29 1.79e+29
2027 3.55e+29 5.04e+29 2.78e+29 3.04e+29
2030 1.35e+30 2.46e+30 7.05e+29 1.04e+30
2035 4.93e+30 8.69e+30 2.15e+30 2.39e+30
2040 1.65e+31 2.89e+31 6.54e+30 6.64e+30
Read full CSV: outputs/tables/supply_fundamental_inputs_by_year.csv
=== Recommended allocation-layer base-case envelope ===
Use base_input_case as the central case.
Use chip_bottleneck as the slow / pessimistic floor.
Use capex_rich as the fast / optimistic ceiling.
Use power_datacenter_bottleneck as the alternative-bottleneck stress.
=== Available H100-eq stock by year (units, base case) ===
2024: 3.18e+06
2025: 8.62e+06
2026: 1.81e+07
2027: 3.07e+07
2030: 1.18e+08 ← anchored to Epoch median 100M (range 20M–400M)
2035: 4.30e+08
2040: 9.59e+08
=== Capex required by year (USD/yr, base case) ===
2024: 2.10e+11
2025: 3.50e+11
2026: 5.00e+11
2030: 1.50e+12
2040: 3.00e+12
Cumulative 2024-2040: ~$22T
=== Power capacity by year (MW, AI-dedicated, base case) ===
2024: 12,000
2025: 22,000
2026: 38,000
2030: 80,000 ← anchored to IEA + Epoch
2040: 300,000
=== Binding constraint by year (base case) ===
2024-2036: capex
2037-2040: chip
=== Cost per H100-eq accelerator-year (base, 2024 / 2040) ===
upfront-amortized: $13,200 → $4,550
cloud-rental: $15,000 → $5,000
blended: $14,100 → $4,775
(All three trajectories should be carried forward into the allocation layer — the
divergence is a real economic signal per historical-baseline findings.)
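
A consumer of the handoff block may want the growth rate implied between envelope rows, since the rows are unevenly spaced. The sketch below is one way to derive it from the base-case column; `annualized_growth` is a hypothetical helper, and it assumes constant compound growth within each interval.

```python
# Base-case usable compute envelope in FLOP/yr, copied from the handoff block.
BASE_ENVELOPE = {
    2024: 3.97e28,
    2025: 1.05e29,
    2026: 2.08e29,
    2027: 3.55e29,
    2030: 1.35e30,
    2035: 4.93e30,
    2040: 1.65e31,
}


def annualized_growth(envelope):
    """Return (start_year, end_year, annual_rate) for each consecutive pair of rows."""
    years = sorted(envelope)
    rates = []
    for start, end in zip(years, years[1:]):
        multiple = envelope[end] / envelope[start]
        rate = multiple ** (1.0 / (end - start)) - 1.0
        rates.append((start, end, rate))
    return rates


for start, end, rate in annualized_growth(BASE_ENVELOPE):
    print(f"{start}-{end}: {rate:6.1%}/yr")
```

On the base-case numbers the implied rate falls in every successive interval, so the headline 2024–2040 CAGR averages over fast early growth and slower late growth.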
Known weaknesses (carry forward to allocation-layer documentation)
- Single-frontier-run share of total compute is the largest unmodeled quantity. The allocation layer must address this.
- 2030 H100-eq stock estimate has 20× spread between Epoch lower and upper bounds.
- ~18% of assumption rows are still confidence=low, mostly the 2040 long-horizon values, which are extrapolations.
- No geographic structure.
- Cumulative-capex limit treatment is simplified (we sum annual flows; depreciation / write-downs / refinancing are not modeled).
10. Open questions for the allocation layer
- What share of usable compute goes to a single largest training run by year? 2018: probably ~15% of frontier-lab compute. 2024: probably 5–10% globally. The allocation layer should make this explicit and variable.
- Training vs inference allocation. The allocation layer needs a split, ideally parameterized: base case probably ~35% training, ~55% inference, ~10% reserves/post-training as of 2024, with the inference share growing.
- AI R&D experiments share. This is the wedge that grows fastest if recursive self-improvement is real; the allocation layer should isolate it.
- Reserved capacity and cluster fragmentation. Not all H100-eq stock is fungible: reserved capacity, geographic fragmentation, and lab-specific pools all shrink the effectively-allocatable pool below the headline figure.
- Allocation under a binding constraint. When chips bind, who gets them? Hyperscalers, sovereign programs, frontier labs, and inference cloud customers all compete for the same pool. This is an allocation-layer question.
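
One possible shape for the parameterized split raised in the questions above is sketched here. It is a hypothetical interface, not a design the allocation layer has adopted: the class and function names are invented for illustration, and the shares are the tentative 2024 guesses from the list, applied to the 2024 base-case envelope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllocationSplit:
    """Shares of usable compute by use; must sum to 1."""
    training: float
    inference: float
    reserves_post_training: float

    def __post_init__(self):
        total = self.training + self.inference + self.reserves_post_training
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"shares must sum to 1, got {total}")

    def apply(self, usable_flop_per_year):
        """Split an annual usable-compute figure into per-use FLOP/yr."""
        return {
            "training": usable_flop_per_year * self.training,
            "inference": usable_flop_per_year * self.inference,
            "reserves_post_training": usable_flop_per_year * self.reserves_post_training,
        }


def largest_run_flop(usable_flop_per_year, share):
    """FLOP available to the single largest run, given its share of usable compute."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must lie in [0, 1]")
    return usable_flop_per_year * share


USABLE_2024_BASE = 3.97e28  # FLOP/yr, base case, from the handoff envelope
split_2024 = AllocationSplit(training=0.35, inference=0.55, reserves_post_training=0.10)

allocation = split_2024.apply(USABLE_2024_BASE)
for use, flop in allocation.items():
    print(f"{use:24s} {flop:.3g} FLOP/yr")
```

Keeping the shares in a validated, immutable object makes it straightforward to vary them by year or scenario without silently producing allocations that exceed the envelope.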
Appendix: deliverable checklist
| Spec deliverable | File | Status |
|---|---|---|
| Input assumptions file | data/assumptions/supply_input_assumptions.yaml | ✓ (173 rows, sourced) |
| Fundamental input model | model/supply_engine.py | ✓ |
| Scenario configs (4) | scenarios/supply_*.yaml | ✓ |
| Notebook / driver | pipelines/supply.py | ✓ |
| Accelerator stock chart | outputs/charts/supply_accelerator_stock_h100e.png | ✓ |
| Theoretical compute chart | outputs/charts/supply_theoretical_compute_capacity.png | ✓ |
| Usable compute chart | outputs/charts/supply_usable_compute_capacity.png | ✓ |
| Power capacity constraint chart | outputs/charts/supply_power_capacity_constraint.png | ✓ (with DC differentiation) |
| Capex required chart | outputs/charts/supply_capex_required.png | ✓ |
| Binding-constraint heatmap | outputs/charts/supply_binding_constraint_by_year.png | ✓ |
| Supply vs historical-baseline chart | outputs/charts/supply_vs_historical_compute_trend.png | ✓ |
| Cost-per-H100e by variant | outputs/charts/supply_cost_per_h100e_by_variant.png | ✓ (bonus) |
| Sensitivity bands | outputs/charts/supply_sensitivity_bands.png | ✓ (bonus) |
| Year-by-year fundamentals | outputs/tables/supply_fundamental_inputs_by_year.csv | ✓ |
| Scenario summary | outputs/tables/supply_scenario_summary.csv | ✓ |
| Binding constraints | outputs/tables/supply_binding_constraints.csv | ✓ |
| Capex requirements | outputs/tables/supply_capex_requirements.csv | ✓ |
| Sensitivity analysis | outputs/tables/supply_sensitivity_analysis.csv | ✓ (bonus) |
| Supply capacity model scope | docs/scope.md | ✓ |
| Supply capacity model initial notes | docs/supply_initial_notes.md | ✓ (sprint 1) |
| Supply capacity model findings memo | docs/supply_findings.md | ✓ (this file) |