Glossary¶
Definitions of core terms used throughout this project. Where two quantities are similar but not the same, the entries explicitly call out the distinction.
Frontier training run¶
A single major training run for a frontier AI model — for example, the training run that produced GPT-4, Claude 3 Opus, or Gemini 1.5 Pro. The historical baseline tracks one row per such run from Epoch's "Notable AI Models" dataset.
This is not the same as total annual AI compute. A frontier run consumes only a share of the global AI compute available in its release year.
Frontier training compute¶
The total floating-point operations performed during a single frontier training run. Stored in absolute FLOP. Typical 2024 frontier runs are in the 1e25–1e26 FLOP range.
This is what the historical baseline measures via training_compute_flop.
It is not the same as usable_compute_flop_year (which is the
supply-side annual total).
Total usable AI compute¶
The total amount of AI compute available globally in one year, after accounting for chip stock, power, data-center capacity, capex, and utilization. The supply-capacity model's headline output.
Measured in FLOP per year. The 2024 base-scenario value is roughly 3.97e+28 FLOP/year; the 2040 base value is 1.65e+31 FLOP/year.
This is not the same as one frontier training run's FLOP. A frontier run consumes some share (currently a small fraction) of one year's total usable compute.
Theoretical compute¶
The compute capacity implied by raw chip count × peak FLOP/s × seconds
per year, before applying constraints (power, DC, capex) or utilization
derating. The supply model emits theoretical_compute_flop_year as an
intermediate quantity.
Effective compute¶
A future quantity (not yet implemented). Effective compute will be raw FLOP adjusted upward for algorithmic-efficiency gains: each generation of architectures, training recipes, and post-training improvements lets a fixed FLOP budget produce a more capable model. Without an effective-compute layer, two runs at the same FLOP count in different years are treated as equivalent — which is wrong.
H100-equivalent¶
A common unit for accelerator counts. One H100-equivalent is one NVIDIA H100 GPU's worth of FP16/BF16 dense FLOP/s (roughly 989 TFLOP/s). Used to normalize across hardware generations and vendors: B200 chips count as more than 1 H100-equivalent each; older A100s count as less.
The supply model tracks h100_equivalent_shipments and
installed_stock_h100e rather than physical chip counts because the
H100-equivalent unit absorbs perf-per-chip improvements naturally.
FLOP/year¶
The rate of floating-point operations per calendar year. The natural
unit for total annual compute, including the supply-capacity model's
usable_compute_flop_year and theoretical_compute_flop_year.
This is not the same as FLOP-per-training-run, which has no time dimension. A 1e25-FLOP training run might use 1e25 FLOP total over a multi-month training period; a 1e29-FLOP/year fleet does that much work continuously across all running jobs.
Training compute¶
Compute consumed during pretraining (and fine-tuning, in some
accountings) of a model. The historical baseline's primary measurement.
The allocation layer will need to split annual usable_compute_flop_year
into training vs inference vs other uses.
Inference compute¶
Compute consumed serving a trained model in production. Roughly the opposite allocation slice from training compute. Currently inference-vs-training split is unmodeled; the allocation layer will introduce it.
AI R&D experiment compute¶
Compute consumed by experiments that aren't a single canonical training run: ablations, evals, safety testing, scaling-law experiments, post-training-recipe sweeps. Often ~10–30% of frontier-lab compute budgets historically; the allocation layer will track this.
Post-training compute¶
Compute consumed in RLHF, RLAIF, fine-tuning passes, and other processing steps applied to a base model. Sometimes counted under training, sometimes broken out. The allocation layer should keep this as a separate slice.
Capex¶
Capital expenditure — the dollars spent on physical AI infrastructure
(chips, servers, networking, data-center shells, power infrastructure).
The supply model treats annual ai_infrastructure_capex_usd as a
constraint that competes with chips and power for binding status.
The supply-model base case has 2024 capex at ~$210B and 2030 at ~$1.5T/year (global, AI-dedicated).
Capex constraint¶
The capex-limited stock = cumulative capex flow ÷ installed-cost-per-H100e. Asks: "if all available capex were spent on chips and clusters, how many H100-equivalents could that buy?" Binds early in the supply projection horizon (2024–2036 in the base case).
Chip constraint¶
The chip-limited stock = installed H100-eq stock with linear retirement. Asks: "ignoring all other limits, how many H100-eq exist on earth?" Binds late in the base supply scenario (2037–2040) and throughout the chip-bottleneck scenario.
Power constraint¶
The power-limited stock = AI-dedicated MW × AI share / per-chip effective power draw (chip × server overhead × PUE). Asks: "given the AI-DC power capacity, how many H100-eq can it support?" Does not bind in our base scenarios — power efficiency gains keep up with stock growth.
Data-center constraint¶
The DC-limited stock = AI-dedicated MW × dc_packing_efficiency /
per-chip effective power. Conceptually distinct from power because it
captures cooling, slot density, transformer/switchgear, and permitting
slack — not just raw grid-power MW. Binds throughout the
power_datacenter_bottleneck scenario.
Honest caveat: in the current model, both power and DC constraints are derived from the same
ai_dc_mwinput scaled by different multipliers. They are conceptually distinct but mathematically coupled. A future refactor could split them into truly independent inputs (grid_mw_available_for_aivscommissioned_dc_slot_mw); seedocs/supply_findings.md§8.
Binding constraint¶
The constraint that's actually limiting installed stock in a given year:
argmin(chip_limit, power_limit, dc_limit, capex_limit). The supply
model emits binding_constraint as a categorical column per year per
scenario. The model's base case has capex binding 2024–2036, then chip.
Allocation layer¶
The next component to be built. Splits annual usable_compute_flop_year
across training / inference / AI R&D / post-training / reserves, and
produces largest_frontier_run_flop_by_year as the bridge between
total annual compute and single-training-run compute.
Largest-run concentration¶
The fraction of training compute (or of total usable compute) that goes to the single largest training run in a year. A small but critical parameter: changing it from 5% to 15% changes the implied 2030 frontier-run FLOP by ~3×. The allocation layer will model this explicitly.
Task horizon¶
A future quantity. The longest task duration (in human-hours) at which an AI system performs reliably. Capability-mapping research from METR and similar groups expresses progress as a doubling time of the task horizon. Currently unmodeled in this project; the capability layer will introduce it.