Glossary

Definitions of core terms used throughout this project. Where two quantities are similar but not the same, the entries explicitly call out the distinction.


Frontier training run

A single major training run for a frontier AI model — for example, the training run that produced GPT-4, Claude 3 Opus, or Gemini 1.5 Pro. The historical baseline tracks one row per such run from Epoch's "Notable AI Models" dataset.

This is not the same as total annual AI compute. A frontier run consumes only a share of the global AI compute available in its release year.

Frontier training compute

The total floating-point operations performed during a single frontier training run. Stored in absolute FLOP. Typical 2024 frontier runs are in the 1e25–1e26 FLOP range.

This is what the historical baseline measures via training_compute_flop. It is not the same as usable_compute_flop_year (which is the supply-side annual total).

Total usable AI compute

The total amount of AI compute available globally in one year, after accounting for chip stock, power, data-center capacity, capex, and utilization. The supply-capacity model's headline output.

Measured in FLOP per year. The base-scenario value is roughly 3.97e28 FLOP/year in 2024 and roughly 1.65e31 FLOP/year in 2040.
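As a quick sanity check on those two endpoints, the sketch below computes the growth multiple and the constant annual growth rate that would connect them. It assumes smooth compound growth between 2024 and 2040, which the supply model itself may not follow; the variable names are illustrative, not model columns.

```python
# Base-scenario endpoints quoted in this entry (FLOP/year).
base_2024_flop_year = 3.97e28
base_2040_flop_year = 1.65e31

years = 2040 - 2024

# Total growth over the projection horizon.
growth_multiple = base_2040_flop_year / base_2024_flop_year

# Constant annual growth rate that would connect the two endpoints
# (an assumption: the model's actual year-by-year path may be uneven).
implied_annual_growth = growth_multiple ** (1 / years) - 1

print(f"growth multiple: {growth_multiple:.0f}x")
print(f"implied annual growth: {implied_annual_growth:.1%}")
```

Under that smoothing assumption, the endpoints imply growth of a bit over 400× in total, or somewhat under 50% per year.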

This is not the same as one frontier training run's FLOP. A frontier run consumes some share (currently a small fraction) of one year's total usable compute.
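To make "a small fraction" concrete, the following sketch divides the typical 2024 frontier-run range from this glossary by the 2024 base-scenario total. The variable names are illustrative; only the numbers come from the entries above.

```python
# Figures quoted in this glossary; variable names are illustrative.
usable_compute_flop_year_2024 = 3.97e28  # base-scenario total, FLOP/year
frontier_run_flop_low = 1e25             # typical 2024 frontier run, low end
frontier_run_flop_high = 1e26            # typical 2024 frontier run, high end

# Share of one year's usable compute consumed by a single run.
share_low = frontier_run_flop_low / usable_compute_flop_year_2024
share_high = frontier_run_flop_high / usable_compute_flop_year_2024

print(f"{share_low:.3%} to {share_high:.3%} of one year's usable compute")
```

On these figures a single frontier run accounts for well under 1% of the year's total.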

Theoretical compute

The compute capacity implied by raw chip count × peak FLOP/s × seconds per year, before applying constraints (power, DC, capex) or utilization derating. The supply model emits theoretical_compute_flop_year as an intermediate quantity.
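The product above can be sketched as follows. This is a minimal illustration rather than the supply model's code: the function name mirrors the theoretical_compute_flop_year column but is hypothetical, the chip count is an arbitrary example, and a 365-day year is assumed.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000; assumes a 365-day year


def theoretical_compute_flop_year(chip_count: float, peak_flop_per_s: float) -> float:
    """Raw capacity before any constraint or utilization derating."""
    return chip_count * peak_flop_per_s * SECONDS_PER_YEAR


# Example: one million H100-equivalents at roughly 989 TFLOP/s each.
H100_PEAK_FLOP_PER_S = 989e12
one_million_h100e = theoretical_compute_flop_year(1_000_000, H100_PEAK_FLOP_PER_S)

print(f"{one_million_h100e:.2e} FLOP/year")
```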

Effective compute

A future quantity (not yet implemented). Effective compute will be raw FLOP adjusted upward for algorithmic-efficiency gains: each generation of architectures, training recipes, and post-training improvements lets a fixed FLOP budget produce a more capable model. Without an effective-compute layer, two runs at the same FLOP count in different years are treated as equivalent — which is wrong.
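Since this layer is not implemented, the following is only a sketch of one possible form: raw FLOP multiplied by an efficiency factor that doubles every fixed number of years. The function, its parameters, and the 2-year doubling time are all placeholders chosen for illustration, not project estimates.

```python
def effective_compute_flop(
    raw_flop: float,
    year: int,
    reference_year: int,
    efficiency_doubling_years: float,
) -> float:
    """Raw FLOP scaled by a hypothetical algorithmic-efficiency multiplier."""
    multiplier = 2.0 ** ((year - reference_year) / efficiency_doubling_years)
    return raw_flop * multiplier


# Two runs with identical raw FLOP in different years.
# The 2.0-year doubling time is a placeholder, not an estimate.
same_raw_flop = 1e25
early = effective_compute_flop(same_raw_flop, 2024, 2024, efficiency_doubling_years=2.0)
later = effective_compute_flop(same_raw_flop, 2026, 2024, efficiency_doubling_years=2.0)

print(f"2024 run: {early:.1e}, 2026 run: {later:.1e} effective FLOP")
```

Under any positive doubling time, the later run counts for more effective compute than the earlier one, which is the distinction a raw-FLOP comparison misses.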

H100-equivalent

A common unit for accelerator counts. One H100-equivalent is one NVIDIA H100 GPU's worth of FP16/BF16 dense FLOP/s (roughly 989 TFLOP/s). Used to normalize across hardware generations and vendors: B200 chips count as more than 1 H100-equivalent each; older A100s count as less.

The supply model tracks h100_equivalent_shipments and installed_stock_h100e rather than physical chip counts because the H100-equivalent unit absorbs perf-per-chip improvements naturally.
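A minimal sketch of the normalization, assuming conversion is a simple ratio of dense FP16/BF16 throughput. The helper name is hypothetical; the A100 figure (312 TFLOP/s dense BF16) is the published spec, and the "newer chip" is a made-up part with twice the H100's throughput.

```python
H100_DENSE_FLOP_PER_S = 989e12  # FP16/BF16 dense, as quoted in this entry


def h100_equivalents(chip_count: float, chip_peak_flop_per_s: float) -> float:
    """Convert a physical chip count into H100-equivalents by throughput ratio."""
    return chip_count * chip_peak_flop_per_s / H100_DENSE_FLOP_PER_S


# 1,000 A100s at 312 TFLOP/s dense BF16 each.
a100_fleet_h100e = h100_equivalents(1_000, 312e12)

# 1,000 units of a hypothetical newer chip at twice the H100's throughput.
newer_fleet_h100e = h100_equivalents(1_000, 2 * H100_DENSE_FLOP_PER_S)

print(f"A100 fleet: {a100_fleet_h100e:.0f} H100e")
print(f"newer fleet: {newer_fleet_h100e:.0f} H100e")
```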

FLOP/year

The rate of floating-point operations per calendar year. The natural unit for total annual compute, including the supply-capacity model's usable_compute_flop_year and theoretical_compute_flop_year.

This is not the same as FLOP per training run, which is a total with no time dimension. A 1e25-FLOP training run consumes 1e25 FLOP in total, spread over a multi-month training period; a 1e29-FLOP/year fleet delivers 1e29 FLOP over a full year, shared across every job running on it.
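To compare a run total against an annual rate, the run first has to be converted into a rate. The sketch below does that for the 1e25-FLOP run and 1e29-FLOP/year fleet mentioned here; the 100-day run length is an assumed example value, and the function name is hypothetical.

```python
DAYS_PER_YEAR = 365


def annualized_rate_flop_year(run_flop: float, run_days: float) -> float:
    """Average rate a run draws while it is active, expressed in FLOP/year."""
    return run_flop * DAYS_PER_YEAR / run_days


run_flop = 1e25          # total FLOP for the run (no time dimension)
run_days = 100           # assumed run length, for illustration only
fleet_flop_year = 1e29   # fleet capacity as an annual rate

run_rate = annualized_rate_flop_year(run_flop, run_days)
fleet_share_while_running = run_rate / fleet_flop_year

print(f"run draws {run_rate:.2e} FLOP/year while active")
print(f"that is {fleet_share_while_running:.3%} of the fleet")
```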

Training compute

Compute consumed during pretraining (and fine-tuning, in some accountings) of a model. The historical baseline's primary measurement. The allocation layer will need to split annual usable_compute_flop_year into training vs inference vs other uses.

Inference compute

Compute consumed serving a trained model in production. It is the counterpart allocation slice to training compute. The inference-vs-training split is currently unmodeled; the allocation layer will introduce it.

AI R&D experiment compute

Compute consumed by experiments that aren't a single canonical training run: ablations, evals, safety testing, scaling-law experiments, post-training-recipe sweeps. Historically this has often been ~10–30% of frontier-lab compute budgets; the allocation layer will track it.

Post-training compute

Compute consumed in RLHF, RLAIF, fine-tuning passes, and other processing steps applied to a base model. Sometimes counted under training, sometimes broken out. The allocation layer should keep this as a separate slice.

Capex

Capital expenditure — the dollars spent on physical AI infrastructure (chips, servers, networking, data-center shells, power infrastructure). The supply model treats annual ai_infrastructure_capex_usd as a constraint that competes with chips and power for binding status.

The supply-model base case has capex at ~$210B/year in 2024 and ~$1.5T/year in 2030 (global, AI-dedicated).

Capex constraint

The capex-limited stock = cumulative capex flow ÷ installed-cost-per-H100e. Asks: "if all available capex were spent on chips and clusters, how many H100-equivalents could that buy?" Binds early in the supply projection horizon (2024–2036 in the base case).
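The division above can be sketched as follows. The function name and the $50,000 installed cost per H100e are assumptions made for illustration; only the ~$210B 2024 capex figure comes from this glossary.

```python
def capex_limited_stock_h100e(
    annual_capex_usd: list[float],
    installed_cost_per_h100e_usd: float,
) -> float:
    """H100-equivalents affordable if all cumulative capex bought chips and clusters."""
    return sum(annual_capex_usd) / installed_cost_per_h100e_usd


# One year of capex at the 2024 base-case level, with a hypothetical
# all-in installed cost of $50,000 per H100e (placeholder, not a model input).
stock_2024 = capex_limited_stock_h100e([210e9], installed_cost_per_h100e_usd=50_000)

print(f"{stock_2024:,.0f} H100e")
```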

Chip constraint

The chip-limited stock = installed H100e stock with linear retirement. Asks: "ignoring all other limits, how many H100-equivalents exist on earth?" Binds late in the base supply scenario (2037–2040) and throughout the chip-bottleneck scenario.
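One possible reading of "linear retirement" is that each year's shipment cohort shrinks by a constant fraction of its original size per year until it is fully retired. The sketch below implements that reading only; the function name, the 5-year lifetime, and the shipment figures are hypothetical, and the supply model's actual retirement rule may differ.

```python
def chip_limited_stock_h100e(
    shipments_by_year: dict[int, float],
    year: int,
    lifetime_years: float,
) -> float:
    """Installed stock in `year`, with each cohort retiring linearly over its lifetime."""
    stock = 0.0
    for ship_year, shipped_h100e in shipments_by_year.items():
        age = year - ship_year
        if age < 0:
            continue  # cohort has not shipped yet
        surviving_fraction = max(0.0, 1.0 - age / lifetime_years)
        stock += shipped_h100e * surviving_fraction
    return stock


# Hypothetical shipments (H100e) and lifetime, for illustration only.
shipments = {2024: 1.0e6, 2025: 2.0e6}
stock_2026 = chip_limited_stock_h100e(shipments, year=2026, lifetime_years=5)

print(f"{stock_2026:,.0f} H100e installed in 2026")
```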

Power constraint

The power-limited stock = AI-dedicated MW × AI share / per-chip effective power draw (chip × server overhead × PUE). Asks: "given the AI-DC power capacity, how many H100-equivalents can it support?" Does not bind in our base scenarios, because power-efficiency gains keep up with stock growth.
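The formula above, with units made explicit, might look like this. The function and parameter names are illustrative, and every input value is a placeholder except the 700 W chip draw, which matches the H100 SXM's rated TDP.

```python
def power_limited_stock_h100e(
    ai_dc_mw: float,
    ai_share: float,
    chip_watts: float,
    server_overhead: float,
    pue: float,
) -> float:
    """H100-equivalents supportable by the available AI data-center power."""
    effective_watts_per_chip = chip_watts * server_overhead * pue
    available_watts = ai_dc_mw * 1e6 * ai_share  # MW -> W
    return available_watts / effective_watts_per_chip


# Placeholder inputs: 10 GW of capacity, 80% AI share, 700 W chips,
# 1.5x server overhead, PUE of 1.2.
stock = power_limited_stock_h100e(
    ai_dc_mw=10_000, ai_share=0.8, chip_watts=700.0, server_overhead=1.5, pue=1.2
)

print(f"{stock:,.0f} H100e")
```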

Data-center constraint

The DC-limited stock = AI-dedicated MW × dc_packing_efficiency / per-chip effective power. Conceptually distinct from power because it captures cooling, slot density, transformer/switchgear, and permitting slack — not just raw grid-power MW. Binds throughout the power_datacenter_bottleneck scenario.

Honest caveat: in the current model, both power and DC constraints are derived from the same ai_dc_mw input scaled by different multipliers. They are conceptually distinct but mathematically coupled. A future refactor could split them into truly independent inputs (grid_mw_available_for_ai vs commissioned_dc_slot_mw); see docs/supply_findings.md §8.
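The coupling can be seen numerically: when both limits scale the same ai_dc_mw input, their ratio is fixed by the two multipliers and never depends on the MW value. The sketch below assumes the formulas given in the two entries above; the function names and the multiplier values (0.8 and 0.7) are hypothetical.

```python
def power_limit(ai_dc_mw: float, ai_share: float, effective_watts_per_chip: float) -> float:
    """Power-limited stock: shared MW input scaled by the AI share."""
    return ai_dc_mw * 1e6 * ai_share / effective_watts_per_chip


def dc_limit(ai_dc_mw: float, dc_packing_efficiency: float, effective_watts_per_chip: float) -> float:
    """DC-limited stock: the same MW input scaled by packing efficiency."""
    return ai_dc_mw * 1e6 * dc_packing_efficiency / effective_watts_per_chip


# Placeholder multipliers and per-chip draw; only ai_dc_mw varies.
ratios = [
    power_limit(mw, 0.8, 1260.0) / dc_limit(mw, 0.7, 1260.0)
    for mw in (1_000, 10_000, 100_000)
]

print(ratios)  # the same ratio at every MW level
```

With independent inputs, as the proposed refactor suggests, this ratio could move from year to year and either constraint could become the tighter one.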

Binding constraint

The constraint that's actually limiting installed stock in a given year: argmin(chip_limit, power_limit, dc_limit, capex_limit). The supply model emits binding_constraint as a categorical column per year per scenario. The model's base case has capex binding in 2024–2036, then chip in 2037–2040.
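A minimal sketch of the argmin selection, assuming the four limits are already expressed in H100-equivalents for a given year. The helper name and the limit values are hypothetical.

```python
def binding_constraint(limits_h100e: dict[str, float]) -> tuple[str, float]:
    """Return the name of the tightest limit and the installed stock it implies."""
    name = min(limits_h100e, key=limits_h100e.get)
    return name, limits_h100e[name]


# Hypothetical limits for one year, in H100-equivalents.
limits = {"chip": 9.0e6, "power": 1.2e7, "dc": 1.0e7, "capex": 6.5e6}
name, installed_stock = binding_constraint(limits)

print(f"binding: {name}, installed stock: {installed_stock:,.0f} H100e")
```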

Allocation layer

The next component to be built. Splits annual usable_compute_flop_year across training / inference / AI R&D / post-training / reserves, and produces largest_frontier_run_flop_by_year as the bridge between total annual compute and single-training-run compute.
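Because this layer does not exist yet, the following is only a sketch of what a share-based split could look like. The function name, the slice keys, and every share value are placeholders; only the 3.97e28 FLOP/year total is a figure from this glossary.

```python
def allocate(usable_compute_flop_year: float, shares: dict[str, float]) -> dict[str, float]:
    """Split annual usable compute across uses; shares must sum to 1."""
    total_share = sum(shares.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError(f"shares sum to {total_share}, expected 1.0")
    return {use: usable_compute_flop_year * share for use, share in shares.items()}


# Placeholder shares for illustration only, not project estimates.
shares = {
    "training": 0.40,
    "inference": 0.30,
    "ai_rnd": 0.15,
    "post_training": 0.10,
    "reserves": 0.05,
}
slices = allocate(3.97e28, shares)

for use, flop_year in slices.items():
    print(f"{use:>14}: {flop_year:.2e} FLOP/year")
```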

Largest-run concentration

The fraction of training compute (or of total usable compute) that goes to the single largest training run in a year. A small but critical parameter: the implied frontier-run size scales linearly with it, so changing it from 5% to 15% changes the implied 2030 frontier-run FLOP by ~3×. The allocation layer will model this explicitly.
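The sensitivity can be checked with a one-line model in which the largest run is simply the training pool times the concentration. The function name echoes largest_frontier_run_flop_by_year but is hypothetical, and the 1e29 FLOP/year training pool is a placeholder.

```python
def largest_frontier_run_flop(training_flop_year: float, concentration: float) -> float:
    """Largest single run implied by a training pool and a concentration share."""
    return training_flop_year * concentration


training_pool = 1e29  # placeholder annual training compute, FLOP/year

run_at_5pct = largest_frontier_run_flop(training_pool, 0.05)
run_at_15pct = largest_frontier_run_flop(training_pool, 0.15)

print(f"5%:  {run_at_5pct:.1e} FLOP")
print(f"15%: {run_at_15pct:.1e} FLOP")
print(f"ratio: {run_at_15pct / run_at_5pct:.1f}x")
```

The ratio is the same for any pool size, so the uncertainty in this one parameter passes straight through to the frontier-run estimate.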

Task horizon

A future quantity. The longest task duration (in human-hours) at which an AI system performs reliably. Capability-mapping research from METR and similar groups expresses progress as a doubling time of the task horizon. Currently unmodeled in this project; the capability layer will introduce it.
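A doubling-time framing implies exponential growth of the horizon. The sketch below shows that arithmetic only; the function name, the starting horizon, and the 6-month doubling time are placeholders and are not taken from METR or from this project.

```python
def task_horizon_hours(
    initial_hours: float,
    months_elapsed: float,
    doubling_months: float,
) -> float:
    """Task horizon after `months_elapsed`, given a constant doubling time."""
    return initial_hours * 2.0 ** (months_elapsed / doubling_months)


# Placeholder inputs: a 1-hour horizon doubling every 6 months.
horizon_after_two_years = task_horizon_hours(1.0, months_elapsed=24, doubling_months=6)

print(f"{horizon_after_two_years:.0f} human-hours after 24 months")
```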