C7 — How has the applicant pool changed across cohorts?

Context

Beyond gender/race (C6), how has the applicant pool changed across cohorts? This analysis tracks pool size, education level, and currently-enrolled status across 6.0 → 10.0. These shifts matter because changes in downstream outcomes (ranking rates, mentor evals, publications) are partly explained by who's applying — not just by how we're selecting.

MATS (Machine Alignment, Transparency & Security) is an AI safety research fellowship that places ~120 fellows with ~100 mentors per cohort. Cohort 10.0 ran in summer 2026 and was the first cohort with a centralized application review instead of decentralized stream-specific review. This analysis is part of a broader effort to evaluate the 10.0 process and inform the design of 11.0 (autumn 2026).

How the 10.0 selection pipeline worked

The 10.0 pipeline in brief. ~2,200 people applied. Each applicant went through three stages:

  1. Stage 1 — submitted background / experience / motivation, picked which research tracks they were interested in (Empirical, Policy & Strategy, Technical Governance, Theory, Compute Infrastructure). An LLM screen filtered out applicants who clearly didn't meet a minimum bar, and produced advisory per-stream recommendations.
  2. Stage 2 — applicants who passed Stage 1 had their materials scored by LLM-graded rubrics. The empirical track used a composite score combining Research Skills, Technical Execution (split into MLE, SWE, Math sub-scores), and Soft Skills. The top ~600 by composite advanced to Stage 3.
  3. Stage 3 — applicants chose specific mentors / "streams" to apply to. Each stream reviewed its applicants and produced a ranked list. Top-ranked applicants got offers; lower-ranked got waitlisted. ~120 offers were made.

For the empirical track, the composite formula is 0.50·RS + 0.35·TE + 0.15·SS, where TE = 0.50·MLE + 0.30·SWE + 0.20·Math. A "relevance multiplier" (Direct=1.0 / Adjacent=0.85 / Distant=0.60) is applied to Research Skills based on how the applicant's experience matches the streams they applied to.
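The composite arithmetic above can be sketched in a few lines. This is an illustration, not the production scoring code: the class and function names are hypothetical, the score scale is left unspecified, and it assumes the relevance multiplier is applied to Research Skills before the 0.50 weight.

```python
from dataclasses import dataclass

# Relevance multipliers as stated in the text; applied to Research Skills only.
RELEVANCE = {"Direct": 1.0, "Adjacent": 0.85, "Distant": 0.60}


@dataclass
class EmpiricalScores:
    """Hypothetical container for one applicant's rubric scores."""
    research_skills: float  # RS, before the relevance multiplier
    mle: float
    swe: float
    math_score: float
    soft_skills: float      # SS
    relevance: str = "Direct"


def technical_execution(s: EmpiricalScores) -> float:
    # TE = 0.50*MLE + 0.30*SWE + 0.20*Math
    return 0.50 * s.mle + 0.30 * s.swe + 0.20 * s.math_score


def composite(s: EmpiricalScores) -> float:
    # Assumption: the multiplier scales RS before the 0.50 weight is applied.
    rs = s.research_skills * RELEVANCE[s.relevance]
    return 0.50 * rs + 0.35 * technical_execution(s) + 0.15 * s.soft_skills
```

Because the multiplier touches only the RS term, an applicant with identical rubric scores loses at most 0.50 × (1 − 0.60) = 20% of their RS value in composite points when moving from Direct to Distant.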

Outcome definitions

Outcome definitions used throughout these analyses:

Notes on cross-cohort comparison

Pool size

| Cohort | Completed applications |
|--------|------------------------|
| 6.0    | 1,184                  |
| 7.0    | 878                    |
| 8.0    | 1,454                  |
| 9.0    | 1,296                  |
| 10.0   | 2,210                  |
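As a quick check on the size comparisons, the ratios between cohorts can be computed directly from the counts in this table. The function names below are illustrative only.

```python
# Completed applications per cohort, copied from the pool-size table.
completed = {"6.0": 1184, "7.0": 878, "8.0": 1454, "9.0": 1296, "10.0": 2210}


def size_ratio(later: str, earlier: str) -> float:
    """How many times larger the later cohort's pool is than the earlier one's."""
    return completed[later] / completed[earlier]


def cohort_over_cohort_change() -> dict:
    """Fractional change in pool size between consecutive cohorts."""
    names = list(completed)
    return {
        f"{a}->{b}": completed[b] / completed[a] - 1
        for a, b in zip(names, names[1:])
    }


if __name__ == "__main__":
    print(f"10.0 vs 7.0: {size_ratio('10.0', '7.0'):.2f}x")
    for step, change in cohort_over_cohort_change().items():
        print(f"{step}: {change:+.1%}")
```

The growth is not monotonic: the pool shrank from 6.0 to 7.0 and from 8.0 to 9.0 before the jump to 10.0.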

Education-level distribution

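| Level      | 6.0 | 7.0 | 8.0 | 9.0 | 10.0 |
|------------|-----|-----|-----|-----|------|
| No college | 2%  | 2%  | 2%  | 2%  | 0%   |
| Bachelor's | 44% | 38% | 36% | 35% | 35%  |
| Master's   | 32% | 34% | 33% | 36% | 32%  |
| Doctoral   | 20% | 24% | 27% | 28% | 8%   |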
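A simple sanity check on this table is to sum the displayed shares per cohort. The sketch below uses the percentages exactly as shown; the 5-point tolerance is an arbitrary assumption, chosen only to separate rounding noise from a larger gap.

```python
# Displayed education shares (percent), copied from the table above.
COHORTS = ["6.0", "7.0", "8.0", "9.0", "10.0"]
SHARES = {
    "No college": [2, 2, 2, 2, 0],
    "Bachelor's": [44, 38, 36, 35, 35],
    "Master's": [32, 34, 33, 36, 32],
    "Doctoral": [20, 24, 27, 28, 8],
}


def shown_totals() -> dict:
    """Sum of the displayed shares for each cohort, in percentage points."""
    return {
        cohort: sum(row[i] for row in SHARES.values())
        for i, cohort in enumerate(COHORTS)
    }


def cohorts_far_from_100(tolerance: int = 5) -> list:
    """Cohorts whose displayed shares miss 100% by more than the tolerance."""
    return [c for c, t in shown_totals().items() if abs(t - 100) > tolerance]
```

A total well below 100% means a sizeable part of that cohort's pool falls outside the four displayed levels (for example, rows with a null education_level), so its shares are not directly comparable with the other cohorts'.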

Currently enrolled

Takeaways

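  1. Pool size has grown substantially, though not monotonically: 10.0 (2,210 completed applications) is the largest pool, about 2.5× the size of 7.0 (878) and about 1.9× the size of 6.0 (1,184). Stage 1 funnel filters that worked for 7.0 do not trivially extend to 10.0 at the same proportional pass rates.
  2. Education composition, as shown in the table above, moved toward more formally credentialed applicants from 6.0 through 9.0: the Bachelor's share fell from 44% to 35% while the Doctoral share rose from 20% to 28%. The 10.0 column breaks this pattern (Doctoral 8%, No college 0%), but its displayed shares sum to only 75%, compared with 98–101% for earlier cohorts. Since null education_level values are excluded from the displayed shares, the 10.0 drop may reflect unclassified responses rather than a real change in who applies. Treat any claim about a 10.0 shift in credentials as unverified until that gap is explained.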
  3. Currently-enrolled share (where measurable) has stayed in a fairly narrow band — student applicants are a stable fraction.
  4. For 11.0: expect the pool to be at least as large as 10.0's. If pool composition is changing slowly, 10.0-calibrated rubrics should remain roughly appropriate, but monitor for further shifts (especially if the 10.0 process changes the kinds of people who apply).

🔧 Debug: how the data was interpreted (safe to skip)

Sample. All cohorts (apps_6 through apps_10). Filtered to completed applications. n varies by cohort.

Outcome variable(s). N/A — descriptive composition.

Predictor fields. Standardized columns from data.py: education_level (4 levels), currently_enrolled (bool). 6.0 has no enrollment column; 10.0 derives it from an explicit question.

Filters applied. Completed applications only.

Missing-data handling. Listwise deletion per column where applicable. Rows with a null education_level are kept as a separate category, which is not shown in the stacked shares.
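A minimal sketch of the two ways null values can enter a share calculation. This is not the code in data.py: the function name, the "Unknown" label, and the `drop_null` flag are all hypothetical, and which denominator the tables above actually use is not stated here.

```python
from collections import Counter


def education_shares(levels, drop_null=False):
    """Share of each education level among the given applications.

    levels: iterable of level labels, with None for a missing value.
    drop_null=False keeps missing values as an "Unknown" category that
    still counts toward the denominator; drop_null=True removes them first.
    """
    labels = ["Unknown" if v is None else v for v in levels]
    if drop_null:
        labels = [v for v in labels if v != "Unknown"]
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {level: n / total for level, n in counts.items()}
```

For example, with two Bachelor's, one Master's, and one missing value, the Bachelor's share is 50% when nulls stay in the denominator and about 67% when they are dropped, so the choice matters most for cohorts with many missing values.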

Key assumptions / caveats.