Beyond gender/race (C6), how has the applicant pool changed across cohorts? This analysis tracks pool size, education level, and currently-enrolled status across 6.0 → 10.0. These shifts matter because changes in downstream outcomes (ranking rates, mentor evals, publications) are partly explained by who's applying — not just by how we're selecting.
MATS (Machine Alignment, Transparency & Security) is an AI safety research fellowship that places ~120 fellows with ~100 mentors per cohort. Cohort 10.0 ran in summer 2026 and was the first cohort with a centralized application review instead of decentralized stream-specific review. This analysis is part of a broader effort to evaluate the 10.0 process and inform the design of 11.0 (autumn 2026).
The 10.0 pipeline in brief. ~2,200 people applied. Each applicant went through three stages:
For the empirical track, the composite formula is 0.50·RS + 0.35·TE + 0.15·SS, where TE = 0.50·MLE + 0.30·SWE + 0.20·Math. A "relevance multiplier" (Direct=1.0 / Adjacent=0.85 / Distant=0.60) is applied to Research Skills based on how the applicant's experience matches the streams they applied to.
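The weighting above can be sketched in a few lines of Python. This is only an illustration of the arithmetic as stated, not the production scoring code: the function and argument names are hypothetical, the score scale is left open, and it assumes the relevance multiplier scales RS before the 0.50 weight is applied (the text says only that it is "applied to Research Skills").

```python
# Weights and multipliers are copied from the text; all names are illustrative.
RELEVANCE_MULTIPLIER = {"direct": 1.0, "adjacent": 0.85, "distant": 0.60}


def te_score(mle: float, swe: float, math: float) -> float:
    """TE = 0.50*MLE + 0.30*SWE + 0.20*Math."""
    return 0.50 * mle + 0.30 * swe + 0.20 * math


def empirical_composite(
    rs: float,
    mle: float,
    swe: float,
    math: float,
    ss: float,
    relevance: str = "direct",
) -> float:
    """Composite = 0.50*RS + 0.35*TE + 0.15*SS for the empirical track.

    Assumption: the relevance multiplier is applied to RS before weighting.
    """
    adjusted_rs = rs * RELEVANCE_MULTIPLIER[relevance]
    te = te_score(mle, swe, math)
    return 0.50 * adjusted_rs + 0.35 * te + 0.15 * ss


# Hypothetical applicant with made-up component scores.
direct = empirical_composite(8, 7, 6, 5, 9, relevance="direct")
adjacent = empirical_composite(8, 7, 6, 5, 9, relevance="adjacent")
print(f"direct={direct:.3f} adjacent={adjacent:.3f}")
```

Under these assumptions the multiplier only touches the RS term, so moving the same applicant from Direct to Adjacent lowers the composite by 0.50 × 0.15 × RS and leaves the TE and SS contributions unchanged.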
Outcome definitions used throughout these analyses:
- is_ranked (primary outcome): applicant was ranked by ≥1 stream. This is the cleanest signal of "the selection process picked this person." Not the same as "received an offer": offer count is bounded by cohort size (~120), but rank count reflects quality independently of capacity.
- is_invited_to_worktest (secondary outcome): applicant was engaged by ≥1 stream in any way, whether invited to a work test, invited to an interview, ranked, or sent the Megastream takehome. Strict superset of is_ranked. One level above is_ranked in the funnel.
- passed_mentors_bar: applicant was offered or waitlisted. In 10.0, this equals is_ranked exactly (every ranked person got either an offer or a waitlist slot).

- No college covers 'no college' and 'high school' (6.0/7.0/8.0/9.0 didn't separate; 10.0 has both).
- currently_enrolled question. We collapse to the four-level education_level standardized column.

| Cohort | Completed applications |
|---|---|
| 6.0 | 1,184 |
| 7.0 | 878 |
| 8.0 | 1,454 |
| 9.0 | 1,296 |
| 10.0 | 2,210 |

| Level | 6.0 | 7.0 | 8.0 | 9.0 | 10.0 |
|---|---|---|---|---|---|
| No college | 2% | 2% | 2% | 2% | 0% |
| Bachelor's | 44% | 38% | 36% | 35% | 35% |
| Master's | 32% | 34% | 33% | 36% | 32% |
| Doctoral | 20% | 24% | 27% | 28% | 8% |
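As a quick consistency check on the two tables above, the following sketch computes cohort-over-cohort change in completed applications and the total of the education shares shown for each cohort. The figures are copied from the tables; the variable and function names are illustrative and not taken from data.py.

```python
# Figures copied from the two tables above; names are illustrative.
completed = {"6.0": 1184, "7.0": 878, "8.0": 1454, "9.0": 1296, "10.0": 2210}

# Education shares in percent, one value per cohort in the order 6.0 .. 10.0.
shares = {
    "No college": [2, 2, 2, 2, 0],
    "Bachelor's": [44, 38, 36, 35, 35],
    "Master's": [32, 34, 33, 36, 32],
    "Doctoral": [20, 24, 27, 28, 8],
}

cohorts = list(completed)  # dicts preserve insertion order in Python 3.7+


def pct_change(prev: int, curr: int) -> float:
    """Percentage change from prev to curr."""
    return 100.0 * (curr - prev) / prev


# Change in pool size relative to the previous cohort, rounded to 0.1 pp.
growth = {
    curr: round(pct_change(completed[prev], completed[curr]), 1)
    for prev, curr in zip(cohorts, cohorts[1:])
}

# Sum of the four displayed education levels for each cohort.
shown_total = {
    cohort: sum(column[i] for column in shares.values())
    for i, cohort in enumerate(cohorts)
}

for cohort in cohorts:
    change = growth.get(cohort)
    change_text = "n/a" if change is None else f"{change:+.1f}%"
    print(f"{cohort}: n={completed[cohort]}, change={change_text}, "
          f"shares shown={shown_total[cohort]}%")
```

On these figures the pool grew by about 70.5% from 9.0 to 10.0, and the displayed shares total 98, 98, 98, 101, and 75 for 6.0 through 10.0. A total of 101 is consistent with rounding. The 75 for 10.0 leaves a quarter of that column unaccounted for, so the 10.0 values (Doctoral in particular) are worth checking against the unshown null category and the source data before reading the drop as a real shift.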
Sample. All cohorts (apps_6 through apps_10). Filtered to completed applications. n varies by cohort.
Outcome variable(s). N/A — descriptive composition.
Predictor fields. Standardized columns from data.py: education_level (4 levels), currently_enrolled (bool). 6.0 has no enrollment column; 10.0 derives it from an explicit question.
Filters applied. Completed applications only.
Missing-data handling. Per-column listwise deletion where applicable; null education_level is kept as a separate category (not shown in stacks).
Key assumptions / caveats.