Cohort 10.0 moved to a more centralized, partially blinded review process. Did that change demographic outcomes? We track gender and race composition and outcomes across cohorts 7.0–10.0. (6.0 didn't collect demographics, so it's excluded.)
MATS (Machine Alignment, Transparency & Security) is an AI safety research fellowship that places ~120 fellows with ~100 mentors per cohort. Cohort 10.0 ran in summer 2026 and was the first cohort with a centralized application review instead of decentralized stream-specific review. This analysis is part of a broader effort to evaluate the 10.0 process and inform the design of 11.0 (autumn 2026).
The 10.0 pipeline in brief. ~2,200 people applied. Each applicant went through three stages:
For the empirical track, the composite formula is 0.50·RS + 0.35·TE + 0.15·SS, where TE = 0.50·MLE + 0.30·SWE + 0.20·Math. A "relevance multiplier" (Direct=1.0 / Adjacent=0.85 / Distant=0.60) is applied to Research Skills based on how the applicant's experience matches the streams they applied to.
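As a rough illustration of the arithmetic, the formula can be sketched as below. This assumes the relevance multiplier scales RS before the 0.50 weight is applied and that all component scores share one scale; neither is stated explicitly here. The function names and the example scores are hypothetical.

```python
# Sketch of the empirical-track composite described above.
# Assumption: the relevance multiplier is applied to RS before weighting.
RELEVANCE = {"Direct": 1.0, "Adjacent": 0.85, "Distant": 0.60}


def technical_score(mle, swe, math_score):
    """TE = 0.50*MLE + 0.30*SWE + 0.20*Math."""
    return 0.50 * mle + 0.30 * swe + 0.20 * math_score


def empirical_composite(rs, mle, swe, math_score, ss, relevance="Direct"):
    """Composite = 0.50*RS_adjusted + 0.35*TE + 0.15*SS."""
    rs_adjusted = rs * RELEVANCE[relevance]
    te = technical_score(mle, swe, math_score)
    return 0.50 * rs_adjusted + 0.35 * te + 0.15 * ss


# Made-up scores on a hypothetical 0-10 scale, for illustration only.
score = empirical_composite(rs=8, mle=7, swe=6, math_score=5, ss=9,
                            relevance="Adjacent")
print(round(score, 3))  # 0.50*6.8 + 0.35*6.3 + 0.15*9 = 6.955
```

Under this reading, moving an applicant from Direct to Distant lowers the composite by 0.50 · 0.40 · RS, so the penalty grows with the Research Skills score itself.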
Outcome definitions used throughout these analyses:
is_ranked (primary outcome): applicant was ranked by ≥1 stream. This is the cleanest signal of "the selection process picked this person." Not the same as "received an offer": offer count is bounded by cohort size (~120), but rank count reflects quality independently of capacity.

is_invited_to_worktest (secondary outcome): applicant was engaged by ≥1 stream in any way: invited to a work test, invited to an interview, ranked, or sent the Megastream takehome. Strict superset of is_ranked. One level above is_ranked in the funnel.

passed_mentors_bar: applicant was offered or waitlisted. In 10.0, this equals is_ranked exactly (every ranked person got either an offer or a waitlist slot).

| Cohort | Gender | n | n passed | P(passed) [95% CI] |
|---|---|---|---|---|
| 7.0 | Man | 377 | 31 | 8.2% [5.9%, 11.4%] |
| 7.0 | Non-binary | 14 | 2 | 14.3% [4.0%, 39.9%] |
| 7.0 | Woman | 126 | 14 | 11.1% [6.7%, 17.8%] |
| 8.0 | Man | 597 | 42 | 7.0% [5.2%, 9.4%] |
| 8.0 | Non-binary | 20 | 2 | 10.0% [2.8%, 30.1%] |
| 8.0 | Woman | 228 | 12 | 5.3% [3.0%, 9.0%] |
| 9.0 | Man | 484 | 45 | 9.3% [7.0%, 12.2%] |
| 9.0 | Non-binary | 11 | 2 | 18.2% [5.1%, 47.7%] |
| 9.0 | Woman | 205 | 21 | 10.2% [6.8%, 15.2%] |
| 10.0 | Man | 918 | 76 | 8.3% [6.7%, 10.2%] |
| 10.0 | Non-binary | 25 | 1 | 4.0% [0.7%, 19.5%] |
| 10.0 | Woman | 430 | 30 | 7.0% [4.9%, 9.8%] |
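
The interval method is not named in this section. The bounds in the table are consistent with a 95% Wilson score interval, at least for the rows checked, so the sketch below uses that as an assumption rather than as the documented method. The function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(k, n, z=1.959964):
    """Wilson score interval for k successes out of n trials.

    z defaults to the two-sided 95% normal quantile.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = k / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# 10.0 Non-binary row: 1 passed out of 25.
low, high = wilson_interval(1, 25)
print(f"{low:.1%}, {high:.1%}")  # 0.7%, 19.5%
```

Unlike the normal-approximation interval, the Wilson interval stays inside [0, 1] and is asymmetric around the point estimate, which matters for the small Non-binary cells.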
Sample. Per cohort: completed applicants with non-null gender / race. 8.0 / 9.0 use [pre] demographic columns; 7.0 uses base demographic columns; 10.0 uses [stage-1-demographics]. All are standardized via data.py into common gender and race columns.
Outcome variable(s). passed_mentors_bar (proxy for 7.0/8.0; true for 9.0/10.0).
Predictor fields. N/A — descriptive cross-tabs.
Filters applied. Completed applications only. A group is reported only if it meets the per-group threshold of n≥5 (gender) or n≥10 (race).
Missing-data handling. Per-cell listwise drop. Missing race in 9.0 is documented.
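
The filter and missing-data rules above can be sketched as a small cross-tab routine. This is a minimal illustration, not the code in data.py: the row keys `completed` and `cohort`, the helper name `pass_rates`, and the toy rows are all hypothetical, and it assumes the n thresholds apply to each cohort-by-group cell.

```python
from collections import defaultdict

# Minimum cell size per demographic field, as stated in the filters above.
MIN_N = {"gender": 5, "race": 10}


def pass_rates(rows, group_field):
    """Cross-tab P(passed_mentors_bar) by (cohort, group).

    Rows missing the group or the outcome are dropped for this table only
    (per-cell listwise drop), so a row with a null race can still count
    toward the gender table.
    """
    counts = defaultdict(lambda: [0, 0])  # (cohort, group) -> [n, n_passed]
    for row in rows:
        if not row.get("completed"):
            continue
        group = row.get(group_field)
        passed = row.get("passed_mentors_bar")
        if group is None or passed is None:
            continue
        cell = counts[(row["cohort"], group)]
        cell[0] += 1
        cell[1] += int(bool(passed))
    return {
        key: (n, k, k / n)
        for key, (n, k) in sorted(counts.items())
        if n >= MIN_N[group_field]
    }


# Toy rows for illustration only; not real applicant data.
rows = (
    [{"cohort": "10.0", "gender": "Man", "completed": True,
      "passed_mentors_bar": i < 2} for i in range(6)]
    + [{"cohort": "10.0", "gender": "Non-binary", "completed": True,
        "passed_mentors_bar": False} for _ in range(3)]
    + [{"cohort": "10.0", "gender": None, "completed": True,
        "passed_mentors_bar": True},
       {"cohort": "10.0", "gender": "Man", "completed": False,
        "passed_mentors_bar": True}]
)
table = pass_rates(rows, "gender")
print(table)
```

In the toy data, the three Non-binary rows fall below the n≥5 threshold and are omitted, the null-gender row is dropped, and the incomplete application is filtered out before counting.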
Key assumptions / caveats.