B4 (Part B) showed that AI-safety-org references carry the strongest signal for predicting whether an applicant gets ranked at Stage 3. But selection and performance are different — does the same reference category also predict in-program mentor evaluations?
This analysis runs on 9.0 only — it's the only cohort with both referee categorization AND mentor evaluations. 10.0 has references but no mentor evals yet; 7.0/8.0 have mentor evals but no referee categorization.
MATS (Machine Alignment, Transparency & Security) is an AI safety research fellowship that places ~120 fellows with ~100 mentors per cohort. Cohort 10.0 ran in summer 2026 and was the first cohort with a centralized application review instead of decentralized stream-specific review. This analysis is part of a broader effort to evaluate the 10.0 process and inform the design of 11.0 (autumn 2026).
The 10.0 pipeline in brief. ~2,200 people applied. Each applicant went through three stages:
For the empirical track, the composite formula is 0.50·RS + 0.35·TE + 0.15·SS, where TE = 0.50·MLE + 0.30·SWE + 0.20·Math. A "relevance multiplier" (Direct=1.0 / Adjacent=0.85 / Distant=0.60) is applied to Research Skills based on how the applicant's experience matches the streams they applied to.
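As a worked illustration of the formula above, here is a minimal sketch in Python. The function and argument names are hypothetical, and it assumes the relevance multiplier scales RS before the 0.50 weight is applied; the text says only that the multiplier is "applied to Research Skills", so the exact order of operations is an assumption.

```python
# Weights are taken from the formula in the text; all names are hypothetical.
RELEVANCE = {"Direct": 1.0, "Adjacent": 0.85, "Distant": 0.60}


def technical_score(mle: float, swe: float, math_score: float) -> float:
    """TE = 0.50*MLE + 0.30*SWE + 0.20*Math."""
    return 0.50 * mle + 0.30 * swe + 0.20 * math_score


def empirical_composite(rs: float, mle: float, swe: float, math_score: float,
                        ss: float, relevance: str = "Direct") -> float:
    """Composite = 0.50*RS + 0.35*TE + 0.15*SS for the empirical track."""
    # Assumption: the multiplier is applied to RS before weighting.
    adjusted_rs = rs * RELEVANCE[relevance]
    te = technical_score(mle, swe, math_score)
    return 0.50 * adjusted_rs + 0.35 * te + 0.15 * ss
```

With made-up inputs RS=8, MLE=7, SWE=6, Math=9, SS=8, this gives about 7.685 for a Direct match and about 7.085 for an Adjacent one, so under this reading the Adjacent penalty costs 0.6 composite points for that applicant.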
Outcome definitions used throughout these analyses:
- is_ranked (primary outcome): applicant was ranked by ≥1 stream. This is the cleanest signal of "the selection process picked this person." Not the same as "received an offer": offer count is bounded by cohort size (~120), but rank count reflects quality independently of capacity.
- is_invited_to_worktest (secondary outcome): applicant was engaged by ≥1 stream in any way: invited to a work test, invited to an interview, ranked, or sent the Megastream takehome. Strict superset of is_ranked. One level above is_ranked in the funnel.
- passed_mentors_bar: applicant was offered or waitlisted. In 10.0, this equals is_ranked exactly (every ranked person got either an offer or a waitlist slot).

If AI-safety-org references predict selection but not performance, that would be analogous to the CodeSignal paradox: the signal helps us pick people but doesn't reflect what mentors value. If it predicts both, the signal is solid and worth surfacing more prominently to reviewers.
| Category | n with | n without | Mean composite (with) | Mean composite (without) | Diff |
|---|---|---|---|---|---|
| Academia – other STEM | 9 | 77 | 7.90 | 7.38 | +0.52 |
| AI safety org | 32 | 54 | 7.65 | 7.31 | +0.33 |
| Academia – social science / humanities / policy | 10 | 76 | 7.44 | 7.44 | -0.00 |
| Government / policy org | 5 | 81 | 7.30 | 7.45 | -0.15 |
| Other industry | 5 | 81 | 7.15 | 7.46 | -0.31 |
| AI/ML industry | 21 | 65 | 7.19 | 7.52 | -0.33 |
| Academia – AI/ML | 40 | 46 | 7.18 | 7.67 | -0.49 |
| Unknown | 4 | 82 | — | — | — (too small) |
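The with/without comparison in the table above can be sketched as follows. This is an illustrative reconstruction, not the code that produced the table: the row layout (`composite`, `ref_categories`) and the function name are hypothetical, and it assumes the n≥5 threshold applies to both the "with" and "without" cells.

```python
from statistics import mean

MIN_CELL = 5  # per-cell threshold stated in the filters for this analysis


def with_without_diff(rows, category, min_cell=MIN_CELL):
    """Compare mean mentor-eval composite for fellows with vs. without a
    reference in `category`.

    rows: list of dicts with hypothetical keys
        'composite'      (float) mentor-eval composite
        'ref_categories' (set of str) categories of the fellow's referees
    Returns None when either cell is smaller than `min_cell`.
    """
    with_cat = [r["composite"] for r in rows if category in r["ref_categories"]]
    without = [r["composite"] for r in rows if category not in r["ref_categories"]]
    if len(with_cat) < min_cell or len(without) < min_cell:
        return None  # too small to report, as for the "Unknown" row
    return {
        "n_with": len(with_cat),
        "n_without": len(without),
        "mean_with": mean(with_cat),
        "mean_without": mean(without),
        "diff": mean(with_cat) - mean(without),
    }
```

Because a fellow can have referees in two different categories, the same person may sit in the "with" cell for more than one row, so the rows are not independent comparisons.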
In a joint regression on n=59 fellows from cohort 9.0:
- R² using centralized review scores alone: 0.183
- R² adding has_AI_safety_org_ref flag: 0.183
- Incremental R² from the ref flag: +0.000
If the increment is near zero, the AI-safety-org-ref signal is already absorbed by the centralized review scores (which include "AI safety motivation"). If it's meaningfully positive, the ref-category signal carries independent information about who will perform.
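The incremental-R² comparison can be sketched as two nested ordinary least squares fits. The sketch below is stdlib-only and makes several assumptions: the original model specification is not given in the text, so plain OLS with an intercept is a guess, and all function names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with
    partial pivoting. Raises ZeroDivisionError if `a` is singular
    (for example, when predictors are perfectly collinear)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(features, y):
    """R² of an OLS fit with intercept; `features` is a list of rows."""
    X = [[1.0] + list(row) for row in features]
    k = len(X[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def incremental_r2(base_features, extra_flag, y):
    """Return (base R², full R², increment) when `extra_flag` is added."""
    base = r_squared(base_features, y)
    full_rows = [list(r) + [e] for r, e in zip(base_features, extra_flag)]
    full = r_squared(full_rows, y)
    return base, full, full - base
```

For nested OLS models the increment cannot be negative, so a small positive value is not evidence of added signal on its own; with a sample this small, adjusted R² or a nested-model F-test would be a more cautious check.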
Sample. 9.0 completed applicants joined to mentor evals via person_id. n_joined=86.
Outcome variable(s). Mean of 4 standardized mentor-eval dimensions.
Predictor fields. Categorical referee types from [9.0] Referee type (from [Ref] References (linked to 9.0 reference)). Each category is encoded as a binary flag: 1 if at least one of the applicant's 2 references falls in that category, 0 otherwise.
Filters applied. Completed apps; inner join with mentor evals; n≥5 per cell.
Missing-data handling. Per-category listwise drop (handled by per-cell ≥5 threshold).
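The sample construction described above (completed applications, inner join to mentor evals on person_id, one binary flag per referee category) can be sketched as below. Only `person_id` comes from the text; the other field names (`completed`, `referee_types`, the `has_` prefix) and the function name are hypothetical.

```python
def build_sample(applications, mentor_evals, categories):
    """Build one row per fellow with a mentor-eval composite and binary
    referee-category flags.

    applications: list of dicts with 'person_id', 'completed' (bool) and
                  'referee_types' (list of str, one entry per referee)
    mentor_evals: dict mapping person_id -> mentor-eval composite
    categories:   referee categories to encode as flags
    """
    sample = []
    for app in applications:
        if not app["completed"]:
            continue  # filter: completed applications only
        pid = app["person_id"]
        if pid not in mentor_evals:
            continue  # inner join: drop applicants without mentor evals
        present = set(app["referee_types"])
        row = {"person_id": pid, "composite": mentor_evals[pid]}
        for cat in categories:
            # 1 if any of the applicant's referees is in this category
            row[f"has_{cat}"] = int(cat in present)
        sample.append(row)
    return sample
```

The per-cell n≥5 threshold is then applied per category at analysis time rather than at this step, so a fellow is never dropped from the sample just because one of their referee categories is rare.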
Key assumptions / caveats.