A1 showed the composite predicts ranking. This analysis zooms in: within the Stage-3 empirical pool, how does outcome probability change with composite percentile? Specifically — is there a floor below which almost no one gets ranked? Is there a ceiling above which additional composite stops helping? Or is the relationship roughly linear?
Practical use: if the curve has a sharp floor, we can treat composite as a hard filter (anyone below X percentile is auto-rejected and doesn't waste reviewer time). If it's smoothly linear, composite is best treated as an informational signal that streams weigh alongside their other judgment.
MATS (Machine Alignment, Transparency & Security) is an AI safety research fellowship that places ~120 fellows with ~100 mentors per cohort. Cohort 10.0 ran in summer 2026 and was the first cohort with a centralized application review instead of decentralized stream-specific review. This analysis is part of a broader effort to evaluate the 10.0 process and inform the design of 11.0 (autumn 2026).
The 10.0 pipeline in brief. ~2,200 people applied. Each applicant went through three stages:
For the empirical track, the composite formula is 0.50·RS + 0.35·TE + 0.15·SS, where TE = 0.50·MLE + 0.30·SWE + 0.20·Math. A "relevance multiplier" (Direct=1.0 / Adjacent=0.85 / Distant=0.60) is applied to Research Skills based on how the applicant's experience matches the streams they applied to.
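The weighting above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's actual code: the function and variable names are hypothetical, and it assumes all component scores share one scale and that the relevance multiplier scales RS before the 0.50 weight is applied.

```python
# Weights are copied from the text. Names and the shared score scale are assumptions.
RELEVANCE_MULTIPLIER = {"Direct": 1.0, "Adjacent": 0.85, "Distant": 0.60}


def technical_component(mle: float, swe: float, math_score: float) -> float:
    """TE = 0.50*MLE + 0.30*SWE + 0.20*Math."""
    return 0.50 * mle + 0.30 * swe + 0.20 * math_score


def empirical_composite(rs: float, mle: float, swe: float,
                        math_score: float, ss: float, relevance: str) -> float:
    """Composite = 0.50*RS + 0.35*TE + 0.15*SS, with RS scaled by relevance first."""
    adjusted_rs = rs * RELEVANCE_MULTIPLIER[relevance]  # assumed order of operations
    te = technical_component(mle, swe, math_score)
    return 0.50 * adjusted_rs + 0.35 * te + 0.15 * ss
```

Under these assumptions, an applicant scoring 10 on every component gets a composite of 10.0 with Direct relevance and 8.0 with Distant relevance, because only the RS half of the score is scaled.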
Outcome definitions used throughout these analyses:
is_ranked (primary outcome): applicant was ranked by ≥1 stream. This is the cleanest signal of "the selection process picked this person." Not the same as "received an offer": offer count is bounded by cohort size (~120), but rank count reflects quality independently of capacity.
is_invited_to_worktest (secondary outcome): applicant was engaged by ≥1 stream in any way: invited to a work test, invited to an interview, ranked, or sent the Megastream takehome. Strict superset of is_ranked. One level above is_ranked in the funnel.
passed_mentors_bar: applicant was offered or waitlisted. In 10.0, this equals is_ranked exactly (every ranked person got either an offer or a waitlist slot).
The relationship between composite percentile and ranking is concave with a strong floor. At decile D1, P(ranked) first crosses 5%. Below that, composite is essentially a hard floor, with no upside to including those applicants in further review. The biggest single-decile jump in P(ranked) is D4→D5 (+26.7 percentage points). The jump from D9→D10 adds another +2.8 pp.
| Decile | n | Mean percentile (0–100) | P(invited) | P(ranked) | P(offered or waitlisted) |
|---|---|---|---|---|---|
| D1 | 58 | 5.03 | 50.0% | 19.0% | 19.0% |
| D2 | 59 | 15.16 | 39.0% | 6.8% | 6.8% |
| D3 | 56 | 25.17 | 51.8% | 30.4% | 30.4% |
| D4 | 62 | 35.53 | 33.9% | 1.6% | 1.6% |
| D5 | 53 | 45.49 | 62.3% | 28.3% | 28.3% |
| D6 | 61 | 55.37 | 52.5% | 16.4% | 16.4% |
| D7 | 53 | 65.25 | 62.3% | 26.4% | 26.4% |
| D8 | 58 | 74.87 | 67.2% | 27.6% | 27.6% |
| D9 | 57 | 84.95 | 73.7% | 35.1% | 35.1% |
| D10 | 58 | 94.97 | 70.7% | 37.9% | 37.9% |
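The decile-to-decile changes can be recomputed directly from the table above. The short sketch below copies the n and P(ranked) columns by hand (variable names are illustrative) and computes the signed percentage-point change between adjacent deciles.

```python
# n and P(ranked) in percent, copied from the decile table.
n_by_decile = {"D1": 58, "D2": 59, "D3": 56, "D4": 62, "D5": 53,
               "D6": 61, "D7": 53, "D8": 58, "D9": 57, "D10": 58}
p_ranked = {"D1": 19.0, "D2": 6.8, "D3": 30.4, "D4": 1.6, "D5": 28.3,
            "D6": 16.4, "D7": 26.4, "D8": 27.6, "D9": 35.1, "D10": 37.9}

deciles = list(p_ranked)

# Signed change in P(ranked), in percentage points, between adjacent deciles.
jumps = {
    f"{lower}->{upper}": round(p_ranked[upper] - p_ranked[lower], 1)
    for lower, upper in zip(deciles, deciles[1:])
}

largest_jump = max(jumps, key=jumps.get)
total_n = sum(n_by_decile.values())

print(largest_jump, jumps[largest_jump])  # D4->D5 26.7
print("D9->D10", jumps["D9->D10"])        # D9->D10 2.8
print(total_n)                            # 575
```

The `jumps` dictionary keeps the sign of every step, so it can be inspected for decreases as well as increases.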
Sample. Stage-3 empirical pool with a non-null percentile field (n=575). Percentile uses the pre-computed [stage-3-empirical] Empirical composite score percentile — rank within the Stage-3 empirical pool.
Outcome variable(s). is_ranked, is_invited_to_worktest, passed_mentors_bar.
Predictor fields. [stage-3-empirical] Empirical composite score percentile: continuous 0–1. The Mean percentile column of the decile table reports it multiplied by 100.
Filters applied. Stage-3 empirical filter (same as A1). Canonical dedup.
Missing-data handling. Listwise drop on percentile.
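A decile table of this kind could be built roughly as follows. This is a sketch under stated assumptions rather than the analysis code: it assumes the Stage-3 filter and dedup have already been applied, that each applicant is a dict with a hypothetical `percentile` key (0–1, possibly `None`) plus the three boolean outcome fields, and that deciles are equal-width bins on the percentile.

```python
from collections import defaultdict

OUTCOMES = ("is_invited_to_worktest", "is_ranked", "passed_mentors_bar")


def decile_of(percentile: float) -> int:
    """Map a 0-1 percentile to a decile 1..10 using equal-width bins (assumed)."""
    return min(int(percentile * 10), 9) + 1


def decile_table(rows):
    """Return {decile: {"n": count, outcome: rate, ...}} after a listwise drop."""
    # Listwise drop: discard any row whose percentile is missing.
    kept = [row for row in rows if row.get("percentile") is not None]

    groups = defaultdict(list)
    for row in kept:
        groups[decile_of(row["percentile"])].append(row)

    table = {}
    for decile in sorted(groups):
        members = groups[decile]
        table[decile] = {"n": len(members)}
        for outcome in OUTCOMES:
            # Share of applicants in this decile with the outcome set.
            hits = sum(bool(member[outcome]) for member in members)
            table[decile][outcome] = hits / len(members)
    return table
```

Calling `decile_table(rows)` on the filtered pool returns one entry per non-empty decile. Equal-width bins give slightly uneven group sizes, which is one possible reason n varies across rows in the table.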
Key assumptions / caveats.