B3 — Mission alignment signals

Context

MATS cares about whether applicants are working on AI safety for the right reasons. The 10.0 application captured this in three ways: a Theory-of-Change (ToC) ranking alignment score (0–100) measuring how closely the applicant's ranking of AI risks matched a reference ordering; AIS engagement multi-select fields listing courses, programs, and orgs; and a free-text duration field recording how long they have engaged with AI safety. Do any of these actually predict who gets ranked by a stream?

MATS (Machine Alignment, Transparency & Security) is an AI safety research fellowship that places ~120 fellows with ~100 mentors per cohort. Cohort 10.0 ran in summer 2026 and was the first cohort with a centralized application review instead of decentralized stream-specific review. This analysis is part of a broader effort to evaluate the 10.0 process and inform the design of 11.0 (autumn 2026).

How the 10.0 selection pipeline worked

The 10.0 pipeline in brief. ~2,200 people applied. Each applicant went through three stages:

Stage 1 — submitted background / experience / motivation, picked which research tracks they were interested in (Empirical, Policy & Strategy, Technical Governance, Theory, Compute Infrastructure). An LLM screen filtered out applicants who clearly didn't meet a minimum bar, and produced advisory per-stream recommendations.
Stage 2 — applicants who passed Stage 1 had their materials scored by LLM-graded rubrics. The empirical track used a composite score combining Research Skills, Technical Execution (split into MLE, SWE, Math sub-scores), and Soft Skills. The top ~600 by composite advanced to Stage 3.
Stage 3 — applicants chose specific mentors / "streams" to apply to. Each stream reviewed its applicants and produced a ranked list. Top-ranked applicants got offers; lower-ranked got waitlisted. ~120 offers were made.

For the empirical track, the composite formula is 0.50·RS + 0.35·TE + 0.15·SS, where TE = 0.50·MLE + 0.30·SWE + 0.20·Math. A "relevance multiplier" (Direct=1.0 / Adjacent=0.85 / Distant=0.60) is applied to Research Skills based on how the applicant's experience matches the streams they applied to.
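
As a rough illustration of the formula above, the composite can be computed as follows. This is a minimal sketch, not the production scoring code: the function names are hypothetical, the example scores are made up, and it assumes the relevance multiplier scales Research Skills before the 0.50 weight is applied and that all sub-scores share a common scale.

```python
# Relevance multipliers from the text; applied to Research Skills only.
RELEVANCE = {"Direct": 1.0, "Adjacent": 0.85, "Distant": 0.60}


def technical_execution(mle: float, swe: float, math: float) -> float:
    """TE = 0.50*MLE + 0.30*SWE + 0.20*Math."""
    return 0.50 * mle + 0.30 * swe + 0.20 * math


def empirical_composite(rs: float, mle: float, swe: float, math: float,
                        ss: float, relevance: str = "Direct") -> float:
    """Composite = 0.50*RS + 0.35*TE + 0.15*SS, with RS relevance-adjusted."""
    # Assumption: the multiplier scales RS before the 0.50 weight is applied.
    rs_adjusted = rs * RELEVANCE[relevance]
    te = technical_execution(mle, swe, math)
    return 0.50 * rs_adjusted + 0.35 * te + 0.15 * ss


# Made-up applicant: RS=80 (Adjacent), MLE=70, SWE=60, Math=50, SS=90.
# RS adjusted = 68, TE = 63, composite is about 69.55.
example = empirical_composite(rs=80, mle=70, swe=60, math=50, ss=90,
                              relevance="Adjacent")
```

Under these assumptions, moving the same applicant from Adjacent to Distant lowers the composite by 0.50 × 80 × (0.85 − 0.60) = 10 points, which shows how much weight the relevance judgment carries.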

Outcome definitions

Outcome definitions used throughout these analyses:

is_ranked (primary outcome) — applicant was ranked by ≥1 stream. This is the cleanest signal of "the selection process picked this person." Not the same as "received an offer" — offer count is bounded by cohort size (~120), but rank count reflects quality independently of capacity.
is_invited_to_worktest (secondary outcome; abbreviated as is_invited in the tables) — applicant was engaged by ≥1 stream in any way: invited to a work test, invited to an interview, ranked, or sent the Megastream takehome. Strict superset of is_ranked, and one level above it in the funnel.
passed_mentors_bar — applicant was offered or waitlisted. In 10.0, this equals is_ranked exactly (every ranked person got either an offer or a waitlist slot).
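
The two main outcome flags can be derived from per-stream engagement records along these lines. This is a sketch only: the `StreamEngagement` record and its field names are hypothetical stand-ins for whatever the real data model uses.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class StreamEngagement:
    # Hypothetical per-stream record; field names are illustrative.
    worktest_invite: bool = False
    interview_invite: bool = False
    ranked: bool = False
    megastream_takehome: bool = False


def is_ranked(engagements: Iterable[StreamEngagement]) -> bool:
    """Primary outcome: ranked by at least one stream."""
    return any(e.ranked for e in engagements)


def is_invited_to_worktest(engagements: Iterable[StreamEngagement]) -> bool:
    """Secondary outcome: engaged by at least one stream in any way."""
    return any(
        e.worktest_invite or e.interview_invite or e.ranked
        or e.megastream_takehome
        for e in engagements
    )


applicant = [StreamEngagement(interview_invite=True), StreamEngagement()]
flags = (is_ranked(applicant), is_invited_to_worktest(applicant))
```

Because `ranked` is one of the conditions inside `is_invited_to_worktest`, every ranked applicant is also counted as invited, which matches the superset relationship stated above.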

Why this matters

The 11.0 planning includes 'mission-alignment correlation' as an explicit workstream. Chris Ackerman's memo on graduating-fellow AI safety filtering argues that we should be filtering more aggressively on this; this analysis is partial evidence for or against that.

A prior analysis (May 2026, 11.0 application design) found Pearson r(ToC score, is_ranked) = +0.139, p = 5e-11, and a quintile spread in P(ranked) from 3.8% (Q1) to 15.1% (Q5), roughly 4×. We reproduce that result here.

⚠️ Caveat — AIS form bug in 10.0

The 10.0 application form for AI safety engagement had a UI bug: the secondary detail panels were swapped. When an applicant selected research program in the main multi-select, the form opened the structured course detail panel (and vice versa). This means the secondary detail fields — which specific programs / courses an applicant listed — are unreliable for any applicant who didn't select BOTH research program AND structured course in the main multi-select.

This analysis only uses the main multi-select count and the duration field, both of which are unaffected by the bug. But any downstream analysis that tries to use the specific-program or specific-course detail fields will need to filter to applicants who selected both flags in the main multi-select.
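
A downstream analysis could apply that filter roughly as follows. This is a minimal sketch; the option labels and the applicant records are hypothetical, and the real form values may be spelled differently.

```python
from typing import Iterable

# Hypothetical option labels; the real form values may differ.
RESEARCH_PROGRAM = "research program"
STRUCTURED_COURSE = "structured course"


def detail_fields_reliable(main_selections: Iterable[str]) -> bool:
    """True only if both flags were selected, so both detail panels opened."""
    chosen = set(main_selections)
    return RESEARCH_PROGRAM in chosen and STRUCTURED_COURSE in chosen


# Made-up applicants keyed by ID, with their main multi-select choices.
applicants = {
    "a1": ["research program", "structured course"],
    "a2": ["research program"],
    "a3": [],
}
usable = [aid for aid, sel in applicants.items() if detail_fields_reliable(sel)]
```

Only `a1` survives the filter here; `a2` would have been shown the wrong detail panel, so any specific-program entries for that applicant should be treated as unreliable.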

Headline

ToC alignment score has a modest but reliable association with is_ranked in the full pool: AUC = 0.646 [0.607, 0.688] (n = 2,199). The quintile spread is ~4× from bottom to top — broadly consistent with the May 2026 prior analysis.

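In the Stage-3 empirical pool (range-restricted), the ToC signal is somewhat weaker: AUC 0.612 [0.562, 0.661] versus 0.646 in the full pool, with overlapping intervals. Part of the full-pool effect therefore appears to come from the Stage-1 → Stage-2 → Stage-3 gating (low-ToC applicants disproportionately fail to reach Stage 3), but a sizeable share of the signal persists among applicants who did reach Stage 3.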

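AIS engagement count carries almost no signal for is_ranked (AUC 0.525 in the full pool, 0.518 in Stage 3). Duration is somewhat more informative (0.597 and 0.573), but it is available only for a subset of applicants (n = 1,235 and 470), and its Stage-3 interval for is_invited spans 0.5.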

Sample	Predictor	Outcome	n	AUC	95% CI
Full 10.0 pool	ToC alignment score	`is_ranked`	2,199	0.646	[0.607, 0.688]
Full 10.0 pool	ToC alignment score	`is_invited`	2,199	0.586	[0.558, 0.613]
Full 10.0 pool	AIS engagement count	`is_ranked`	2,203	0.525	[0.501, 0.547]
Full 10.0 pool	AIS engagement count	`is_invited`	2,203	0.526	[0.510, 0.541]
Full 10.0 pool	AIS prior duration (years)	`is_ranked`	1,235	0.597	[0.549, 0.646]
Full 10.0 pool	AIS prior duration (years)	`is_invited`	1,235	0.560	[0.525, 0.595]
Stage-3 empirical	ToC alignment score	`is_ranked`	791	0.612	[0.562, 0.661]
Stage-3 empirical	ToC alignment score	`is_invited`	791	0.553	[0.514, 0.592]
Stage-3 empirical	AIS engagement count	`is_ranked`	791	0.518	[0.494, 0.542]
Stage-3 empirical	AIS engagement count	`is_invited`	791	0.512	[0.492, 0.534]
Stage-3 empirical	AIS prior duration (years)	`is_ranked`	470	0.573	[0.510, 0.632]
Stage-3 empirical	AIS prior duration (years)	`is_invited`	470	0.510	[0.461, 0.566]
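
For readers who want to reproduce figures like those in the table, the sketch below shows one common way to compute an AUC with a percentile-bootstrap interval. The source does not state how its intervals were obtained, so the bootstrap, the resample count, and the function names are all assumptions made for illustration.

```python
import random
from typing import Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """P(random positive outscores random negative); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores: Sequence[float], labels: Sequence[int],
                 n_boot: int = 1000, alpha: float = 0.05,
                 seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    idx = range(len(scores))
    stats = []
    while len(stats) < n_boot:
        sample = [rng.choice(idx) for _ in idx]
        ys = [labels[i] for i in sample]
        if all(ys) or not any(ys):
            continue  # skip resamples that contain only one class
        stats.append(auc([scores[i] for i in sample], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data: higher scores are more often labelled 1.
toy_scores = [40, 55, 60, 62, 70, 75, 80, 85, 90, 95]
toy_labels = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]
toy_auc = auc(toy_scores, toy_labels)
toy_ci = bootstrap_ci(toy_scores, toy_labels, n_boot=200)
```

The pairwise loop is quadratic in sample size, which is fine for a toy example but slow on the full pool; a rank-based implementation or a library routine would be the practical choice there.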

Quintile	n	Mean ToC	P(ranked)	P(invited)
Q1	449	40.5	3.8%	15.8%
Q2	438	57.9	5.3%	18.5%
Q3	440	69.7	8.4%	26.1%
Q4	449	81.2	10.5%	26.9%
Q5	423	94.8	15.1%	30.7%
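
The "~4×" figure can be checked directly from the quintile table. The snippet below only re-derives numbers from the rows above; the pooled rate is approximate because the published percentages are rounded.

```python
# (quintile, n, P(ranked)) copied from the table above.
quintiles = [
    ("Q1", 449, 0.038),
    ("Q2", 438, 0.053),
    ("Q3", 440, 0.084),
    ("Q4", 449, 0.105),
    ("Q5", 423, 0.151),
]

# Ratio of top-quintile to bottom-quintile ranking rate (about 3.97).
spread = quintiles[-1][2] / quintiles[0][2]

# Total applicants with a ToC score; matches the n reported for the AUC.
total_n = sum(n for _, n, _ in quintiles)

# Approximate pooled ranking rate, reconstructed from rounded percentages.
pooled_rate = sum(n * p for _, n, p in quintiles) / total_n
```

The quintile sizes sum to 2,199, the same n used for the full-pool ToC rows of the AUC table, and the reconstructed pooled ranking rate comes out at roughly 8.6%.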
