A6 — Per-stream-cluster attribute profiles

Context

Streams differ in the kind of research they do. An interpretability stream and a capability-evals stream probably value different things in an applicant. Today MATS uses one global composite score across all empirical streams. This analysis asks: do different families of streams actually weight applicant attributes differently when ranking applicants? If so, a single global composite is a compromise, and per-cluster scoring (or at least per-cluster advisory signals) might help.

MATS (Machine Alignment, Transparency & Security) is an AI safety research fellowship that places ~120 fellows with ~100 mentors per cohort. Cohort 10.0 ran in summer 2026 and was the first cohort with a centralized application review instead of decentralized stream-specific review. This analysis is part of a broader effort to evaluate the 10.0 process and inform the design of 11.0 (autumn 2026).

How the 10.0 selection pipeline worked

The 10.0 pipeline in brief. ~2,200 people applied. Each applicant went through three stages:

  1. Stage 1 — submitted background / experience / motivation, picked which research tracks they were interested in (Empirical, Policy & Strategy, Technical Governance, Theory, Compute Infrastructure). An LLM screen filtered out applicants who clearly didn't meet a minimum bar, and produced advisory per-stream recommendations.
  2. Stage 2 — applicants who passed Stage 1 had their materials scored by LLM-graded rubrics. The empirical track used a composite score combining Research Skills, Technical Execution (split into MLE, SWE, Math sub-scores), and Soft Skills. The top ~600 by composite advanced to Stage 3.
  3. Stage 3 — applicants chose specific mentors / "streams" to apply to. Each stream reviewed its applicants and produced a ranked list. Top-ranked applicants got offers; lower-ranked got waitlisted. ~120 offers were made.

For the empirical track, the composite formula is 0.50·RS + 0.35·TE + 0.15·SS, where TE = 0.50·MLE + 0.30·SWE + 0.20·Math. A "relevance multiplier" (Direct=1.0 / Adjacent=0.85 / Distant=0.60) is applied to Research Skills based on how the applicant's experience matches the streams they applied to.
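The formula above can be written out as a small function. This is a minimal sketch, not the production scoring code: the function and argument names are hypothetical, the score scale is left open, and it assumes the relevance multiplier is applied to Research Skills before the 0.50 weight.

```python
# Weights and multipliers are taken from the formula in the text.
# Function and argument names are hypothetical.
RELEVANCE = {"Direct": 1.0, "Adjacent": 0.85, "Distant": 0.60}


def technical_execution(mle, swe, math_score):
    """TE = 0.50*MLE + 0.30*SWE + 0.20*Math."""
    return 0.50 * mle + 0.30 * swe + 0.20 * math_score


def empirical_composite(rs, mle, swe, math_score, ss, relevance="Direct"):
    """Composite = 0.50*(RS * relevance) + 0.35*TE + 0.15*SS.

    Assumption: the multiplier scales RS before weighting.
    """
    rs_adjusted = rs * RELEVANCE[relevance]
    te = technical_execution(mle, swe, math_score)
    return 0.50 * rs_adjusted + 0.35 * te + 0.15 * ss
```

Because the three top-level weights sum to 1 and the TE sub-weights sum to 1, an applicant with identical scores on every attribute and Direct relevance gets that same score as the composite; a lower relevance tier only reduces the RS term.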

Outcome definitions

Outcome definitions used throughout these analyses:

Stream clusters

The analysis is restricted to streams whose Stage 1 application group includes Empirical, so that the empirical attribute tiers can be used as predictors. Streams that had dropped out by the time of this analysis (Garriga-Alonso, Emmons, Nasr) and Neel Nanda (separate selection process) are excluded.

* Shard appears in both A and C — its work spans interpretability and oversight/control. An applicant who applied to Shard contributes to both cluster pools.

Borderline assignments to be aware of:

- Righetti (Biorisk + Security + Safeguards): placed in B because biorisk reads as capability-evals-adjacent.
- Dvijotham (Capability Evals + Security + Adversarial Robustness): placed in B.
- Parikh (Capability Evals + Control + Monitoring): placed in B by first-listed category.

How to read this

For each cluster I fit a logistic regression: "did the applicant get ranked by ≥1 stream in this cluster?" predicted from the five empirical attribute tiers (RS·relevance, MLE, SWE, Math, Soft Skills). The sample is applicants who actually applied to ≥1 stream in that cluster at Stage 3, not the global Stage-3 empirical pool. For example, 604 people applied to ≥1 cluster-A stream, so 604 is the cluster-A denominator for the ranked rate; the regression itself uses the subset with complete attribute scores (n = 325 for cluster A).

Each coefficient is the change in the log-odds of being ranked by streams in this cluster per one-standard-deviation increase in that attribute, holding the other four fixed. A positive coefficient means "more of this attribute → more likely to be ranked." A negative one means "more of this attribute → less likely." Negative coefficients are often a multicollinearity artifact but are sometimes a real signal, so the large-magnitude negatives deserve a closer look.
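To make "standardized coefficient" concrete, here is a minimal pure-Python sketch of the procedure: z-score each predictor within the sample, then fit an unregularized logistic regression. It is not the code behind the table. The gradient-descent fitter, the use of the population standard deviation, and the function names are all assumptions made for illustration; a real analysis would normally use a statistics library.

```python
import math


def standardize(rows):
    """Z-score each column within this sample (population SD assumed)."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
           for j in range(k)]
    return [[(r[j] - means[j]) / sds[j] for j in range(k)] for r in rows]


def fit_logistic(X, y, lr=0.1, steps=5000):
    """Unregularized logistic regression via batch gradient descent.

    Returns (intercept, coefficients). With standardized X, each
    coefficient is a log-odds change per one SD of that predictor.
    """
    n, k = len(X), len(X[0])
    b0, b = 0.0, [0.0] * k
    for _ in range(steps):
        g0, g = 0.0, [0.0] * k
        for xi, yi in zip(X, y):
            z = b0 + sum(bj * xj for bj, xj in zip(b, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted prob - label
            g0 += err
            for j in range(k):
                g[j] += err * xi[j]
        b0 -= lr * g0 / n
        b = [bj - lr * gj / n for bj, gj in zip(b, g)]
    return b0, b


# Toy data (made up): two predictors, binary "ranked" outcome.
X_raw = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 2), (6, 5), (7, 3), (8, 2)]
y = [0, 0, 1, 0, 1, 0, 1, 1]
intercept, coefs = fit_logistic(standardize(X_raw), y)
```

Standardizing inside each cluster's own sample is what makes the five coefficients in a row comparable to each other; it is also why comparing the same attribute across rows is a weaker comparison.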

Standardized coefficients per cluster

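| Cluster | # streams | Applied to cluster | Ranked by cluster | Rate | n (reg) | AUC | RS·rel | MLE | SWE | Math | SS |
|---|---|---|---|---|---|---|---|---|---|---|---|
| A — Empirical interpretability | 8 | 604 | 41 | 6.8% | 325 | 0.839 | +0.81 | +0.23 | +0.09 | +1.16 | +0.21 |
| B — Dangerous capability evals | 11 | 699 | 60 | 8.6% | 335 | 0.667 | +0.10 | +0.31 | +0.14 | -0.11 | -0.43 |
| C — AI control & oversight | 13 | 627 | 59 | 9.4% | 325 | 0.750 | +0.25 | +0.23 | +0.49 | +0.15 | +0.46 |
| D — misc | 1 | 178 | 4 | 2.2% | 73 | | | | | | |
| E — Security | 2 | 142 | 4 | 2.8% | 81 | | | | | | |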

Read each row separately. The pattern of which attribute pulls toward "ranked by this cluster" tells you what that cluster cares about.

Interpretation guide

Takeaways

  1. Clusters do weight attributes differently. The current one-composite-fits-all approach is a compromise that splits the difference.
  2. Math matters mainly for interpretability. Cluster A (interp) weights Math strongly positive (+1.16); cluster C (control/oversight) is only weakly positive (+0.15), and capability evals are roughly neutral (-0.11).
  3. Soft skills are bifurcated. Negative for capability evals (-0.43); positive for control/oversight (+0.46). Possibly the "Soft Skills" measure means different things to different streams.
  4. For 11.0, even an advisory per-cluster score next to the global composite would let streams see which dimension they should focus on.
  5. Cluster E (security) is too small for stable inference (one positive in the complete-attributes regression sample). Reported descriptively only.

🔧 Debug — how the data was interpreted (safe to skip)

Sample. Per cluster the analysis sample is applicants who applied to ≥1 stream in that cluster at Stage 3 (from Stage 3 streams actually applied to). NOT track-restricted — a security or policy-track applicant who applied to a cluster's stream is included. Regression sample = cluster pool ∩ complete empirical attribute scores. Streams restricted to those whose Stage 1 application group contains 'Empirical' (34 streams). Dropouts (Garriga-Alonso, Emmons, Nasr) and Nanda excluded.

Outcome variable(s). Cluster-level binary: ranked by ≥1 stream in cluster X. streams_ranked_by entries matched against the cluster's display-name set.
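The pool and outcome definitions above amount to two set intersections per applicant. The sketch below is illustrative only: `streams_ranked_by` is the field named in the text, while `streams_applied`, `id`, the function name, and the partial cluster membership shown are assumptions (the full stream lists are not reproduced here).

```python
# Illustrative membership only; real clusters contain more streams.
# Shard sits in both A and C, as noted in the cluster definitions.
CLUSTERS = {
    "A": {"Shard"},
    "B": {"Righetti", "Dvijotham", "Parikh"},
    "C": {"Shard"},
}


def cluster_outcomes(applicants, cluster_streams):
    """Return {applicant_id: 0 or 1} for applicants in the cluster pool.

    Pool: applied to >=1 stream in the cluster.
    Outcome: ranked by >=1 stream in the cluster.
    """
    outcomes = {}
    for a in applicants:
        if cluster_streams & set(a["streams_applied"]):
            ranked = cluster_streams & set(a["streams_ranked_by"])
            outcomes[a["id"]] = int(bool(ranked))
    return outcomes


# Made-up applicants for demonstration.
applicants = [
    {"id": 1, "streams_applied": ["Shard"], "streams_ranked_by": ["Shard"]},
    {"id": 2, "streams_applied": ["Parikh", "Shard"], "streams_ranked_by": []},
    {"id": 3, "streams_applied": ["Righetti"], "streams_ranked_by": ["Righetti"]},
]
```

With this toy data, applicants 1 and 2 land in both the A and C pools because Shard belongs to both clusters, which is the double-counting the Shard footnote describes.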

Predictor fields. Five attribute scores: RS·rel, MLE, SWE, Math, SS. Standardized within each cluster's regression sample (different clusters have different applicant pools, so cross-cluster z-score comparisons are weaker than within-cluster comparisons across attributes).

Filters applied. Per-cluster pool: applied to ≥1 stream in cluster. Regression: + complete attributes. Logistic fitted only if ≥5 positives and ≥5 negatives in the regression sample.

Missing-data handling. Listwise drop on the 5 attribute features within each cluster.
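The filter and missing-data rules can be sketched as two small helpers. The field names (`rs_rel`, `mle`, `swe`, `math`, `ss`, `ranked`) and function names are hypothetical; only the rules themselves (listwise drop on the five attributes, fit only with ≥5 positives and ≥5 negatives) come from the text.

```python
# Hypothetical field names for the five attribute scores.
ATTRS = ("rs_rel", "mle", "swe", "math", "ss")


def regression_sample(pool):
    """Listwise deletion: keep applicants with all five scores present."""
    return [a for a in pool if all(a.get(k) is not None for k in ATTRS)]


def can_fit(sample, min_per_class=5):
    """Fit only if both outcome classes have at least min_per_class rows."""
    positives = sum(a["ranked"] for a in sample)
    negatives = len(sample) - positives
    return positives >= min_per_class and negatives >= min_per_class
```

Under this gate, a cluster with four or fewer ranked applicants in its regression sample gets no fitted coefficients, which is consistent with the empty coefficient cells for clusters D and E.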

Key assumptions / caveats.