How we built the Hiring Practices Assessment
A research-backed model that converts how consistently you run 12 hiring practices into an estimated probability of hiring success.
The most expensive coin flip in your company
Roughly half of hires don't work out. Leadership IQ's study of 20,000 hires found that 46% of new employees fail within 18 months — and 89% of those failures were attitudinal (coachability, emotional intelligence, motivation, temperament), not lack of skill. Studies of executive hiring reported in Harvard Business Review put failure rates for senior roles at 40–60%.
At Series A/B, that miss rate is brutal. Every hire is a meaningful fraction of the team, the burn, and the culture. Yet most startups hire the way they always have: smart people having unstructured conversations and comparing gut feelings.
The good news is that hiring is one of the best-researched decisions in organizational science. A century of selection research — synthesized in Schmidt & Hunter's landmark 1998 review and revised by Sackett and colleagues in 2022 — measured how well each hiring practice actually predicts on-the-job performance. The practices are known. The question is how consistently you run them.
The 12 practices, across four stages
The assessment audits 12 practices grouped into the four stages of a hiring decision. Each practice is measured by two statements, each rated by how consistently it happens at your company, from Never (1) to Always (6).
1 · Define
Decide what you are hiring for before you meet a candidate.
- Success factors defined per role
- Competencies mapped to the role
- Structured interview plans
2 · Assess
Gather evidence that predicts the job, not the interview.
- Behavioral questions over hypotheticals
- Work samples & skills assessments
- Interviewer training & calibration
3 · Decide
Combine evidence consistently instead of trading impressions.
- Anchored rating rubrics
- Independent scoring & calibrated debriefs
- Structured reference checks
4 · Land
Convert the decision into an accepted offer and a successful first year.
- Candidate experience & closing
- Hiring metrics & feedback loops
- Onboarding handoff (30/60/90)
How the probability model works
The whole model is four rules. Every number in your report comes from them.
1. The baseline is 50%. That's the observed success rate of instinct-driven, unstructured hiring — consistent with Leadership IQ's 46%-fail-within-18-months finding and reported executive-hire failure rates of 40–60%. Throughout, "success" follows the research definition: the hire is performing and retained roughly 18 months in.
2. The ceiling is 85% — never 100%. For a funnel running all 12 practices, we model a composite operational validity of roughly .60–.70. That composite is our modeling assumption, built up from Sackett et al.'s corrected estimates (structured interviews ρ ≈ .42) plus partial incremental validity from work samples, structured references, and mechanical score combination. Using Taylor & Russell's 1939 conversion — with a ~50% base rate and a typical 10–20% selection ratio — that corresponds to roughly 82–87% of hires succeeding. We cap it at 85%. No process eliminates hiring risk, and we won't pretend otherwise.
3. The 35-point gap is split across the 12 practices in proportion to each practice's validity evidence (table below). The weights sum to exactly 35. Overlapping practices are handled in the weights, not double-counted: the interview-structure family (structured plans, behavioral questions, rating rubrics, and independent scoring; 15 points combined) collectively represents the validity gain from unstructured (ρ ≈ .19) to structured (ρ ≈ .42) interviewing. A validity of .19 is a correlation with later job performance, not a success rate: the 50% baseline comes from outcome studies and the 85% ceiling from the Taylor & Russell conversion in rule 2, never from reading validities as percentages.
4. Your adoption of each practice scales its contribution. Each practice's score is the mean of its two items (1–6). Adoption = (score − 1) ÷ 5, giving a value from 0 to 1. Your estimated probability = 50 + the sum of every practice's weight × adoption. Answer Never to everything and you sit at exactly 50%. Answer Always to everything and you reach exactly 85%.
| Practice | Stage | Weight (pts) | Primary research anchor |
|---|---|---|---|
| Structured interview plans | Define | 5.0 | Sackett et al. (2022): structured interviews ρ ≈ .42 vs ≈ .19 unstructured — the largest single lever; McDaniel et al. (1994); Huffcutt & Arthur (1994) |
| Behavioral questions over hypotheticals | Assess | 4.0 | Taylor & Small (2002); Janz (1982); Huffcutt et al. (2001) |
| Work samples & skills assessments | Assess | 4.0 | Sackett et al. (2022): work samples ρ ≈ .33; Roth, Bobko & McFarland (2005) |
| Success factors defined per role | Define | 3.0 | Campion, Palmer & Campion (1997); Schmidt & Hunter (1998) |
| Competencies mapped to the role | Define | 3.0 | Wiesner & Cronshaw (1988); Campion et al. (1997) |
| Anchored rating rubrics | Decide | 3.0 | Campion et al. (1997); Melchers et al. (2011) |
| Independent scoring & calibrated debriefs | Decide | 3.0 | Kuncel et al. (2013): mechanical combination outperforms holistic judgment |
| Interviewer training & calibration | Assess | 2.5 | Huffcutt & Woehr (1999); Campion et al. (1997) |
| Structured reference checks | Decide | 2.5 | Sackett et al. (2022): ρ ≈ .23; Schmidt & Hunter (1998) |
| Onboarding handoff (30/60/90) | Land | 2.0 | Bauer et al. (2007); Leadership IQ (2005) |
| Candidate experience & closing | Land | 1.5 | Hausknecht et al. (2004): applicant reactions shape offer acceptance |
| Hiring metrics & feedback loops | Land | 1.5 | Local validation principle; no direct meta-analytic anchor — its modest weight reflects that |
| Total | | 35.0 | |
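The four rules and the weights above reduce to a few lines of arithmetic. What follows is a minimal Python sketch under stated assumptions, not the assessment's actual implementation: the names `WEIGHTS`, `adoption`, and `estimate_probability` are hypothetical, and the weights are copied from the table.

```python
BASELINE = 50.0  # rule 1: baseline for unstructured hiring, in percent
CEILING = 85.0   # rule 2: best-practice ceiling, in percent

# Rule 3: weights in probability points, copied from the table above.
WEIGHTS = {
    "Structured interview plans": 5.0,
    "Behavioral questions over hypotheticals": 4.0,
    "Work samples & skills assessments": 4.0,
    "Success factors defined per role": 3.0,
    "Competencies mapped to the role": 3.0,
    "Anchored rating rubrics": 3.0,
    "Independent scoring & calibrated debriefs": 3.0,
    "Interviewer training & calibration": 2.5,
    "Structured reference checks": 2.5,
    "Onboarding handoff (30/60/90)": 2.0,
    "Candidate experience & closing": 1.5,
    "Hiring metrics & feedback loops": 1.5,
}


def adoption(item_a, item_b):
    """Rule 4: mean of the two 1-6 items, rescaled to the range 0-1."""
    for item in (item_a, item_b):
        if not 1 <= item <= 6:
            raise ValueError("items must be between 1 (Never) and 6 (Always)")
    score = (item_a + item_b) / 2
    return (score - 1) / 5


def estimate_probability(answers):
    """answers maps each practice name to its pair of item responses."""
    gain = sum(WEIGHTS[name] * adoption(*answers[name]) for name in WEIGHTS)
    return BASELINE + gain


never = {name: (1, 1) for name in WEIGHTS}
always = {name: (6, 6) for name in WEIGHTS}
print(estimate_probability(never))   # 50.0
print(estimate_probability(always))  # 85.0
```

Because the weights sum to 35, the ceiling never has to be enforced separately in this sketch: answering Always everywhere lands on `CEILING` by construction.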
A worked example. Say your company runs every practice about half the time — each practice scores 3.5 out of 6, so adoption is 0.5. Your estimate is 50 + (35 × 0.5) = 67.5, displayed as 68%. Now look at one practice: if your structured-plans items score 2 and 3, that practice's score is 2.5, adoption is 0.3, so it contributes 1.5 of its 5.0 points — leaving 3.5 points on the table. That headroom is what ranks your improvement opportunities.
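The headroom ranking in this example can be sketched in Python as well. This is an illustration under assumptions, not the report's actual code: `contribution`, `headroom`, and `rank_opportunities` are hypothetical names, only three of the twelve practices are included, and every answer other than the structured-plans pair (2 and 3) is made up for the demonstration.

```python
def contribution(weight, item_a, item_b):
    """Points a practice currently adds: weight x adoption."""
    score = (item_a + item_b) / 2
    return weight * (score - 1) / 5


def headroom(weight, item_a, item_b):
    """Points still available on a practice."""
    return weight - contribution(weight, item_a, item_b)


def rank_opportunities(practices):
    """Sort (name, weight, item_a, item_b) tuples by headroom, largest first."""
    return sorted(
        practices,
        key=lambda p: headroom(p[1], p[2], p[3]),
        reverse=True,
    )


sample = [
    ("Work samples & skills assessments", 4.0, 5, 6),  # hypothetical answers
    ("Structured interview plans", 5.0, 2, 3),         # answers from the example
    ("Structured reference checks", 2.5, 1, 1),        # hypothetical answers
]

for name, weight, item_a, item_b in rank_opportunities(sample):
    print(f"{name}: {headroom(weight, item_a, item_b):.1f} points available")
```

Ranking by points available rather than by raw score means a heavily weighted practice with middling adoption can outrank a lightly weighted practice that is barely used at all.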
Displayed probabilities are rounded to whole percentages and per-practice contributions to one decimal, so the parts may not sum exactly to the headline numbers. The displayed gap is always the difference between the two displayed percentages.
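These display rules can be sketched in Python too. The text only states that 67.5 is shown as 68%, so rounding halves upward is an assumption here, as are the function names. Python's built-in `round` rounds halves to the nearest even digit, which is why the sketch uses the `decimal` module instead.

```python
from decimal import Decimal, ROUND_HALF_UP

CEILING_PERCENT = 85  # displayed best-practice ceiling


def display_percent(probability):
    """Round a probability (in percent) to a whole number, halves upward (assumed)."""
    whole = Decimal(str(probability)).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return int(whole)


def display_points(points):
    """Round a per-practice contribution to one decimal, halves upward (assumed)."""
    return Decimal(str(points)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def display_gap(probability):
    """The gap is computed from the displayed percentages, not the raw values."""
    return CEILING_PERCENT - display_percent(probability)


print(display_percent(67.5))  # 68
print(display_gap(67.5))      # 17
```

Computing the gap from the already-rounded figures keeps the three numbers on screen consistent with each other, at the cost of differing slightly from the unrounded gap (17.5 in this case).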
What this is — and isn't. This is an estimate built from population-level research, not a measurement of your company. It can't see your talent brand, your market, or your judgment. What it can do is show you, with a century of evidence behind it, which practices raise the odds and how much room you have left on each one.
Research foundation
Every weight in the model is anchored to published selection research. The key sources:
- Sackett, Zhang, Berry & Lievens (2022): Revisiting meta-analytic estimates of validity in personnel selection. The revised meta-analysis of selection method validity, correcting earlier range-restriction assumptions. Establishes structured interviews (ρ ≈ .42) as the strongest single predictor, with work samples (ρ ≈ .33) and structured references (ρ ≈ .23) among the strongest supplements.
- Schmidt & Hunter (1998): The validity and utility of selection methods in personnel psychology. The landmark synthesis of 85 years of selection research, establishing the validity hierarchy of hiring methods and the economic utility of better selection.
- Taylor & Russell (1939): The relationship of validity coefficients to the practical effectiveness of tests in selection. The classic tables converting a predictor's validity, the base rate of success, and the selection ratio into the proportion of selected candidates who succeed: the mechanism behind our 50% baseline and 85% ceiling.
- Campion, Palmer & Campion (1997): A review of structure in the selection interview. Defines the 15 components of interview structure (job-analysis grounding, consistent questions, anchored rating scales, and more) that underpin the Define and Decide practices.
- Kuncel, Klieger, Connelly & Ones (2013): Mechanical versus clinical data combination in selection and admissions decisions. Meta-analysis showing that combining interview ratings mechanically substantially outperforms holistic judgment, the basis for scores-first, discussion-second debriefs.
- Huffcutt & Woehr (1999); Taylor & Small (2002); Hausknecht, Day & Thomas (2004); Bauer et al. (2007): Interviewer training effects on validity; past-behavior versus situational question formats; applicant reactions and offer acceptance; and structured onboarding's effect on new-hire performance and retention.
- Leadership IQ (2005): Why new hires fail. Study of 20,000 hires: 46% fail within 18 months, and 89% of failures are attitudinal (coachability, emotional intelligence, motivation, temperament), not technical skill. The anchor for the 50% baseline.
What you get
Results are immediate. Your full report appears the moment you finish, and a complete copy — every practice, every contribution, your headline numbers — is emailed to you with a permanent link.
- Your estimated probability of hiring success — current versus the 85% best-practice ceiling
- A stage-by-stage radar of your hiring process across Define, Assess, Decide, and Land
- All 12 practices scored, with each one's exact contribution to your probability and the points it leaves on the table
- Your top three opportunities, ranked by probability points available, each with a specific "Try this" action
- A downloadable PDF report and an emailed copy you can share with your leadership team
Ready to see your number?
24 questions, about 8 minutes. Your full report is immediate.