Our Methodology

How we built the
Hiring Practices Assessment

A research-backed model that converts how consistently you run 12 hiring practices into an estimated probability of hiring success.

The most expensive coin flip in your company

Roughly half of hires don't work out. Leadership IQ's study of 20,000 hires found that 46% of new employees fail within 18 months — and 89% of those failures were attitudinal (coachability, emotional intelligence, motivation, temperament), not lack of skill. Studies of executive hiring reported in Harvard Business Review put failure rates for senior roles at 40–60%.

At Series A/B, that miss rate is brutal. Every hire is a meaningful fraction of the team, the burn, and the culture. Yet most startups hire the way they always have: smart people having unstructured conversations and comparing gut feelings.

The good news is that hiring is one of the best-researched decisions in organizational science. A century of selection research — synthesized in Schmidt & Hunter's landmark 1998 review and revised by Sackett and colleagues in 2022 — measured how well each hiring practice actually predicts on-the-job performance. The practices are known. The question is how consistently you run them.

The 12 practices, across four stages

The assessment audits 12 practices grouped into the four stages of a hiring decision. Each practice is measured by two statements about how consistently it happens at your company — from Never (1) to Always (6).

1 · Define

Decide what you are hiring for before you meet a candidate.

  • Success factors defined per role
  • Competencies mapped to the role
  • Structured interview plans

2 · Assess

Gather evidence that predicts the job, not the interview.

  • Behavioral questions over hypotheticals
  • Work samples & skills assessments
  • Interviewer training & calibration

3 · Decide

Combine evidence consistently instead of trading impressions.

  • Anchored rating rubrics
  • Independent scoring & calibrated debriefs
  • Structured reference checks

4 · Land

Convert the decision into an accepted offer and a successful first year.

  • Candidate experience & closing
  • Hiring metrics & feedback loops
  • Onboarding handoff (30/60/90)
An honest distinction: the Define, Assess, and Decide practices carry weight because they raise selection validity — how well your process predicts who will succeed. The Land practices work differently: they protect the outcome you already earned, by making sure your top choice accepts, your process learns, and the first 90 days convert a good decision into a successful hire. They carry smaller weights for exactly that reason.

How the probability model works

The whole model is four rules. Every number in your report comes from them.

1. The baseline is 50%. That's the observed success rate of instinct-driven, unstructured hiring — consistent with Leadership IQ's 46%-fail-within-18-months finding and reported executive-hire failure rates of 40–60%. Throughout, "success" follows the research definition: the hire is performing and retained roughly 18 months in.

2. The ceiling is 85% — never 100%. For a funnel running all 12 practices, we model a composite operational validity of roughly .60–.70. That composite is our modeling assumption, built up from Sackett et al.'s corrected estimates (structured interviews ρ ≈ .42) plus partial incremental validity from work samples, structured references, and mechanical score combination. Using Taylor & Russell's 1939 conversion — with a ~50% base rate and a typical 10–20% selection ratio — that corresponds to roughly 82–87% of hires succeeding. We cap it at 85%. No process eliminates hiring risk, and we won't pretend otherwise.

3. The 35-point gap is split across the 12 practices in proportion to each practice's validity evidence (table below). The weights sum to exactly 35. Overlapping practices are handled in the weights, not double-counted: the interview-structure family — structured plans, behavioral questions, rating rubrics, and independent scoring, 15 points combined — collectively represents the validity gain from unstructured (ρ ≈ .19) to structured (ρ ≈ .42) interviewing. A validity of .19 is a correlation with later job performance, not a success rate — the 50% baseline and 85% ceiling come from outcome studies, never from reading validities as percentages.

4. Your adoption of each practice scales its contribution. Each practice's score is the mean of its two items (1–6). Adoption = (score − 1) ÷ 5, giving a value from 0 to 1. Your estimated probability = 50 + the sum of every practice's weight × adoption. Answer Never to everything and you sit at exactly 50%. Answer Always to everything and you reach exactly 85%.

PracticeStageWeight (pts)Primary research anchor
Structured interview plansDefine5.0Sackett et al. (2022): structured interviews ρ ≈ .42 vs ≈ .19 unstructured — the largest single lever; McDaniel et al. (1994); Huffcutt & Arthur (1994)
Behavioral questions over hypotheticalsAssess4.0Taylor & Small (2002); Janz (1982); Huffcutt et al. (2001)
Work samples & skills assessmentsAssess4.0Sackett et al. (2022): work samples ρ ≈ .33; Roth, Bobko & McFarland (2005)
Success factors defined per roleDefine3.0Campion, Palmer & Campion (1997); Schmidt & Hunter (1998)
Competencies mapped to the roleDefine3.0Wiesner & Cronshaw (1988); Campion et al. (1997)
Anchored rating rubricsDecide3.0Campion et al. (1997); Melchers et al. (2011)
Independent scoring & calibrated debriefsDecide3.0Kuncel et al. (2013): mechanical combination outperforms holistic judgment
Interviewer training & calibrationAssess2.5Huffcutt & Woehr (1999); Campion et al. (1997)
Structured reference checksDecide2.5Sackett et al. (2022): ρ ≈ .23; Schmidt & Hunter (1998)
Onboarding handoff (30/60/90)Land2.0Bauer et al. (2007); Leadership IQ (2005)
Candidate experience & closingLand1.5Hausknecht et al. (2004): applicant reactions shape offer acceptance
Hiring metrics & feedback loopsLand1.5Local validation principle; no direct meta-analytic anchor — its modest weight reflects that
Total35.0

A worked example. Say your company runs every practice about half the time — each practice scores 3.5 out of 6, so adoption is 0.5. Your estimate is 50 + (35 × 0.5) = 67.5, displayed as 68%. Now look at one practice: if your structured-plans items score 2 and 3, that practice's score is 2.5, adoption is 0.3, so it contributes 1.5 of its 5.0 points — leaving 3.5 points on the table. That headroom is what ranks your improvement opportunities.

Displayed probabilities are rounded to whole percentages and per-practice contributions to one decimal, so the parts may not sum exactly to the headline numbers. The displayed gap is always the difference between the two displayed percentages.

What this is — and isn't. This is an estimate built from population-level research, not a measurement of your company. It can't see your talent brand, your market, or your judgment. What it can do is show you, with a century of evidence behind it, which practices raise the odds and how much room you have left on each one.

Research foundation

Every weight in the model is anchored to published selection research. The key sources:

  • Sackett, Zhang, Berry & Lievens (2022) — Revisiting meta-analytic estimates of validity in personnel selection
    The revised meta-analysis of selection method validity, correcting earlier range-restriction assumptions. Establishes structured interviews (ρ ≈ .42) as the strongest single predictor, with work samples (ρ ≈ .33) and structured references (ρ ≈ .23) among the strongest supplements.
  • Schmidt & Hunter (1998) — The validity and utility of selection methods in personnel psychology
    The landmark synthesis of 85 years of selection research, establishing the validity hierarchy of hiring methods and the economic utility of better selection.
  • Taylor & Russell (1939) — The relationship of validity coefficients to the practical effectiveness of tests in selection
    The classic tables converting a predictor's validity, the base rate of success, and the selection ratio into the proportion of selected candidates who succeed — the mechanism behind our 50% baseline and 85% ceiling.
  • Campion, Palmer & Campion (1997) — A review of structure in the selection interview
    Defines the 15 components of interview structure — job-analysis grounding, consistent questions, anchored rating scales, and more — that underpin the Define and Decide practices.
  • Kuncel, Klieger, Connelly & Ones (2013) — Mechanical versus clinical data combination in selection and admissions decisions
    Meta-analysis showing that combining interview ratings mechanically substantially outperforms holistic judgment — the basis for scores-first, discussion-second debriefs.
  • Huffcutt & Woehr (1999); Taylor & Small (2002); Hausknecht, Day & Thomas (2004); Bauer et al. (2007)
    Interviewer training effects on validity; past-behavior versus situational question formats; applicant reactions and offer acceptance; and structured onboarding's effect on new-hire performance and retention.
  • Leadership IQ (2005) — Why new hires fail
    Study of 20,000 hires: 46% fail within 18 months, and 89% of failures are attitudinal — coachability, emotional intelligence, motivation, temperament — not technical skill. The anchor for the 50% baseline.

What you get

Results are immediate. Your full report appears the moment you finish, and a complete copy — every practice, every contribution, your headline numbers — is emailed to you with a permanent link.

  • Your estimated probability of hiring success — current versus the 85% best-practice ceiling
  • A stage-by-stage radar of your hiring process across Define, Assess, Decide, and Land
  • All 12 practices scored, with each one's exact contribution to your probability and the points it leaves on the table
  • Your top three opportunities, ranked by probability points available, each with a specific "Try this" action
  • A downloadable PDF report and an emailed copy you can share with your leadership team

Ready to see
your number?

24 questions, about 8 minutes. Your full report is immediate.