Our Methodology

How we built the Team Performance Assessment

A research-grounded diagnostic that helps leaders identify the right problem before trying to fix the wrong one.

The most expensive leadership mistake

Most leaders, when a team is underperforming, jump straight to people. That person isn't strong enough. Those two don't get along. We need better talent.

Sometimes that's true. But more often, it's a misdiagnosis. The real problem is usually structural — unclear goals, ambiguous roles, broken decision-making — and no amount of hiring, firing, or team-building will fix it. You'll replace the people and end up with the same dysfunction, because the system that produced the behavior is still intact.

This is the most expensive mistake a leader can make: solving the wrong problem with confidence.

The Waterline Model

In 1970, organizational psychologist Roger Harrison proposed a deceptively simple idea: organizational problems exist at different depths, and leaders should intervene at a level no deeper than that required to produce enduring solutions.

Harrison showed that as intervention depth increases, so do cost, complexity, and risk. Going deeper than necessary isn't just inefficient — it's destructive. More than fifty years later, Molly Graham built on Harrison's foundation and made it actionable for modern operators. Drawing on two decades at Google, Facebook, the Chan Zuckerberg Initiative, and her work coaching leaders at Stripe, Anthropic, and OpenAI, Graham developed the Waterline Model — a practical framework that translates Harrison's depth principle into four concrete levels a leader can investigate in sequence.

Graham's central insight: blaming people for problems that are actually structural is one of the biggest leadership traps there is. The best leaders aren't better psychologists — they're better designers.

Think of a boat moving across the water toward a goal. When progress stalls, the instinct is to look at the people rowing. The Waterline Model says: look below the surface first.

Operating principle: Snorkel before you scuba. Start at the surface — goals, roles, expectations — where the majority of team problems actually live. Each level you go deeper costs more, takes longer, and carries greater risk of unintended consequences.

The four levels

The assessment examines four levels, ordered from most common (and most fixable) to deepest.

1 · Structure

Goals, roles, expectations, success criteria, and organizational design. Can everyone say what they're working toward, what they personally own, and how success is measured?

2 · Dynamics

How decisions get made, how information flows, how conflict is handled, what behavior gets rewarded. Even with clear goals, teams can be trapped by unstable decisions or bottlenecked information.

3 · Interpersonal

Tension between specific people. Trust issues, unresolved conflict, style clashes. Often caused by broken structure or toxic dynamics — not by the people themselves.

4 · Individual

What's happening inside a single person. Skill gaps, motivation, capacity, personal circumstances. Only diagnose here after the system above them is sound.

Research foundation

The assessment draws on decades of validated organizational research beyond Harrison and Graham. Each construct maps to at least one peer-reviewed body of work.

  • Roger Harrison (1970): Choosing the Depth of Organizational Intervention
    The foundational principle: organizational problems exist at different depths, and effective leaders intervene at the shallowest level that produces enduring results. The Waterline Model's core logic derives directly from Harrison's depth criterion.
  • Molly Graham: How to Debug a Team That Isn't Working
    Graham operationalized Harrison's framework for modern operators, building on two decades of leadership at Google, Facebook, and the Chan Zuckerberg Initiative. Her Waterline Model and the "snorkel before you scuba" principle form the structural backbone of this assessment.
  • Amy Edmondson (1999): Psychological Safety and Learning Behavior in Work Teams
    Edmondson's 7-item scale, developed at Harvard, measures whether team members feel safe raising concerns, admitting mistakes, and asking for help. This directly informed our Dynamics items. When psychological safety is absent, teams optimize for self-protection over progress — exactly the pattern Graham describes.
  • Gallup Q12 — Employee Engagement Research
    The most widely validated employee engagement instrument: 12 items, tested across 17 million employees over 30+ years. The Q12 shaped our design philosophy in two key ways: every item must be behaviorally anchored, and every item must be actionable — a leader should read a low score and know exactly what to change.
  • Google's Project Aristotle — Team Effectiveness Research
    Google's internal research studied 180+ teams to identify the five dynamics of effective teams: psychological safety, dependability, structure and clarity, meaning, and impact. Our four Waterline levels map directly to the constructs that Project Aristotle found to predict team performance.
  • Team Diagnostic Survey (TDS) — Wageman & Hackman, Harvard
    Developed by Ruth Wageman and J. Richard Hackman at Harvard, the TDS is one of the most comprehensive team-level assessment instruments in organizational research. It reinforced our commitment to measuring the system around the person before evaluating the person themselves.
  • Van Sonderen et al. (2013) — Psychometric Research on Reverse-Coded Items
    This research found no evidence that negatively worded questions prevent response bias, but strong evidence that they increase confusion and reduce measurement reliability. This informed our decision to use only 2 reverse-scored items (out of 22), phrased as conceptual opposites rather than negations. Both are recoded before scoring, so higher scores always mean healthier functioning.

How scoring works

Each of the 22 items uses a 6-point Likert scale with no neutral midpoint, forcing a directional response. Two items (S6 on priority alignment, I5 on interpersonal tension) are reverse-scored — for these, agreement signals a problem. Both are recoded before scoring so that higher scores always mean healthier functioning.
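As an illustration, the recoding step can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the assessment's actual scoring code: it assumes responses are coded 1 to 6, that a reverse-scored item is flipped with the standard formula (scale minimum + scale maximum − raw response), and that a level score is the mean of its recoded items. Item IDs other than S6 and I5 are hypothetical.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 6        # assumed numeric coding of the 6-point scale
REVERSE_SCORED = {"S6", "I5"}      # the two reverse-scored items named above


def recode(item_id: str, raw: int) -> int:
    """Return the response oriented so that higher always means healthier."""
    if not SCALE_MIN <= raw <= SCALE_MAX:
        raise ValueError(f"{item_id}: response {raw} is outside {SCALE_MIN}-{SCALE_MAX}")
    if item_id in REVERSE_SCORED:
        # Flip the scale: 1 <-> 6, 2 <-> 5, 3 <-> 4
        return SCALE_MIN + SCALE_MAX - raw
    return raw


def level_score(responses: dict) -> float:
    """Assumed aggregation: mean of the recoded items, rounded to 2 decimals."""
    return round(mean(recode(item, raw) for item, raw in responses.items()), 2)


# Hypothetical Structure responses; S6 is reverse-scored, so a raw 2 becomes 5.
structure = {"S1": 5, "S2": 4, "S3": 5, "S4": 6, "S5": 4, "S6": 2}
print(level_score(structure))
```

Because a 6-point scale has no midpoint, the flip never leaves a response unchanged: strong agreement on a reverse-scored item always becomes a low score.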

Results are read top-down — Structure first, always — reinforcing the Waterline Model's core diagnostic sequence.

  • Strength (≥4.8): This level is working. Maintain it.
  • Attention (3.6 – 4.7): Issues exist. Investigate with open-ended responses.
  • Action Required (<3.6): Significant gaps. Fix this level before going deeper.
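The zone bands and the top-down reading order can be combined into a short Python sketch. It is one possible interpretation rather than the assessment's published logic: the function names are hypothetical, a score between 4.7 and 4.8 is assumed to fall in Attention (the bands above do not say), and the "first priority" rule simply picks the shallowest level that is not yet a Strength.

```python
from typing import Dict, Optional

# Levels in diagnostic order, shallowest first ("snorkel before you scuba")
LEVELS = ["Structure", "Dynamics", "Interpersonal", "Individual"]


def zone(score: float) -> str:
    """Classify a level score using the thresholds listed above."""
    if score >= 4.8:
        return "Strength"
    if score >= 3.6:   # assumption: anything from 3.6 up to (not including) 4.8
        return "Attention"
    return "Action Required"


def first_priority(scores: Dict[str, float]) -> Optional[str]:
    """Return the shallowest level that is not a Strength, or None if all are."""
    for level in LEVELS:
        if zone(scores[level]) != "Strength":
            return level
    return None


# Hypothetical profile: Dynamics scores lowest, but Structure is shallower.
profile = {"Structure": 4.2, "Dynamics": 3.1, "Interpersonal": 5.0, "Individual": 4.9}
for level in LEVELS:
    print(f"{level}: {profile[level]} ({zone(profile[level])})")
print("Start with:", first_priority(profile))
```

In this hypothetical profile the sketch points to Structure even though Dynamics has the lower score, which mirrors the rule that the shallower level is addressed before going deeper.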

What you get

Results are immediate. You see your full profile the moment you complete the assessment.

  • Your four level scores with zone classification — Strength, Attention, or Action Required
  • A radar chart showing your team's shape across all four levels at a glance
  • Strengths clustered by level — what's working and worth protecting
  • Prioritized improvements following the snorkel-to-scuba sequence — Structure first, always
  • Open-ended response captures to add qualitative context to the scores
  • A downloadable PDF report you can use in a planning session or share selectively
Want help interpreting results? Petra Coaching works with leaders navigating exactly these challenges. A 60-minute debrief session can turn scores into a clear intervention plan.

Ready to find out what's actually going on?

Take the assessment in 10 minutes. Get your full profile immediately.