I’ve spent a lot of time in rooms with teams that aren’t working. Sometimes it’s a leadership team where decisions don’t stick. Sometimes it’s a project team where the same issues keep resurfacing in different forms. And almost always, by the time someone calls me, an investment has already been made — usually in the leader. Coaching, a 360, a leadership program. Often more than one.
In my work with organizations, I’ve often found that these investments don’t lead to meaningful improvements in team effectiveness or culture. It’s not because the training is poorly designed. The problem is more fundamental: We keep trying to improve teams by working on the people who lead them, while treating the team itself as something downstream that will improve as a result. Research has been telling us for some time that this isn’t quite how it works.
Why Most Workplace Measurement Misses the Team
Most organizations have no shortage of measurement: engagement surveys, pulse checks, manager 360s, performance scorecards, culture dashboards. And yet when a team is genuinely struggling, none of it gives us much insight into what’s really going wrong.
That’s because the tools we rely on weren’t built to measure team effectiveness.
Engagement surveys measure how individuals feel about the organization. A team can score well on engagement and still function poorly in how it makes decisions or handles disagreement. Manager 360s measure individual leader behaviors, which is useful when you’re developing the leader, but tells you very little about how the team around them operates. Performance metrics tell you what got delivered, but not why — and two teams with the same output can have very different levels of connection, support and learning.
When the unit we want to improve is the team, but every tool we use is built to assess individuals, it’s no surprise that team development conversations often stay vague. Organizations invest heavily in team development (team days, strategy days, social events), but rarely measure team effectiveness in a structured or meaningful way.
The PLUS Model: 4 Factors That Distinguish High-Performing Teams
Some of the most useful work in this space has come from researchers like Richard Hackman and Ruth Wageman, Eduardo Salas, and Amy Edmondson, who have spent their careers asking the same question from different angles: What do high-performing teams do that other teams don’t?
The answers are more consistent than I expected when I first started reading the research seriously. Working in partnership with the University of Newcastle, my colleagues and I synthesized this evidence base, plus learnings from our own decades-long experience working with teams, into about 200 items. We surveyed 500 working professionals, and factor analysis led us to a four-factor model we call PLUS — Purpose, Learning, Unity, and Shared leadership. Each factor reflects a behaviorally grounded construct that predicts team effectiveness. More importantly, each one is observable, measurable and developable.
Purpose is the first. Not the kind that gets printed on a wall, but the kind team members can articulate in similar words to each other on a Tuesday afternoon. When I ask members of a team why they exist and get five different answers, that tells me more than most of their key performance indicators (KPIs). Research on shared mental models, particularly by Salas and colleagues, has consistently linked alignment on purpose to better coordination and performance under pressure.
Learning is the second — the willingness and ability to share what’s working, surface what isn’t, and adapt together. In our validation research, this cluster of behaviors emerged as the single strongest predictor of team effectiveness across the sample. It echoes what Edmondson and others have been writing about for two decades, going back to her 1999 study of work teams in a manufacturing setting that established the link between psychological safety, learning behavior and team performance.
Unity is the third: what I’d describe as constructive unity. It includes psychological safety, but it’s broader than that. It’s the ability to disagree productively, give and receive direct feedback, and be honest about how things are actually going. Google’s Project Aristotle reached a similar conclusion, identifying psychological safety as the strongest differentiator between its highest- and lowest-performing teams.
Shared leadership is the fourth. Teams that distribute decision-making and accountability across members — rather than concentrating both in the manager — tend to be more agile, resilient and sustainable. This is uncomfortable news for organizations that have invested heavily in leader-centric development, but it’s where the evidence points. Meta-analytic research by Wang, Waldman and Zhang found shared leadership to be a strong predictor of team performance, particularly in complex and interdependent work.
Measuring PLUS in Practice
This is where the practical work begins. Measuring team effectiveness well requires a different approach from the engagement and 360 surveys most L&D leaders are used to. Here are a few principles to consider:
The unit of analysis is the team, not the individual. A team-level diagnostic asks every member to rate the team’s behaviors rather than their own. Results are reported at the team level, typically as averages and ranges across the dimensions, never individually. This shifts the conversation from “who is doing what well or poorly” to “what patterns are showing up in how we work together?”
Stakeholders should be included where possible. Teams don’t operate in isolation, and the gap between how a team sees itself and how customers, partners or sponsors experience it often reveals the most useful insights. Multi-source feedback surfaces blind spots that internal-only diagnostics often miss.
The data has to belong to the team. The moment a team thinks a diagnostic is being used to evaluate their manager or feed into performance reviews, candor disappears. But when the team owns the data and is responsible for interpreting it, something different happens. People get curious. They challenge the results. They start having conversations they should have had much earlier.
Whatever instrument you use to diagnose team effectiveness, the question to ask is simple: Does it measure team behavior in a way the team can meaningfully engage with, and does it include feedback from people outside the team as well as within it?
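To make the first two principles concrete, here is a minimal sketch in Python of what team-level reporting could look like. Everything in it is an assumption for illustration: the 1-to-5 scale, the made-up ratings and the function names are hypothetical, not part of any PLUS instrument. It reports each factor as an average and a range, never by individual rater, and then compares the team’s view with its stakeholders’ view.

```python
# Hypothetical sketch: the scale, the ratings and the names are illustrative only.
FACTORS = ("purpose", "learning", "unity", "shared_leadership")

# Every rater scores the team's behavior (1-5), not their own.
team_ratings = [
    {"purpose": 4, "learning": 3, "unity": 4, "shared_leadership": 2},
    {"purpose": 5, "learning": 3, "unity": 3, "shared_leadership": 3},
    {"purpose": 4, "learning": 2, "unity": 4, "shared_leadership": 2},
    {"purpose": 3, "learning": 4, "unity": 5, "shared_leadership": 3},
]
stakeholder_ratings = [
    {"purpose": 4, "learning": 2, "unity": 3, "shared_leadership": 2},
    {"purpose": 3, "learning": 2, "unity": 4, "shared_leadership": 3},
]


def summarize(ratings):
    """Return the average and range per factor; no individual rater is identified."""
    summary = {}
    for factor in FACTORS:
        scores = [rating[factor] for rating in ratings]
        summary[factor] = {
            "average": round(sum(scores) / len(scores), 2),
            "low": min(scores),
            "high": max(scores),
        }
    return summary


def perception_gaps(team_summary, stakeholder_summary):
    """Team average minus stakeholder average; positive means the team rates itself higher."""
    return {
        factor: round(
            team_summary[factor]["average"] - stakeholder_summary[factor]["average"], 2
        )
        for factor in FACTORS
    }


team_summary = summarize(team_ratings)
stakeholder_summary = summarize(stakeholder_ratings)
gaps = perception_gaps(team_summary, stakeholder_summary)

for factor in FACTORS:
    stats = team_summary[factor]
    print(factor, stats["average"], (stats["low"], stats["high"]), gaps[factor])
```

A large positive gap on a factor is the kind of pattern worth bringing to the team as a question rather than a verdict. In practice you would also want a minimum number of raters before reporting anything, so that a range cannot be traced back to one person.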
Before and After: Using Diagnostics to Evaluate Team Development
One of the most underused applications of team measurement is also one of the most valuable: measuring a team before an intervention and then measuring again after it.
Most team development programs are evaluated, if they are evaluated at all, through participant satisfaction and self-reported learning. That information is useful but limited. It tells us how people felt about the experience, not whether the team’s behavior has changed.
A pre- and post-diagnostic changes that. Measure PLUS (or another evidence-based framework) before the intervention, design training around the team’s weakest areas, and then measure again six to nine months later. The data then reveals what changed, what didn’t and where the team still needs to focus.
Two things tend to happen when L&D teams adopt this approach. First, the program design becomes more targeted. There is little value in running a generic team workshop when the real issue is clearly weak shared leadership or a lack of purpose. Second, the conversation with sponsors becomes much stronger. Instead of asking them to trust the process, you can show them what changed.
This is also where team development fits more credibly into the broader L&D evaluation conversation. Kirkpatrick’s levels have always been difficult to apply to team-level outcomes. A validated team diagnostic gives us a more direct way to measure behavior change where it happens: in the team.
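The before-and-after comparison described in this section can be sketched in a few lines of Python. The scores below are invented team-level averages on an assumed 1-to-5 scale, and the function names are hypothetical. The sketch picks out the weakest factor to design around and then reports the change per factor at the second measurement.

```python
# Hypothetical sketch: scores are made-up team-level averages on a 1-5 scale.
FACTORS = ("purpose", "learning", "unity", "shared_leadership")

before = {"purpose": 3.8, "learning": 2.9, "unity": 3.5, "shared_leadership": 2.4}
after = {"purpose": 4.0, "learning": 3.6, "unity": 3.4, "shared_leadership": 3.1}


def weakest_factor(scores):
    """Return the factor with the lowest team-level average."""
    return min(scores, key=scores.get)


def change_by_factor(before_scores, after_scores):
    """Return the post-score minus the pre-score for each factor."""
    return {
        factor: round(after_scores[factor] - before_scores[factor], 2)
        for factor in FACTORS
    }


focus = weakest_factor(before)          # where to target the intervention
changes = change_by_factor(before, after)
next_focus = weakest_factor(after)      # where the team still needs to work

print("Design around:", focus)
for factor in FACTORS:
    print(factor, changes[factor])
print("Next focus:", next_focus)
```

Raw differences like these show direction, not proof. With a small team, a shift of a few tenths can come from one person answering differently, so the numbers are best treated as a prompt for the team’s own interpretation.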
What Makes Measurement Translate Into Change
Measurement alone doesn’t change anything. I’ve seen plenty of teams receive a beautifully produced diagnostic report and continue working exactly as they did before. Here are a few tips that make the difference between insight and actual change:
Specificity matters. “We need to work on communication” is too vague to act on. “Our stakeholders consistently rate us lower than we rate ourselves on raising issues early” gives a team something concrete to address.
Ownership matters more. Teams that interpret and act on their own data — with support, but without being told what to do — engage more deeply and sustain change longer.
And it has to be repeated. Team effectiveness isn’t a project you finish. The teams that improve are the ones that return to the same questions every six to 12 months, look at what has changed and decide what to work on next.
A Reframe for L&D
I’m not suggesting we stop developing leaders. Leaders still need to grow, and strong leadership still matters enormously. But we’ve been quietly assuming that team performance is mostly a reflection of leader capability, and the evidence doesn’t support that as strongly as many of us would like.
A meaningful share of the performance leaders are held accountable for sits in the collective behavior of the team around them. If we’re not measuring or developing that directly, we’re leaving one of the biggest performance levers untouched.
Team development, taken seriously, is its own discipline — with its own research base, its own tools and its own design principles. For those of us in L&D, the next step is to treat it that way and give team development the same rigor we’ve long given individual development.
The good news is that when teams are given the chance to take ownership of how they work together, they usually do.

