High Volatility and Action Bias Distinguish LLMs from Humans in Group Coordination

Abstract

Humans exhibit remarkable abilities to coordinate in groups. As large language models (LLMs) become more capable, it remains an open question whether they can demonstrate comparable adaptive coordination and whether they use the same strategies as humans. To investigate this, we compare LLM and human performance on a common-interest game with imperfect monitoring: Group Binary Search. In this n-player game, participants need to coordinate their actions to achieve a common objective. Players independently submit numerical values in an effort to collectively sum to a randomly assigned target number. Without direct communication, they rely on group feedback to iteratively adjust their submissions until they reach the target number. Our findings show that, unlike humans who adapt and stabilize their behavior over time, LLMs often fail to improve across games and exhibit excessive switching, which impairs group convergence. Moreover, richer feedback (e.g., numerical error magnitude) benefits humans substantially but has small effects on LLMs. Taken together, by grounding the analysis in human baselines and mechanism-level metrics, including reactivity scaling, switching dynamics, and learning across games, we point to differences between human and LLM groups and provide a behaviorally grounded diagnostic for closing the coordination gap.

Methods

Group Binary Search (GBS) Task

We evaluated coordination abilities using the Group Binary Search task, a common-interest game with imperfect monitoring. In GBS, players must coordinate their actions to collectively reach a target number without direct communication.

Schematic of a GBS game with three players and numerical feedback. The sum of guesses from each player is compared to the mystery number, and the players are provided feedback about the difference between the sum of their guesses and the mystery number. The players can then adjust their answer (without communicating with each other or knowing the guesses by other players), and the game continues until the sum of guesses matches the mystery number or until the limit of 15 rounds is reached.

Game Structure

Objective: Group sum of all players' guesses must equal a mystery target number (51-100)
Rounds: Maximum 15 rounds per game, 10 games per session
Information: Players only know the group sum and feedback—not individual guesses
Communication: No direct communication between players
Individual range: Each player submits numbers between 0-50
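The rules above can be sketched as a small simulation. This is a minimal illustration, not the experimental code: the function names, the two toy policies, and the choice to pass the signed error to every player (as under numerical feedback) are all assumptions made for the example.

```python
from typing import Callable, List, Optional

# Hypothetical policy signature: (own_guess, signed_error, n_players) -> new guess.
Policy = Callable[[int, int, int], int]


def share_correction(guess: int, error: int, n_players: int) -> int:
    """Toy policy: each player takes an equal share of the group error."""
    return guess - round(error / n_players)


def full_correction(guess: int, error: int, n_players: int) -> int:
    """Toy policy: each player tries to fix the whole error alone."""
    return guess - error


def play_gbs(initial: List[int], target: int, policy: Policy,
             max_rounds: int = 15) -> Optional[int]:
    """Return the round in which the group sum hits the target, else None."""
    guesses = list(initial)
    for round_no in range(1, max_rounds + 1):
        error = sum(guesses) - target  # positive means "too high"
        if error == 0:
            return round_no
        # Every player adjusts independently; guesses stay within 0-50.
        guesses = [min(50, max(0, policy(g, error, len(guesses))))
                   for g in guesses]
    return None
```

With two players starting at 20 each and a target of 60, `share_correction` reaches the target in round 2, whereas `full_correction` makes the group sum bounce between 40 and 80 until the 15-round limit is reached.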

Group Sizes Tested

Small groups: 2-3 players
Medium groups: 4-7 players
Large groups: 10-17 players

Models Evaluated Against Humans

DeepSeek-V3 (671B parameters)
DeepSeek-V3.1-T (685B parameters)
Llama 3.3 (70B parameters)
Gemini 2.0 Flash

Feedback Types

Directional Feedback

Simple qualitative information

Examples:

"Too high" - Group sum exceeds target
"Too low" - Group sum below target
"Just right" - Group sum equals target (game ends)

Players must infer the magnitude of error and adjust accordingly.

Numerical Feedback

Precise quantitative information

Examples:

"Too high by 12" - Group sum is 12 above target
"Too low by 8" - Group sum is 8 below target
"Just right" - Group sum equals target (game ends)

Players know exactly how much to adjust collectively.
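The two feedback types can be expressed as one small function. This is a sketch that reproduces the example strings above; the function name and the `kind` parameter are hypothetical.

```python
def feedback(group_sum: int, target: int, kind: str) -> str:
    """Build the feedback message shown to every player after a round."""
    if kind not in ("directional", "numerical"):
        raise ValueError(f"unknown feedback kind: {kind!r}")
    diff = group_sum - target
    if diff == 0:
        return "Just right"
    direction = "Too high" if diff > 0 else "Too low"
    if kind == "directional":
        # Only the sign of the error is revealed.
        return direction
    # Numerical feedback also reveals the size of the error.
    return f"{direction} by {abs(diff)}"
```

For instance, `feedback(72, 60, "numerical")` gives "Too high by 12", while the same round under directional feedback gives only "Too high".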

Experimental Design

Session Structure

10 games per session with alternating feedback types
Same mystery numbers across human and LLM experiments for fair comparison
Context preservation: LLMs maintain full history of all previous rounds and games

Experimental Conditions

Zero-shot Prompts
Zero-shot Chain-of-Thought (CoT) Prompts
Mixed LLM groups: Combinations of different LLM models in the same game

Switching Behavior Across Group Sizes

Humans improve performance across games and reduce switching as they approach targets. LLMs typically maintain high switching rates throughout, reflecting an action bias.

Figure panels: A. Small Group (Zero-Shot); B. Medium Group (Zero-Shot); C. Large Group (Zero-Shot); D. Small Group (ZS-CoT); E. Medium Group (ZS-CoT); F. Large Group (ZS-CoT).

Average proportion of players switching their guess from the previous round as the group approaches the end of the game. Humans reduce switching across rounds as they approach the target (especially in larger groups), while LLMs maintain high switching rates throughout.
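The switching measure in this caption can be computed per round transition as follows. This is a minimal sketch; it assumes a game is stored as a list of rounds, each a list of guesses in a fixed player order, which is a hypothetical data layout.

```python
from typing import List


def switch_proportions(rounds: List[List[int]]) -> List[float]:
    """Proportion of players whose guess differs from the previous round."""
    proportions = []
    for prev, curr in zip(rounds, rounds[1:]):
        switched = sum(1 for a, b in zip(prev, curr) if a != b)
        proportions.append(switched / len(curr))
    return proportions
```

A game with r rounds yields r - 1 values; aligning them to the end of the game and averaging over games would give a curve of the kind described above.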

Reaction to Feedback

Humans tend to under-react to feedback, providing stability. LLMs often over-react, causing oscillations around targets. Richer numerical feedback benefits humans substantially but has small effects on LLMs.

Figure panels: group reaction to directional feedback; group reaction to numerical feedback.

Each dot denotes the aggregate adjustment made by a group after the previous round's feedback. The dotted line indicates the optimal collective correction, and the solid line shows the fitted group reaction.
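One way to summarize group reactivity is the slope of the aggregate adjustment against the optimal correction. The page does not state how the fitted line is obtained, so the sketch below assumes a least-squares fit through the origin; under that assumption a slope of 1 is the optimal correction, below 1 is under-reaction, and above 1 is over-reaction.

```python
from typing import Sequence


def reaction_slope(errors: Sequence[float],
                   adjustments: Sequence[float]) -> float:
    """Least-squares slope (through the origin) of adjustment vs. optimal correction.

    errors[i]      : signed group error before the adjustment (sum - target)
    adjustments[i] : change in the group sum made in the following round
    The optimal correction for an error e is -e.
    """
    numerator = sum(-e * a for e, a in zip(errors, adjustments))
    denominator = sum(e * e for e in errors)
    if denominator == 0:
        raise ValueError("all errors are zero; slope is undefined")
    return numerator / denominator
```

A group that closes half of each error has slope 0.5, and a group that overshoots by the full error each time has slope 2.0.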

Learning Across Games

Humans learn and improve across consecutive games, while LLMs often show no adaptation or even degradation.

Average number of rounds needed to finish the game across successive games, for each condition.
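Learning across games can be summarized by the trend in rounds-to-solution over the game index. The sketch below uses an ordinary least-squares slope, which is an assumption rather than the analysis reported here; how unsolved games are scored is likewise assumed (for example, counting them as the 15-round limit).

```python
from typing import Sequence


def learning_slope(rounds_per_game: Sequence[float]) -> float:
    """OLS slope of rounds-to-solution over game index (1, 2, ..., n).

    A negative slope means later games finish in fewer rounds.
    """
    n = len(rounds_per_game)
    if n < 2:
        raise ValueError("need at least two games to estimate a trend")
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_y = sum(rounds_per_game) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, rounds_per_game))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx
```

A sequence such as 10, 8, 6, 4 rounds gives a slope of -2 rounds per game, while a group that always hits the round limit gives a slope of 0.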

Coordination Signatures

Performance degrades with group size for both humans and LLMs, but humans partially mitigate this through role differentiation. Human conditions occupy a high-stability, high-dispersion region, while most LLM conditions cluster in a lower-stability, lower-dispersion regime.

Figure panels: A. Coordination signature (directional feedback); B. Coordination signature (numerical feedback).

Switching Profiles and Extremes

Human players sometimes keep the same guess throughout an entire game, effectively reducing the coordination burden. The proportion of such fully stable players increases with group size. No LLM agent ever does this.

Proportion of players at extreme stay probabilities by group size. Human participants sometimes kept the same guess throughout an entire game (stay probability = 1), with the proportion increasing with group size (0.04 in small, 0.12 in medium, 0.19 in large groups). No LLM agent ever maintained a fixed guess. Conversely, LLM groups were much more likely to contain players who changed their guess on every round (stay probability = 0).

Stay profiles for selected numerical sessions. In human examples, switch rates are much lower and variability across players is larger, with some players choosing to keep their number throughout each game, especially in larger groups. LLM players show uniformly high switching with little inter-player variability.
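Stay probabilities and the two extremes can be computed per player as sketched below. The definition used, the fraction of round transitions in which a player repeats the previous guess, is inferred from the captions above, and the function names and data layout are hypothetical.

```python
from typing import Dict, List


def stay_probabilities(rounds: List[List[int]]) -> List[float]:
    """Per-player fraction of round transitions with an unchanged guess."""
    n_players = len(rounds[0])
    transitions = len(rounds) - 1
    if transitions < 1:
        raise ValueError("need at least two rounds")
    stays = [0] * n_players
    for prev, curr in zip(rounds, rounds[1:]):
        for i, (a, b) in enumerate(zip(prev, curr)):
            if a == b:
                stays[i] += 1
    return [s / transitions for s in stays]


def extreme_shares(probs: List[float]) -> Dict[str, float]:
    """Share of players who never switch (1.0) or always switch (0.0)."""
    n = len(probs)
    return {
        "always_stay": sum(1 for p in probs if p == 1.0) / n,
        "always_switch": sum(1 for p in probs if p == 0.0) / n,
    }
```

In a three-round game where the first player never moves, the second moves once, and the third moves every round, the stay probabilities are 1.0, 0.5, and 0.0.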

Gameplay Examples

Each panel shows every player's responses in each round across five consecutive games. The solid horizontal line indicates the mystery number, the other solid line indicates the group sum, and the dashed lines represent the decisions of the individual agents in the group. LLM groups often overreact to feedback signals, leading to prolonged oscillations around the target number, whereas human groups show calibrated, progressively refined adjustments leading to rapid convergence.

3-player gameplay with numerical feedback

Decision Distributions

Histograms below summarize the distribution of raw player guesses (0-50) across conditions. They provide a complementary view of coordination behavior across prompt styles (Zero-Shot vs. Zero-Shot CoT) and feedback types (Directional vs. Numerical), alongside switching and feedback-reactivity analyses.

Distribution of player guesses under Zero-Shot prompting with directional feedback (all groups).

Switch Magnitude Distributions

Histograms below show the absolute change in a player's guess between consecutive rounds. Human players show a strong peak at zero change (frequent stay decisions), whereas LLM behavior is more spread across non-zero switch magnitudes, reflecting persistent reactivity and slower stabilization.

Distribution of round-to-round switch magnitudes under Zero-Shot prompting with directional feedback (all groups).
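The switch-magnitude histogram is a count of absolute guess changes between consecutive rounds. A minimal sketch, assuming the same hypothetical list-of-rounds layout for a game:

```python
from collections import Counter
from typing import List


def switch_magnitudes(rounds: List[List[int]]) -> Counter:
    """Count absolute round-to-round changes in each player's guess."""
    counts: Counter = Counter()
    for prev, curr in zip(rounds, rounds[1:]):
        for a, b in zip(prev, curr):
            counts[abs(b - a)] += 1
    return counts
```

The count at magnitude 0 corresponds to stay decisions, so a strong peak there reflects the stabilizing behavior described for human players.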

BibTeX

@misc{maini2026highvolatilityactionbias,
      title={High Volatility and Action Bias Distinguish LLMs from Humans in Group Coordination}, 
      author={Sahaj Singh Maini and Robert L. Goldstone and Zoran Tiganj},
      year={2026},
      eprint={2604.02578},
      archivePrefix={arXiv},
      primaryClass={cs.MA},
      url={https://arxiv.org/abs/2604.02578}, 
}