| Main Author: | |
|---|---|
| Format: | Digital resource |
| Language: | |
| Published: | Zenodo, 2026 |
| Online Access: | https://doi.org/10.5281/zenodo.18717202 |
Table of Contents:
- We present DigiSoup, a zero-training agent that uses thermodynamic perception and bio-inspired heuristics to play multi-agent social dilemmas on DeepMind's Melting Pot benchmark. DigiSoup uses no neural networks, no reward optimisation, and no training of any kind: actions are selected by a stack of priority rules driven by entropy gradients, temporal growth rates, and spatial memory, implemented in approximately 350 lines of NumPy.

  Despite this simplicity, DigiSoup outperforms DeepMind's trained reinforcement learning baselines in aggregate on Clean Up, a complex public goods dilemma requiring collective action, scoring 22% above ACB and 46% above VMPO across 9 scenarios (30 episodes each, 95% confidence intervals reported). On CU_7, just two DigiSoup focal agents among seven players score 234.00 versus ACB's 120.41 (+94%).

  The key mechanism is a thermodynamic depletion signal: when the entropy growth rate falls to zero or below (dS/dt ≤ 0), the agent infers that the shared resource is depleted and diverts to public goods maintenance, without any reward signal indicating that cleaning is beneficial.

  Code and full results: https://github.com/matthewfearne/digisoup
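The depletion signal described in the abstract can be illustrated with a minimal NumPy sketch. This is not the DigiSoup implementation: the function names (`shannon_entropy`, `entropy_growth_rate`, `choose_mode`), the histogram-based entropy estimate, the bin count, and the mode labels `"harvest"` and `"maintain"` are all assumptions made for illustration. The sketch only shows the general idea of estimating dS/dt from a short entropy history and switching behaviour when it is no longer positive.

```python
import numpy as np


def shannon_entropy(obs, bins=16):
    """Shannon entropy (bits) of the intensity histogram of an observation.

    Assumption: observations are arrays of 0-255 intensities; the real
    agent may measure entropy differently.
    """
    hist, _ = np.histogram(obs, bins=bins, range=(0, 256))
    total = hist.sum()
    if total == 0:
        return 0.0
    p = hist[hist > 0] / total
    # sum of p * log2(1/p) is equivalent to -sum(p * log2(p))
    return float((p * np.log2(1.0 / p)).sum())


def entropy_growth_rate(history):
    """Estimate dS/dt as the mean one-step difference of the entropy history."""
    h = np.asarray(history, dtype=float)
    return float(np.diff(h).mean())


def choose_mode(history):
    """Hypothetical priority rule: maintain the public good once dS/dt <= 0."""
    if len(history) < 2:
        # Not enough samples to estimate a rate; keep the default behaviour.
        return "harvest"
    return "maintain" if entropy_growth_rate(history) <= 0.0 else "harvest"
```

In use, an agent loop would append `shannon_entropy(observation)` to a short rolling history at each step and call `choose_mode(history)` before selecting an action. No reward value appears anywhere in the rule, which matches the abstract's claim that the switch to maintenance is driven by the entropy signal alone.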