
Open-Ended Discovery · Neural Cellular Automata · Emergent Dynamics
¹FLAIR, University of Oxford · ²ELLIS Institute Tübingen · ³University of Freiburg · ⁴Prior Labs
PBT-NCA is a meta-optimization framework for evolving Petri Dish Neural Cellular Automata under novelty-driven competitive pressure. Instead of collapsing into static order or noise, the system continually discovers new lifelike behaviors—ranging from coordinated motion and scattering to colonization and symbiotic partitioning. By rewarding diversity within and across worlds, PBT-NCA sustains open-ended dynamics at the edge of chaos.
Without any handcrafted targets, PBT-NCA spontaneously evolves a diverse ecosystem of structures—shooters, gliders, amoebas, colonies, and spaceships—each arising from pure multi-agent competition.
Fluid, shape-shifting macro-structures that migrate across the substrate with coordinated, differentiated cell behavior—mirroring motility-induced phase separation.
Directed projectile emission from stable territorial clusters—analogous to glider-gun structures in classical CA literature.
Persistent traveling waves that self-replicate across the grid without a fixed template—a universal marker of open-ended CA systems.
A macroscopic entity (red agent) emits a small cluster of cells across the substrate, executing spatial colonization via a decentralized replicating strategy.
Decentralized foraging-like behavior: agents follow stigmergic-style pheromone trails emerging purely from local NCA gradient dynamics.
Local interactions producing highly structured, periodically replicating entities with internal substructure, reminiscent of the computational richness long associated with cellular automata.
A restless mosaic of lifeforms that pulse, fragment, and reclaim territory in real time, forming an archipelago-like pattern.
This animated figure recreates the composite novelty score plot from the paper: the smoothed population novelty score grows across meta-iterations while representative emergent worlds appear at the moments they enter the evolutionary record.
A meta-optimization loop that transforms standard population-based training into an open-ended regime discovery engine by replacing stationary fitness with novelty-driven selection pressure operating at two timescales.
Each of the P = 30 worlds is rolled out for T_w inner steps. Agents update via gradient-based learning while competing on the shared grid. Trajectories are scored by the dual-novelty fitness.
Top-m behavioral descriptors—species occupancy statistics (μ, σ, δ, entropy, alive-mass change)—are appended to a bounded FIFO archive. Archive novelty is computed as k-NN distance (k = 8) in descriptor space.
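As a rough illustration of the archive step, the sketch below keeps descriptors in a bounded FIFO buffer and scores a new descriptor by its distance to its k nearest archived neighbours. The class name `NoveltyArchive`, the capacity of 512, the Euclidean metric, and the choice to average the k nearest distances are assumptions made for this example; only k = 8 and the FIFO behaviour come from the description above.

```python
from collections import deque

import numpy as np


class NoveltyArchive:
    """Bounded FIFO archive of behavioral descriptors (hypothetical helper)."""

    def __init__(self, capacity=512, k=8):
        # capacity is an assumed value; k = 8 follows the text.
        self.items = deque(maxlen=capacity)  # oldest entries are evicted first
        self.k = k

    def add(self, descriptor):
        """Append a descriptor, evicting the oldest one if the archive is full."""
        self.items.append(np.asarray(descriptor, dtype=float))

    def novelty(self, descriptor):
        """Mean Euclidean distance to the k nearest archived descriptors."""
        if not self.items:
            return 0.0  # assumption: an empty archive yields zero novelty
        query = np.asarray(descriptor, dtype=float)
        archive = np.stack(self.items)
        dists = np.linalg.norm(archive - query, axis=1)
        k = min(self.k, len(dists))
        nearest = np.partition(dists, k - 1)[:k]  # the k smallest distances
        return float(nearest.mean())
```

A descriptor here would be a fixed-length vector of the occupancy statistics listed above; descriptors that land far from everything already archived receive a higher score.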
Each frame is encoded by a frozen DINOv2 encoder. Per-world diversity is the median cosine distance to all other worlds at the same timestep, averaged over the rollout—rewarding novel morphology beyond what handcrafted descriptors capture.
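The per-world diversity term can be sketched as follows, assuming the frame embeddings have already been computed and stacked into an array of shape (T, P, D) for T timesteps, P worlds, and D embedding dimensions. The function name `per_world_diversity` and the array layout are illustrative assumptions, and the encoder itself is not called here.

```python
import numpy as np


def per_world_diversity(embeddings):
    """Median cosine distance to the other worlds, averaged over time.

    embeddings: array of shape (T, P, D) with precomputed frame embeddings
    (assumed layout). Returns an array of shape (P,).
    """
    emb = np.asarray(embeddings, dtype=float)
    T, P, _ = emb.shape
    # Normalise each embedding so the dot product equals cosine similarity.
    norms = np.linalg.norm(emb, axis=-1, keepdims=True)
    unit = emb / np.clip(norms, 1e-12, None)
    cos_dist = 1.0 - unit @ unit.transpose(0, 2, 1)  # shape (T, P, P)
    # Drop each world's distance to itself before taking the median.
    off_diag = ~np.eye(P, dtype=bool)
    others = cos_dist[:, off_diag].reshape(T, P, P - 1)
    return np.median(others, axis=-1).mean(axis=0)
```

Using the median rather than the mean makes the score less sensitive to a single outlier world at a given timestep.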
Every K meta-iterations, the lowest-fitness worlds are replaced by Lamarckian copies of elite parents: weights, optimizer state, and ecological context are inherited, then crossover, mutation, and Gaussian weight perturbation are applied.








Clone weights, optimizer state, and context.
Combine strong lineages during exploit.
Alter hyperparameters and world-level choices.
Apply Gaussian noise to diversify behavior.
Spawn new worlds to replace the discarded ones.
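The exploit and explore steps listed above can be sketched in a few lines. This is a simplified illustration under stated assumptions rather than the project's implementation: each world is a plain dict with a `"weights"` array and an `"hparams"` dict of floats, crossover is a uniform per-weight mask, mutation rescales each hyperparameter by 0.8 or 1.25, and `sigma` is an arbitrary noise scale. All of these names and values are hypothetical.

```python
import copy

import numpy as np


def exploit_and_explore(worlds, fitness, n_replace, rng, sigma=0.01):
    """Replace the n_replace lowest-fitness worlds with perturbed elite copies.

    worlds: list of dicts with "weights" (np.ndarray) and "hparams"
    (dict of floats); this layout is assumed for the example.
    Assumes n_replace <= len(worlds) // 2 so elites and losers do not overlap.
    """
    order = np.argsort(fitness)      # ascending fitness
    losers = order[:n_replace]
    elites = order[-n_replace:]
    for loser in losers:
        pa, pb = rng.choice(elites, size=2, replace=True)
        # Clone: inherit weights, hyperparameters, and any other state.
        child = copy.deepcopy(worlds[pa])
        # Crossover: take roughly half of the weights from a second elite.
        mask = rng.random(child["weights"].shape) < 0.5
        child["weights"] = np.where(mask, child["weights"], worlds[pb]["weights"])
        # Mutation: rescale each hyperparameter (assumed perturbation factors).
        for name in child["hparams"]:
            child["hparams"][name] *= rng.choice([0.8, 1.25])
        # Gaussian weight perturbation to diversify behavior.
        noise = rng.normal(0.0, sigma, child["weights"].shape)
        child["weights"] = child["weights"] + noise
        # Spawn: the child takes the discarded world's slot.
        worlds[loser] = child
    return worlds
```

In a full loop this function would run once every K meta-iterations, with `fitness` holding the dual-novelty scores from the most recent rollouts.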





If you find this work useful, please cite the paper: