Ants as Agents
- Cynthia Unwin

- 60 minutes ago

Today I learned a new word. Stigmergy. It's a good word. Stigmergy is a form of indirect communication where agents coordinate their actions by modifying their shared environment, leaving traces (like pheromones or digital markers) that influence the subsequent behavior of other agents, creating complex, self-organized systems without central control.
I learned this word while reading about ants. I was reading about ants because I wanted to know what the complex behavior of ant colonies could tell us about building agents: not necessarily agents that manage a specific business process, but agents tasked with maintaining a specific state within a complex system. I've been wondering whether, when we think about AIOps and the architectures that will make us successful, we need less orchestration and more choreography.
I thought of ants because ants, like AIOps agents, exist in a dynamic environment where they can only have knowledge of local state. They are highly attuned to the information they need to perform their functions, and they are focused and persistent in working towards the colony's overall goals.
Ant colonies accomplish spectacular engineering feats: they construct elaborate underground cities with climate-controlled chambers, ventilation shafts, and waste management systems; they build living bridges from their own bodies to span gaps, dynamically reinforcing weak points as traffic patterns shift; they wage coordinated military campaigns with scout units, pheromone-based intelligence networks, and tactical flanking maneuvers; they practice agriculture by cultivating fungus gardens, managing crop diseases, and maintaining antibiotic-producing bacteria to protect their yields; some species even herd aphids like livestock, "milking" them for honeydew and relocating their herds to fresh grazing areas. All of this complexity emerges from millions of tiny individuals, none of whom understand the colony-level goals they work together to achieve. Their intelligence doesn't reside in the individual but in the system itself.
By software architecture standards, ants follow a choreography pattern. They have no central orchestrator; they coordinate their actions through stigmergy. Stigmergy is more efficient than direct messaging for three reasons: no resources are consumed and no lag is incurred in managing or updating a central coordinator; important information is persisted in the environment (as pheromones), where outdated signals naturally degrade over time; and multiple ants can read the same signal simultaneously without contention. Because ant-to-ant communication happens largely through markers left in the environment, successful paths are reinforced with more pheromone, while pheromone evaporates from unsuccessful paths that no longer attract traffic.
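To make the deposit-and-evaporate mechanism concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `PheromoneField` class, the path names, the evaporation rate, and the cutoff below which a trail is forgotten are all hypothetical, and real ant pheromone dynamics are far richer than this.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PheromoneField:
    """Hypothetical shared environment: path name -> pheromone strength."""

    evaporation_rate: float = 0.5   # assumed fraction lost on every tick
    forget_below: float = 0.01      # assumed cutoff for dropping a trail
    strengths: dict[str, float] = field(default_factory=dict)

    def deposit(self, path: str, amount: float = 1.0) -> None:
        """An ant that succeeded on `path` reinforces its trail."""
        self.strengths[path] = self.strengths.get(path, 0.0) + amount

    def tick(self) -> None:
        """Evaporate every trail; trails nobody reinforces eventually vanish."""
        for path in list(self.strengths):
            self.strengths[path] *= 1.0 - self.evaporation_rate
            if self.strengths[path] < self.forget_below:
                del self.strengths[path]

    def strongest(self) -> str | None:
        """What a newly arriving ant would most likely follow."""
        return max(self.strengths, key=self.strengths.get, default=None)


trails = PheromoneField()
trails.deposit("short_path")
trails.deposit("long_path")

# Only the short path keeps being reinforced by returning ants.
for _ in range(8):
    trails.deposit("short_path")
    trails.tick()

print(trails.strengths)    # {'short_path': 1.0}
print(trails.strongest())  # short_path
```

No ant ever tells another which path is better. The unused trail disappears simply because nobody renews it, which is the "outdated signals degrade" property described above.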
Ant colonies solve complex problems, but individual ants simply respond to local conditions without needing to understand the global objective. This has three consequences: the colony adapts to changing conditions without re-planning, individual ant failures have minimal impact on the overall system, and the colony can scale by adding more ants.
Ants don't assign or accept tasks. Instead, different ants have different response thresholds for various stimuli: some respond strongly to food scarcity, others to brood care needs, and so on. The ant workforce self-organizes based on the stimuli in the environment. (Ants also respond with some randomness, which I think is a very interesting point that we will discuss at a different time.) Most ant species are actually monomorphic (another great word; it means all the worker ants are fundamentally the same physically), and roles can and do change based on colony needs.
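A rough sketch of how response thresholds could translate into agent code follows. It uses one formulation commonly seen in the swarm intelligence literature, where the probability of responding is s^n / (s^n + θ^n) for stimulus s and threshold θ. The `Agent` class, the task names, and all the numbers are hypothetical choices for illustration, not something taken from a real system.

```python
from __future__ import annotations

import random


def response_probability(stimulus: float, threshold: float, steepness: float = 2.0) -> float:
    """Chance of responding: low when stimulus << threshold, high when stimulus >> threshold."""
    if stimulus <= 0:
        return 0.0
    s = stimulus ** steepness
    return s / (s + threshold ** steepness)


class Agent:
    """Hypothetical agent: identical code, different per-task thresholds."""

    def __init__(self, name: str, thresholds: dict[str, float]) -> None:
        self.name = name
        self.thresholds = thresholds

    def choose_task(self, stimuli: dict[str, float], rng: random.Random) -> str | None:
        """Pick the most pressing task this agent feels moved to respond to, if any."""
        willing = [
            task
            for task, level in stimuli.items()
            # Unknown tasks get an infinite threshold, i.e. zero response probability.
            if rng.random() < response_probability(level, self.thresholds.get(task, float("inf")))
        ]
        return max(willing, key=stimuli.get, default=None)


rng = random.Random(42)  # seeded so a run is repeatable
agents = [
    Agent("a1", {"disk_pressure": 1.0, "error_rate": 8.0}),  # sensitive to disk pressure
    Agent("a2", {"disk_pressure": 8.0, "error_rate": 1.0}),  # sensitive to error rate
]
stimuli = {"disk_pressure": 3.0, "error_rate": 3.0}  # assumed current signal levels
for agent in agents:
    print(agent.name, "->", agent.choose_task(stimuli, rng))
```

Nobody assigns work here. Each agent reads the same stimuli and decides for itself, and the random draw means an agent with a high threshold will still occasionally pick up a task, which keeps the pool from being rigidly partitioned.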
For AIOps, I find the ant colony model exceedingly interesting. Ant-based agents don't need to be brilliant; they just need to respond appropriately to events. Ant-based agents can adapt to unforeseen changes in the environment and communicate information without complex messaging systems. The ant-based model can tolerate localized failure and remain remarkably resilient.
At a high level, the practical implications of this for building AIOps agent systems are that we should be:
- Focusing on event-driven architectures where agents respond to observability signals (metrics, logs, traces) deposited in shared systems, rather than to orchestrated workflows.
- Using federated agent pools where agents (anomaly detectors, root cause analyzers, remediation agents) self-organize around incidents based on signal strength.
- Using shared communication channels that carry pheromone-like messages, where validation by multiple agents strengthens a signal and lack of verification allows it to fade away.
- Working towards emergence-based problem solving, where agents follow local policies concerned with threshold violations, correlation patterns, and remediation heuristics that allow the system to self-manage complex situations.
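As a sketch of what such a shared channel might look like, the snippet below keeps a board of incident signals that get stronger when distinct agents confirm them and fade on a half-life when nobody does. The `SignalBoard` class, the agent and incident names, the five-minute half-life, and the action threshold are all assumptions made up for this example. A real system would more likely sit on top of an existing event store or metrics backend.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Signal:
    strength: float = 0.0
    updated_at: float = 0.0
    confirmed_by: set[str] = field(default_factory=set)


class SignalBoard:
    """Hypothetical shared channel: incident key -> decaying, confirmable signal."""

    def __init__(self, half_life_s: float = 300.0) -> None:
        self.half_life_s = half_life_s  # assumed: strength halves every 5 minutes
        self._signals: dict[str, Signal] = {}

    def _decayed(self, sig: Signal, now: float) -> float:
        elapsed = max(0.0, now - sig.updated_at)
        return sig.strength * 0.5 ** (elapsed / self.half_life_s)

    def confirm(self, key: str, agent_id: str, now: float) -> None:
        """An agent independently observed `key`; each distinct agent adds 1.0 once."""
        sig = self._signals.setdefault(key, Signal(updated_at=now))
        sig.strength = self._decayed(sig, now)
        sig.updated_at = now
        if agent_id not in sig.confirmed_by:  # simplification: no repeat votes
            sig.confirmed_by.add(agent_id)
            sig.strength += 1.0

    def strength(self, key: str, now: float) -> float:
        sig = self._signals.get(key)
        return self._decayed(sig, now) if sig else 0.0


ACT_THRESHOLD = 1.5  # assumed: remediation agents only react above this

board = SignalBoard()
board.confirm("db-latency", "anomaly-detector", now=0.0)
board.confirm("db-latency", "log-correlator", now=0.0)   # second, independent validation
board.confirm("cache-misses", "anomaly-detector", now=0.0)

for key in ("db-latency", "cache-misses"):
    level = board.strength(key, now=0.0)
    print(key, level, "act" if level >= ACT_THRESHOLD else "ignore")

# Ten minutes later, with no fresh confirmations, both signals have faded.
print(board.strength("db-latency", now=600.0))  # 0.5
```

Timestamps are passed in explicitly so the behavior is easy to reason about and test. The design choice worth noting is that a remediation agent never needs to know which agents raised the signal, only how strong it currently is.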
The sophistication we need in AIOps may not come from building smarter individual agents or more complex orchestration. It could very well emerge from creating the right environment for simple agents, using task-specific context and models, to coordinate effectively. Perhaps we need to start building a colony.



