Things I probably should have known...



Agentic Lamarckian Evolution - Teaching the AI Ant Colony to Evolve.
Photo by Егор Камелев on Unsplash. Recently, it was suggested that I read a book called Children of Time by Adrian Tchaikovsky. It was an excellent read, and it turned out to be a very timely suggestion, as my big plan for this weekend was to upgrade my AI ants from a colony that works together to produce complex outcomes to a colony that learns from experience so that it drives better complex outcomes over time. In this vein, I've been thinking a lot about how to have the
Cynthia Unwin
Mar 21 · 9 min read


Why AIOps Needs to Think Like an Ant Colony
Photo by Thomas T on Unsplash. We, the SREs, live in a world of black swans and thundering herds. Cascading failures and unknown unknowns are our territory. In this world, orchestration is fragile and doesn't scale. We don't just need resilience; we need adaptability. Adaptability does not come from control; it comes from the ability to assess and react in novel situations, and that is the realm of choreography. For us to lead in the AIOps space we need to cede the control that
Cynthia Unwin
Mar 7 · 6 min read


What I learned building an agentic ant colony.
Photo by Jorge Coromina on Unsplash. Over the past week or so I have been working on coding an agent ant colony that restores service to a running application. The agents don't do complex RCA, log tickets, interact with engineers, etc. They just keep the application up and running. It was definitely fun, and I learned some interesting and useful things. What I built: A simple Python-based web ordering application with a front end supported by two micro-services each with
Cynthia Unwin
Jan 24 · 7 min read


Agents: How do we know they work?
Photo by Dean Pugh on Unsplash. Agentic platforms are everywhere, and we are pushing forward to use more and more AI-driven software. As Site Reliability Engineers, we need to really think about what it means to run diverse agent platforms at scale. We need to think about what needs to be in place to make them manageable. How do we know right now that our agents are working? What do we need to see in the logs to troubleshoot when they don't? What data needs to be gathered a
Cynthia Unwin
Jan 15 · 7 min read
