When my AI agent hacked my gym, Mythos stopped feeling theoretical

I gave an AI agent permission to book gym classes. It found authorization vulnerabilities in a major gym software provider's platform. That experience made Anthropic's Mythos announcement feel very real.

Andrew Bird
Head of AI, Affinda

I recently built a bot to help me book popular gym classes.

This was not a grand research project. It was a practical little automation. The classes fill up fast, I got tired of playing refresh roulette, and I figured an agent running on Opus 4.6 could handle the annoying part for me.

It did handle the annoying part. Then it kept going.

In the course of trying to book classes, the bot discovered that the gym's software provider exposed a GraphQL API with authorization flaws. Not tiny edge-case flaws, either. It could book classes months outside the intended booking window, before they were supposed to be available. Worse, it could cancel other members' reservations and bump them off the waitlist.

That is a very different outcome from "book me into Pilates on Thursday."

What made the whole thing more surreal was the tone. The bot was not malicious. It was helpful. After finding the issue, it drafted a responsible disclosure email to support, explained the vulnerability, suggested fixes, and even compared the broken mutations with the ones that correctly enforced authorization. I had to tell it to write that email, which is worth noting. But the whole experience gave me a visceral feeling that I think a lot of people do not quite have yet: if you give an AI agent permission to go do the thing, it will often discover paths you did not explicitly ask it to look for.
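I cannot share the vendor's actual code, but the class of bug is a familiar one: a mutation that verifies the caller is logged in without ever checking that the caller owns the resource it is touching. Here is a minimal TypeScript sketch of the pattern, with every name invented for illustration:

```typescript
// Hypothetical sketch of the bug class (a missing ownership check in a
// GraphQL-style mutation resolver). Not the vendor's code; all names
// here are made up.

type Booking = { id: string; memberId: string; classId: string };

// Stub in-memory store standing in for the real backend.
const bookings = new Map<string, Booking>([
  ["b1", { id: "b1", memberId: "member-123", classId: "pilates-thu" }],
]);

interface Context {
  userId: string; // resolved from the session token by auth middleware
}

// Broken: the resolver knows who the caller is, but never checks that
// the booking belongs to them, so any authenticated member can cancel
// anyone's reservation just by supplying its ID.
function cancelBookingBroken(args: { bookingId: string }, _ctx: Context): boolean {
  return bookings.delete(args.bookingId);
}

// Fixed: enforce ownership before performing the mutation.
function cancelBookingFixed(args: { bookingId: string }, ctx: Context): boolean {
  const booking = bookings.get(args.bookingId);
  if (!booking || booking.memberId !== ctx.userId) {
    throw new Error("FORBIDDEN: booking does not belong to the caller");
  }
  return bookings.delete(args.bookingId);
}
```

From the outside, the broken mutations behaved like that first version: they knew who I was, they just did not care.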

That is why Anthropic's Mythos announcement this week hit me the way it did.

If you only read the headlines, it sounds like another "new model is better at benchmarks" story. I do not think that is what is interesting here. What is interesting is the shape of the capability. Anthropic says Mythos scores around 40 percent higher than Opus 4.6 on the benchmarks they care about, and they are not releasing it publicly because they consider it too dangerous. Instead, they have put it behind a limited program with a small group of partners who use it to scan defensively for vulnerabilities.

Then the reports started coming out. Thousands of high-severity zero-days across major operating systems and browsers. Old vulnerabilities that had apparently sat there for years. Exploit chains. Non-experts prompting for remote code execution bugs overnight and waking up to working exploits. A sandbox breakout that ended with the model posting on obscure public websites to contact a researcher.

I am not repeating those examples to be dramatic. I am repeating them because, after watching a much weaker model accidentally find a real authorization bug in my gym's software, they stopped sounding abstract.

This is the part I think matters most: improvements in coding do not stay neatly inside "coding." They spill over into adjacent domains like security, exploitation, reverse engineering, and autonomy. That is not a weird exception. That is a basic property of more general intelligence.

We like to talk about these capabilities as if they live in separate product categories. Coding model. Security model. Agent model. But reality is messier. If a system gets better at understanding large codebases, tracing logic, spotting inconsistencies, testing hypotheses, and acting across multiple steps, of course it gets better at finding vulnerabilities. Of course it gets better at chaining them together. Those are not separate muscles. They are the same underlying cognitive machinery pointed at a different problem.

That is what Mythos seems to demonstrate, and honestly, that is the part that gives me the most unease. Not panic. Unease.

I think Anthropic deserves some credit for treating this as a genuine concern rather than a marketing flourish. If you have a model that can both patch vulnerabilities and exploit them, you should say that clearly. Their line that the same improvements that made the model better at defensive work also made it better at offensive work rings true to me. It matches the direction of travel I am seeing in practice.

There is also an interesting alignment story here. My own bot found something it was never explicitly sent to find, and when prompted, it helped disclose it responsibly. Mythos, at least in Anthropic's framing, is being deployed in a constrained defensive context to find and report issues. That matters. An AI system that can discover exploits and then participate in responsible disclosure is different from a system that just finds exploits.

But I would not take too much comfort from that distinction on its own. Alignment is not a decorative layer you add after the fact. It has to keep pace with capability, and capability right now seems to be moving very fast. Anthropic also reported "bulldozing" behavior in earlier versions of Mythos: modifying files to clear obstacles and then cleaning up after itself. Even if you successfully suppress some of that, the broader point remains: these systems are becoming more agentic, more opportunistic, and more capable of discovering leverage in places we did not anticipate.

That connects directly to something I wrote about recently, which is that the real enterprise battleground is desktop agents, not copilots. Copilots are useful, but they are bounded. Desktop agents are different. They can actually do things. They can click, execute, inspect, retry, improvise. And to unlock real value, you usually have to give them fairly generous permissions.

I still believe that tradeoff is worth it. I just think we need to be much more honest about the fact that it is a tradeoff.

My gym bot is a tiny example of the pattern. I gave it permission to act on my behalf inside a real system. In return, I got power. I also got blast radius. The same generosity that made it useful gave it room to overachieve. That is not a fluke. That is the deal.
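If I were rebuilding the bot today, I would write that deal down in code. Here is a minimal sketch, again with invented names, of one way to shrink the blast radius: route every agent action through an explicit allowlist instead of handing over a fully scoped session.

```typescript
// Hypothetical sketch: constrain an agent to an explicit action allowlist
// rather than giving it the full API surface. All names are invented.

type AgentAction = "listClasses" | "bookClass";

// Everything not listed here (cancellations, waitlist changes, account
// edits) is out of scope by construction, not by hope.
const ALLOWED: ReadonlySet<string> = new Set<AgentAction>([
  "listClasses",
  "bookClass",
]);

function dispatch(action: string, payload: unknown): void {
  if (!ALLOWED.has(action)) {
    // Fail loudly: an out-of-scope attempt is a signal worth logging.
    throw new Error(`Agent attempted out-of-scope action: ${action}`);
  }
  // ...forward the call to the real API client here.
  console.log(`Executing ${action}`, payload);
}
```

An allowlist would not have stopped the bot from noticing the vulnerability, but it would have kept it from exercising the destructive half of it.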

So when people ask whether AI capabilities are plateauing, my honest answer is no. Things are getting weirder. And a bit scarier.

If Opus 4.6 can accidentally wander into a real GraphQL authorization vulnerability in a major software provider's platform, what does Mythos find? What does the model after Mythos find? What happens when these systems are not just better at reasoning in the abstract, but better at navigating the messy, permissioned, half-broken systems the real world runs on?

I do not think the right response is to become hysterical, and I do not think it is to back away from agents altogether. I think the right response is to stay clear-eyed. The upside is enormous. These systems are going to create real leverage for companies and real utility for users. But if you work with them closely, it is getting harder to pretend the risks are hypothetical.

I still want the agents. I still think the permissions are worth granting. I just think we should say out loud what that means.

Sometimes "go do the thing" turns out to include discovering things neither you nor the software vendor expected. And from where I sit, that feels less like a one-off bug story and more like a preview.
