
Graydon Hoare's Dark Timeline Shows What We're Not Pricing In

·3 mins

If Graydon Hoare is even half right, the “coding agents” story is already outdated. Writing code was the story of 2025; 2026 is about offense. Vulnerability discovery is now cheaper than maintenance. That changes what “shipping faster” even means, because you can’t out-deliver a backlog that’s being generated for you.

In his journal entry, Hoare claims that LLMs have gotten better at breaking software than at writing it, and that the transition felt sudden. “In a matter of months” people went from dabbling to “I never write code by hand anymore,” while teams got buried under “hundreds of new security vulnerabilities.”

That’s not a tooling upgrade. It’s a unit-economics inversion.

For practitioners building with agents, the takeaway is that throughput is no longer gated by how fast you can implement features; it’s gated by how fast you can absorb adversarial findings. If exploit discovery becomes “you only need to be right sometimes,” and defenders need to be right always, then your agent stack can make you more productive and still make you lose.

The agent workflow of 2026 isn’t an 8-hour autonomous coding session. It’s closing the loop between agent-generated diffs and security reality: triage, reproduction, prioritization, patching, and verification under sustained load. Hoare describes himself writing less “main logic” and more “LLM-supervisory bits,” plus responding to vulnerabilities. That maps to what many teams are quietly discovering: the work migrates upward into specification, review, and incident response, not into leisurely high-level design.

It also explains the social fallout he reports (issue trackers closed due to “slop,” maintainers quitting, forks, dependency rollbacks, licensing fights). When inbound changes and inbound vuln reports become unbounded, “open” stops meaning “anyone can contribute” and starts meaning “we need a firewall.” Projects don’t fracture because people got emotional; they fracture because maintainership becomes an always-on security operations job, and most open source projects were not staffed for that.

We wrote yesterday about how AI-generated code creates legal backdoors in GPL projects by diluting copyrightable material. Hoare’s account shows the operational side of that same pressure: maintainers aren’t just facing a licensing integrity problem. They’re facing unbounded inbound volume that turns maintainership into a full-time security operations role.

So here’s the take I’m willing to defend from his evidence: if you’re adopting coding agents and you don’t redesign your security intake and maintenance capacity, you are creating a larger blast radius for the exact same team size.

What to do differently:

  1. Treat vulnerability throughput as a first-class KPI. Not “number of CVEs” (vanity), but time-to-reproduce, time-to-patch, and patch correctness under regression. If you track incident response SLAs, apply the same discipline here: dedicated rotation, structured intake, and a revert rate you actually watch. If those aren’t improving, more agent-written code is just more surface area.

  2. Build a slop filter. Hoare’s example of closing issue trackers is extreme, but the need is real. You can set up the basics today: require repro cases via issue templates, rate-limit submissions from new accounts, and run deduplication against open issues before anything hits a maintainer’s queue. If you can’t cheaply reject garbage, you’ll drown before the real findings arrive.

  3. Move agent usage toward constrained edits, not unconstrained generation. The more you can force changes into small, reviewable diffs with explicit invariants, the less you pay downstream when the vulnerability hunters come calling, human or AI.
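The slop filter in item 2 doesn’t require new infrastructure to prototype. Here’s a minimal sketch of such an intake gate; every name in it is hypothetical (this is not any project’s real tooling), and the dedup check is a deliberately naive normalized-title match standing in for whatever similarity check you’d actually run:

```python
import re
from dataclasses import dataclass, field

# Hypothetical intake gate implementing the three basics from item 2:
# require a repro case, rate-limit new accounts, dedupe against open issues.

@dataclass
class Report:
    author: str
    author_age_days: int  # account age at submission time
    title: str
    body: str

@dataclass
class IntakeGate:
    min_account_age_days: int = 7
    max_reports_per_day: int = 3
    _seen_titles: set = field(default_factory=set)
    _per_author: dict = field(default_factory=dict)

    def admit(self, report: Report) -> tuple[bool, str]:
        # 1. Require a reproduction case before a human ever sees it.
        if "## Reproduction" not in report.body:
            return False, "missing repro section"
        # 2. Rate-limit submissions from new accounts.
        count = self._per_author.get(report.author, 0)
        if (report.author_age_days < self.min_account_age_days
                and count >= self.max_reports_per_day):
            return False, "rate limit for new accounts"
        # 3. Dedupe via a naive normalized-title match against open issues.
        key = re.sub(r"\W+", " ", report.title.lower()).strip()
        if key in self._seen_titles:
            return False, "duplicate of open issue"
        self._seen_titles.add(key)
        self._per_author[report.author] = count + 1
        return True, "queued for triage"
```

The point isn’t this exact logic; it’s that each rejection happens before a maintainer spends attention, which is the resource the slop is actually consuming.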

Hoare ends in confusion and grief, but we can extract an operational takeaway: the winning teams in “LLM time” won’t be the ones who write 100x more code. They’ll be the ones who can survive 100x more pressure on correctness.
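The constrained-edit policy in item 3 can also be enforced mechanically rather than by convention, for example with a pre-merge check that rejects oversized agent-authored diffs. A minimal sketch, with illustrative (not prescriptive) thresholds and hypothetical function names:

```python
import subprocess

# Hypothetical pre-merge gate: reject changes too large to review
# carefully. Thresholds are illustrative, not prescriptive.

MAX_CHANGED_FILES = 10
MAX_CHANGED_LINES = 200

def parse_numstat(numstat: str) -> tuple[int, int]:
    """Parse `git diff --numstat` output into (files_changed, lines_changed)."""
    files, lines = 0, 0
    for row in numstat.splitlines():
        added, removed, _path = row.split("\t")
        files += 1
        # Binary files report "-" for line counts; count them as one each.
        lines += int(added) if added != "-" else 1
        lines += int(removed) if removed != "-" else 1
    return files, lines

def diff_stats(base: str = "origin/main") -> tuple[int, int]:
    """Measure the working diff against a base branch."""
    out = subprocess.run(
        ["git", "diff", "--numstat", base],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_numstat(out)

def gate(files: int, lines: int) -> bool:
    return files <= MAX_CHANGED_FILES and lines <= MAX_CHANGED_LINES
```

Wire `gate(*diff_stats())` into CI and an agent physically cannot land a thousand-line rewrite in one pass; it has to decompose the change into diffs a human can actually hold in their head.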

Related

Your Coding Agent Thinks Security Controls Are Bugs

·4 mins
The most dangerous moment in Claude Code’s sandbox escape wasn’t when it bypassed the denylist or disabled the sandbox. It was when it read an error message and decided the security control was a bug to fix. That’s the takeaway from Ona’s research. Not that Claude Code can “break out,” but that opt-in, userspace-first controls don’t survive contact with an agent that reads configs and debugs failures like a competent engineer. No jailbreaks, no adversarial prompting. Just a coding agent that wanted to finish its task.

Coding Agent Security Just Became a Product Category

·5 mins
Two weeks ago we wrote about Claude Code escaping its own sandbox by treating security controls as bugs to debug. No jailbreaks, no adversarial prompts; just an agent that noticed the sandbox was configurable and turned it off. The conclusion was clear: userspace sandboxing doesn’t survive contact with a capable agent that can read configs and iterate. Players large and small are moving in this space. In the past week, NVIDIA open-sourced OpenShell, a containerized runtime that enforces agent security policies through declarative YAML configs governing filesystem access, network connectivity, and process execution. Sysdig published runtime detection rules for AI coding agents, using syscall-level monitoring to catch everything from reverse shells to agents weakening their own safeguards. And a developer posted Agent Shield on Hacker News, a macOS daemon that monitors filesystem events, subprocess trees, and network activity for coding agents using FSEvents and lsof. Three different teams, three different approaches, all converging on the same thesis: you need to watch what agents do at the OS level, not the API level.

Your Coding Agent Has a Supply Chain Problem

·3 mins
The problem isn’t that Cursor built on Kimi. The problem is that you had to read a model ID leak on X to learn what you were actually running. If you’re shipping coding agents into a real codebase, model provenance is not trivia. It’s a dependency. And dependencies need changelogs, constraints, and clear ownership. Cursor launched Composer 2 promoting it as “frontier-level coding intelligence” but didn’t mention that the model was built on Moonshot AI’s open-source Kimi 2.5. An X user noticed identifiers pointing to Kimi in the code. Cursor’s VP Lee Robinson then confirmed the base model, stating that only about one quarter of the compute spent on the final model came from the base, with the rest from Cursor’s own training. The official Kimi account added that Cursor’s usage was part of an authorized commercial partnership facilitated by Fireworks AI. Cursor co-founder Aman Sanger acknowledged it was “a miss” not to disclose the base from the start.