Eighty percent.
Roughly four of every five lines of code I ship are written or directly shaped by Claude Code or Codex. The measurement is imprecise, attribution is murky, and the line between “AI wrote this” and “I wrote it after AI proposed it” is not always meaningful. Still, the figure is close enough to be honest, and it has held steady for roughly the last eighteen months.
The public conversation on this is mostly marketing or denial. The marketing version: AI as productivity multiplier, X-times faster, the engineers who don’t adopt will be replaced. The denial version: AI as autocomplete, fine for boilerplate, dangerous in production, the engineers who adopt are taking shortcuts. Neither matches the daily reality of running a production system with AI as the primary surface. A third version is mostly unwritten.
What follows is my daily reality.
What eighty percent means
The figure is not a setting on a tool. It is the share of code commits that originated in an agent conversation rather than in an editor. A typical session looks like this. A problem arrives: a bug report, a feature request, a design decision. I open an agent session, paste in or reference the relevant code, describe the problem in prose, and let the agent propose first. We iterate. I refine. I push back. The agent proposes the next revision. Eventually a working version emerges that is not merely plausible but fit for production. I read it, run it, sometimes refactor a section that doesn’t fit the existing patterns or mental model, and commit.
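For the curious, the number is scriptable in rough form. A minimal sketch, assuming agent-originated commits keep a Co-Authored-By: Claude trailer (Claude Code adds one by default; treating that trailer as a reliable marker, surviving review and rebase, is the assumption here):

```python
import subprocess

# Assumed convention: commits that originated in an agent session carry
# the "Co-Authored-By: Claude" trailer that Claude Code adds by default.
TRAILER = "Co-Authored-By: Claude"

def agent_share(since: str = "18 months ago") -> float:
    """Approximate share of commits that originated in an agent session."""
    log = subprocess.run(
        ["git", "log", f"--since={since}", "--pretty=%B%x00"],
        capture_output=True, text=True, check=True,
    ).stdout
    messages = [m for m in log.split("\x00") if m.strip()]
    tagged = sum(TRAILER in m for m in messages)
    return tagged / len(messages) if messages else 0.0

if __name__ == "__main__":
    print(f"agent-originated commits: {agent_share():.0%}")
```

The output is exactly as imprecise as the convention behind it, which is the point: this is an estimate of working pattern, not telemetry.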
The twenty percent is the work that doesn’t fit that shape: configuration spelunking, incident response in the first ninety seconds, the kind of small mechanical edit that’s faster to do by hand than to describe. And spec-stage thinking: the decisions that determine what we’re building before any code is written. The eighty/twenty split is about implementation, not direction.
For that fast-twitch spec-stage reasoning, I find pen and paper, no agent, the faster path. A turn-by-turn LLM conversation introduces cognitive overhead that an outliner does not.
The session anchor
The single most important piece of the methodology is a project-level context file. In every agent session for PRIVATEBYTE, the first attached file is CLAUDE.md, a working record of the system: conventions, gotchas, incident history, field-name traps. It is currently around two thousand lines, accreted and mutated over eighteen months. It is the difference between a session that ships and a session that hallucinates.
The file is updated on every meaningful discovery. When a WHMCS field returns a userid that is actually a tblusers.id, the gotcha goes in. When a SolusVM2 endpoint differs from the documented one, the gotcha goes in. The file is treated as part of the codebase, not as documentation. It’s reviewed, kept current, deduplicated. Without it, agents are generalists with internet knowledge. With it, agents are operating on the actual system in front of them.
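To make the shape concrete, a fabricated excerpt in the register such entries tend to take (the entries are illustrative paraphrases, not lines from the real file):

```markdown
## Gotchas

- WHMCS: some responses return a `userid` that is actually a
  `tblusers.id`, not a client id. Verify which table a numeric id
  belongs to before joining on it.
- SolusVM2: at least one live endpoint differs from the documented
  one. Read the live response before writing integration code.
```

Terse, specific, and addressed to the next session rather than to a human reader.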
A useful analogy: the file is the map. The terrain (discoveries made turn by turn during the session) is what the map cannot contain. Both are necessary; neither is sufficient.
Failure modes
Agents are wrong with regularity, and in specific patterns predictable enough to be worth naming.
Plausible-but-wrong refactors. Agents are biased toward “cleaner” implementations. When the existing code has a peculiar pattern that exists for a non-obvious reason (a runtime guard, a workaround for an upstream bug, a deliberate trade-off), agents will propose to remove it, sometimes aggressively, without checking its uses elsewhere in the codebase. Defense: the CLAUDE.md, which documents the reasons. Without context, every weird-looking piece of code is a target.
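A hypothetical instance of the pattern: the guard below looks redundant, and an agent will reliably offer to delete it. The comment is what keeps it alive.

```python
import itertools

_counter = itertools.count(30000)

def next_free_port() -> int:
    # Stand-in allocator for this sketch; the real one consults a pool.
    return next(_counter)

def allocate_port(requested: int | None) -> int:
    # NOTE: do not "simplify" this guard. A hypothetical upstream
    # provisioner sometimes sends 0 instead of None for "no preference".
    # `if requested is None` would treat the buggy 0 as a real port
    # request. The reason lives in CLAUDE.md; without it, this line is
    # exactly the kind of code an agent proposes to clean up.
    if not requested:  # deliberately catches both None and 0
        return next_free_port()
    return requested
```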
Field-name confidence. Agents will use field names that look right but are wrong: userid when the response field is clientid, host when the actual key is hostname. Defense: any new external-API integration requires a live read of the actual API response before code is written. The rule lives in the CLAUDE.md.
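A sketch of that rule in practice (the endpoint is a placeholder; the helper name is mine, not a library’s): dump the keys the API actually returns, and paste them into the session before any integration code is written.

```python
import json
import urllib.request

# Placeholder endpoint; substitute the real integration target.
URL = "https://api.example.com/v1/client/42"

def observed_fields(url: str) -> list[str]:
    """Fetch one live response and return its top-level keys."""
    with urllib.request.urlopen(url) as resp:
        payload = json.load(resp)
    return sorted(payload)

if __name__ == "__main__":
    # The agent then codes against the fields as observed
    # (e.g. clientid, hostname), not as guessed (userid, host).
    print(json.dumps(observed_fields(URL), indent=2))
```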
Quiet over-engineering. Asked to add a small feature, an agent will sometimes propose a refactor of the surrounding code first. The refactor is usually defensible in isolation; it is also usually out of scope. Defense: explicit instruction at the start of the session: “do not refactor; add only the requested change.” Restate when the temptation returns.
Stale knowledge. The training data has a cutoff. For libraries that have changed since, an agent will write code against the older API. Defense: when in doubt, ask the agent to verify the surface with a read of the live documentation. Most of the time, the agent will catch its own error if asked.
What AI cannot yet do
The boundary conditions are real and worth being honest about.
Spec-stage decisions. What to build, in what order, for which customer segment, with which trade-offs. AI can help structure the conversation but cannot make the call. The product-shaped intuition is mine. The engineering follows from it.
The first ninety seconds of an incident. When an alert fires, the loop is too fast for a prose-mediated conversation. I am tailing logs, restarting services, checking dashboards. AI joins once the immediate fire is out and the diagnostic phase starts.
Cross-system reasoning under partial information. When a bug spans WHMCS, the portal, the proxy gateway, and a payment processor, the diagnostic chain requires holding all four systems in working memory simultaneously. An agent can hold one or two. The rest I hold, pattern-matched into memory over months of operating these systems together.
Taste. What makes a system feel right to operate. The shape of an API. The naming of things. The cadence of releases. AI proposes; I decide.
Why methodology, not productivity claim
The “eighty percent” figure is uncomfortable to share because it sounds like a productivity boast. It is not. It is a description of a working pattern, and the working pattern requires more discipline, not less. The CLAUDE.md is a substantial maintenance burden. Every failure mode above is caught by a human-shaped check. Every commit is read, run, and reviewed.
What I ship is faster than I would ship alone. It is also more carefully documented than I would document alone, because the documentation is part of the methodology. The two compound.
The honest summary is this: AI is the primary engineering surface, but the responsibility is undivided.