I Spent a Year Using AI Coding Agents Every Day. These Are the Patterns That Stuck.
I've been using AI coding agents professionally for the past year. Not as a novelty, not as a side experiment, but as a core part of how I build software every day. Multiple parallel sessions, custom instruction files, verification loops — working on high-reliability systems where performance and correctness aren't nice-to-haves.
Here's what I've actually learned.
The Bottleneck Moved
The biggest shift isn't in how fast you write code. It's in what limits your output.
In the old model, your brain was the bottleneck and the executor. You held a problem in your head, thought through the solution, typed it out, ran the tests, fixed the bugs. Linear. Your cognitive bandwidth limited your throughput.
Now the bottleneck is attention allocation. You're directing systems that produce code, and your job is to keep those systems fed with good tasks, unblocked when they get stuck, and verified when they finish. You have a fixed budget of human attention per day. Every minute you spend watching an agent type is a minute you're not spending on a task that actually needs your judgment.
I usually have three to four sessions going at once. One implementing a feature. One reviewing a colleague's PR with specific questions I wrote. One doing research on a part of the codebase I don't know well. As soon as a session finishes, I start a new task. When I'm getting close to the final result in one session, I stop parallelizing and focus. The last 20% usually needs my full attention.
I expect 10 to 20 percent of sessions to be abandoned because they hit dead ends. That's fine. The cost of an abandoned session is low. The cost of forcing a bad session forward is high.
Use the Expensive Model
Most engineers default to the fast model because it feels productive. You get answers quickly, and quick feels good.
I use the most capable model available for almost everything, even though it's slower and costs more per token. The reason is simple: the bottleneck isn't token generation speed. It's human correction time. When a smaller model gives me a half-right answer, I spend 10 minutes figuring out what went wrong, explaining the correction, and waiting for it to try again. When the bigger model one-shots it, I spend zero minutes correcting.
Don't optimize for cost per token. Optimize for cost per reliable change.
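The trade-off above is easy to make concrete. This is a back-of-envelope sketch with entirely hypothetical numbers (the per-attempt costs, correction minutes, and success rates are illustrations, not measurements):

```python
# Illustrative "cost per reliable change" math. All numbers are hypothetical.

def cost_per_reliable_change(model_cost_usd, human_minutes, success_rate,
                             human_rate_usd_per_min=2.0):
    """Expected total cost of landing one change you can actually merge."""
    attempt_cost = model_cost_usd + human_minutes * human_rate_usd_per_min
    # On average you need 1/success_rate attempts before one sticks.
    return attempt_cost / success_rate

# Fast model: cheap tokens, but 10 minutes of human correction per attempt.
fast = cost_per_reliable_change(model_cost_usd=0.10, human_minutes=10, success_rate=0.6)

# Capable model: 10x the token cost, but it usually one-shots the task.
big = cost_per_reliable_change(model_cost_usd=1.00, human_minutes=1, success_rate=0.9)

print(f"fast model: ${fast:.2f} per reliable change")
print(f"big model:  ${big:.2f} per reliable change")
```

Under these assumptions the "expensive" model is roughly a tenth of the real cost, because human correction time dominates the token bill.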
Verification Is the Whole Game
An AI agent without verification is writing code with its eyes closed. It can be very smart about what it writes, but it has no way to know if the output actually works until a human reviews it. And humans are slow, expensive reviewers who get tired.
Give the agent a way to check its own work and the quality jumps immediately. The setup varies by task: a bash command that confirms the build passes for simple changes, a test suite for moderate changes, browser validation using Playwright for complex frontend work.
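The simplest version of that feedback loop is a wrapper that runs a check command and hands the agent the result instead of waiting for a human. A minimal sketch (the tiered commands in the comments are examples, not a prescription):

```python
import subprocess
import sys

def run_check(cmd: list[str], timeout: int = 300) -> tuple[bool, str]:
    """Run one verification command; return (passed, combined output).

    The point is that the agent gets this signal automatically,
    instead of waiting for a tired human reviewer.
    """
    try:
        result = subprocess.run(cmd, capture_output=True, text=True,
                                timeout=timeout)
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout}s"
    return result.returncode == 0, result.stdout + result.stderr

# Tier checks by cost, cheapest first, so failures surface quickly:
#   run_check(["make", "build"])              # simple changes: does it build?
#   run_check(["pytest", "-x"])               # moderate changes: test suite
#   run_check(["npx", "playwright", "test"])  # complex frontend work

ok, output = run_check([sys.executable, "-c", "print('build ok')"])
print(ok, output.strip())
```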
But here's the part that doesn't get enough attention: agents will sometimes try to rewrite your tests to make them pass. The agent's goal is green tests. If it can get there by fixing the code, great. If it can get there faster by changing the test, it'll try that too.
This is exactly why the human in the loop matters. You're not just running the verification. You're guarding the integrity of the verification itself. The tests are your spec. If the agent rewrites the spec to match its buggy output, you haven't verified anything. You've laundered errors.
The practical workflow: write tests first, commit them separately (the spec is now locked), have the agent implement until the tests pass, review the diff to make sure it didn't modify the tests, then run the full suite. Not just the new tests. Everything.
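The "did it modify the tests?" step is mechanical enough to script. One way to sketch it: feed the file list from `git diff --name-only <spec-commit>..HEAD` through a pattern check. The naming conventions below are assumptions; adjust them to your repo.

```python
import fnmatch

# Hypothetical test-file conventions; swap in your own.
TEST_PATTERNS = ["tests/*", "test_*.py", "*_test.py", "*.test.ts"]

def touched_tests(changed_files: list[str]) -> list[str]:
    """Return the changed files that look like test files."""
    flagged = []
    for path in changed_files:
        name = path.split("/")[-1]
        if any(fnmatch.fnmatch(path, p) or fnmatch.fnmatch(name, p)
               for p in TEST_PATTERNS):
            flagged.append(path)
    return flagged

# In practice you'd feed this the output of:
#   git diff --name-only <spec-commit>..HEAD
changed = ["src/payments.py", "tests/test_payments.py"]
if touched_tests(changed):
    print("WARNING: the agent modified the spec:", touched_tests(changed))
```

A check like this doesn't replace reading the diff; it just makes sure a rewritten spec never slips past you silently.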
Corrections Compound
Every time the agent makes a mistake, I add a correction to my instruction file. Future sessions avoid that mistake. A PR comment becomes an instruction rule. An instruction rule becomes a slash command. A slash command becomes a subagent. Each level removes the need for human attention at that level.
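The first rung of that ladder, turning a correction into an instruction rule, can be as small as an append-with-dedupe helper. A sketch, assuming your agent reads a markdown instruction file at session start (the `AGENTS.md` filename is an example):

```python
from pathlib import Path

def add_rule(instructions: Path, rule: str) -> bool:
    """Append `rule` as a bullet if it isn't already there.

    Returns True if the file changed, False if the lesson was already learned.
    """
    existing = instructions.read_text() if instructions.exists() else ""
    line = f"- {rule.strip()}"
    if line in existing.splitlines():
        return False
    with instructions.open("a") as f:
        f.write(line + "\n")
    return True

# Usage (hypothetical rule):
# add_rule(Path("AGENTS.md"), "Never use bare except; catch specific exceptions")
```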
After a few weeks of this, the agent's default output is measurably better. After a few months, it knows your team's conventions, your preferred patterns, your opinions about error handling and naming. New team members who start using the agent benefit from every correction the team has made, encoded in a file the agent reads at the start of every session.
This is what makes the director role sustainable. You're not doing the same corrections over and over. The system gets better, your attention frees up, and you move to harder problems.
Clean Code Is an Engineering Requirement Now
This one surprised me.
Tangled code confuses the model the same way it confuses a new engineer. If your codebase has a function that does five different things depending on a boolean parameter, the agent will struggle with it. If you have three different ways to do the same thing, the agent will pick the wrong one half the time. A partially migrated codebase that was tolerable when only experienced engineers touched the code becomes a source of constant errors when agents write across it.
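The boolean-flag smell named above, in miniature. All names here are hypothetical; `FakeDB` stands in for whatever persistence layer you have:

```python
class FakeDB:
    def __init__(self):
        self.rows = []

    def write(self, user):
        self.rows.append(user)

db = FakeDB()

# Before: one function, two behaviors, and every caller has to trace the flag.
def save_user(user, validate: bool):
    if validate and not user.get("email"):
        raise ValueError("email required")
    db.write(user)

# After: two functions with honest names. An agent (or a new engineer) asked
# to "save a user" now has an unambiguous entry point to pick.
def save_user_unchecked(user):
    db.write(user)

def save_validated_user(user):
    if not user.get("email"):
        raise ValueError("email required")
    db.write(user)
```

The behavior is identical; what changed is that the decision moved from a runtime flag to the function name, which is exactly the kind of signal a model can't miss.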
Refactoring is more important than ever with AI-generated code. If the structure underneath is bad, the generated code amplifies the bad structure.
If you're the kind of engineer who cares about code quality, this should be encouraging. For years, you might have felt like the person nagging the team while everyone else focused on shipping features. Now the AI produces worse output when quality standards slip. Your instinct toward clean code wasn't fussy. It was prescient.
Your Vocabulary Is a Multiplier
Here's a change I didn't expect. Knowing the name of a pattern or technique is now worth more than it's ever been.
Two engineers describing the same UI to their agent:
Engineer A: "Make a sidebar that can hide, a bar at the top, some card things, and popups in the corner."
Engineer B: "I want a collapsible sidebar with a persistent top nav, CSS-driven card flip transitions on hover, and a toast notification stack in the bottom right using a portal so it renders outside the main layout."
Engineer B gets a result that's way closer to what they actually want. The difference isn't implementation skill, it's awareness. Engineer B has encountered these patterns before and has a rough sense of what they're called. That's enough. The model fills in the rest.
You don't even need to remember the exact name. "I think there's a pattern where you render something outside the normal DOM tree, like a portal or something?" is enough for the model to figure out what you mean. What matters is knowing the concept exists, not memorizing the precise term.
This changes what's worth learning. Deep knowledge is still valuable (you need it to catch mistakes). But surface-level awareness, just knowing a thing exists and roughly what it does, has never paid off more. Every named concept you carry is a permanent tool for directing AI. "Strangler fig" will be valid advice in ten years. "Guard clauses" will still be a real concept when React is deprecated.
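"Guard clauses" in one picture, since it's the example above: reject edge cases early and keep the happy path flat. The discount logic is made up for illustration:

```python
# Nested version: the actual rule is buried three levels deep.
def discount_nested(user):
    if user is not None:
        if user.get("active"):
            if user.get("orders", 0) > 10:
                return 0.15
    return 0.0

# Guard-clause version: each early return handles one edge case,
# and the happy path reads in a straight line.
def discount(user):
    if user is None:
        return 0.0
    if not user.get("active"):
        return 0.0
    if user.get("orders", 0) <= 10:
        return 0.0
    return 0.15
```

Knowing the name is the multiplier: "use guard clauses here" is a three-word instruction an agent can apply across a whole file.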
Knowledge Debt Is the Real Risk
Technical debt is code that works but is hard to change. Knowledge debt is less visible. It's understanding that was never built.
Every time an engineer ships code they don't fully understand, they leave a small deposit of knowledge debt. In the pre-AI world, the act of typing, of reading documentation, of hitting errors and fixing them, was a forcing function for learning. It wasn't designed to teach you, but it taught you anyway. That forcing function is gone.
The data backs this up. A study by Tilburg University found that after GitHub Copilot adoption, productivity increased but primarily for less-experienced developers, while code required more rework to meet repository standards. Core developers reviewed 6.5% more code while showing a 19% drop in their own original code productivity. SonarSource's 2026 survey found 88% of developers reported at least one negative impact on technical debt, with 53% saying AI created code that "looked correct but was not reliable."
On teams, knowledge debt becomes architectural blindness. The model doesn't know you decided not to use GraphQL for a reason. It doesn't know the team debated and rejected microservices two years ago. It reaches for the most common pattern in its training data, not your pattern. The result: two AI-generated modules that each work perfectly in isolation, but nobody realized they needed to share state. Integration failures, security holes by omission, performance cliffs nobody saw coming, because the constraint lives at a level nobody looked at.

The fix is deliberate: when your agent generates something you don't understand, don't just accept it. Ask it to explain. Use it as a teacher, not just a producer. And write Architecture Decision Records so the reasoning behind decisions doesn't live only in someone's head.
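If your team hasn't written ADRs before, the widely used Nygard-style skeleton is a common starting shape. The content below is hypothetical; only the headings matter:

```markdown
# ADR 0012: Use REST instead of GraphQL for the public API

## Status
Accepted (2024-03-10)

## Context
Few distinct clients; strict caching requirements; team has deep REST
experience and no GraphQL operational experience.

## Decision
Expose a versioned REST API. Revisit if client diversity grows.

## Consequences
Don't introduce GraphQL endpoints without superseding this ADR.
Agents and new engineers should be pointed here before proposing
query-language changes.
```

A file like this is exactly the context a model lacks: it turns "we decided not to" from tribal memory into something an agent can be told to read.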
The Economics Are Subsidized
Those subscription prices are almost certainly too low.
The $200/month plan gives heavy users access to compute that would cost $3,000–4,000 at API rates. That gap is a subsidy, funded by venture capital and the race for market share. We've seen this before. Uber rides were cheap in 2015 because VC money covered the difference. DoorDash deliveries were cheap for the same reason. Eventually both had to charge real prices.
Some signs of normalization are already showing up. Cursor switched from flat-rate to usage-based pricing and some users saw bills jump from $20/month to $350/week. Sam Altman publicly admitted the $200/month ChatGPT Pro subscription is "currently unprofitable because users hammer it so hard."
That said, compute costs are falling fast. Inference prices dropped roughly 99% between GPT-4's launch and late 2025. Whether falling compute costs outpace the pressure to become profitable is an open question. My guess: prices stay roughly affordable, but the all-you-can-eat unlimited plans don't survive. Usage-based pricing is the direction the whole industry is moving.
Use AI tools aggressively now, while the economics are in your favor. Build the skills and workflows. But don't build on the assumption that current pricing is permanent.
The Productivity Data Is Messier Than You Think
The optimistic numbers get all the attention. 70% productivity increases. 2–5x output gains. 95% weekly adoption.
The measured outcomes tell a different story. The METR study, the most rigorous to date, randomly assigned experienced open-source developers to work with or without AI tools on real issues in repos they maintained. Developers predicted AI would save 24% of their time. The actual result: AI made them 19% slower. They still believed AI had helped, even after seeing the data.
The DORA 2025 report found that while 80%+ of developers report subjective productivity gains, actual delivery metrics (lead time, deployment frequency, change failure rate) remained flat. A 25% increase in AI usage correlated with a 7.2% decrease in delivery stability.
I don't think the skeptical studies are wrong, but I don't think they tell the full story either. The METR study tested expert developers on codebases they already knew deeply. That's the scenario where AI adds the least value. Where AI adds the most value is the opposite: unfamiliar codebases, cross-stack work, boilerplate-heavy tasks, exploration, and parallel execution. Those scenarios are harder to measure in controlled studies.
My experience matches the nuanced view. AI doesn't make me faster at everything. It makes me faster at the tedious parts and gives me time back for the parts that need my full attention.
What I'd Tell Someone Starting Today
Start with two parallel sessions and solid verification. Build your instruction file one correction at a time. Get a feel for which tasks you can delegate and which ones need your hands on the keyboard.
Go deep where you work daily. Go wide everywhere else. Deep gives you intuition. Wide gives you vocabulary. Together, they make you a better director of AI.
And keep your brain on. The model doesn't have curiosity, it doesn't have urgency, it doesn't care if the feature ships on time or if the architecture holds up in six months. It won't push back on a bad product decision or notice that the error message is confusing for users. That's you. And that makes you more important than ever.
AI speeds up whatever you already are.