I use AI coding tools. Cursor, Claude Code, Copilot — they're part of my workflow, and they've genuinely changed how fast I move through certain problems. I'm not writing this as a skeptic. I'm writing it as someone who ships production software with these tools daily, and who has 25 years of context to know when something has quietly gone wrong.

That context is exactly what the tools don't have. And when you lack it, the gaps are invisible — which is the most dangerous kind of gap there is.

They don't know why the code is the way it is

Every real codebase has decisions baked into it that aren't documented anywhere. The 200-line function that looks like a refactoring candidate isn't one, because it handles a race condition that showed up three months after launch. The field that looks redundant stores a legacy format required by one specific government API that hasn't been updated since 2011. The inefficient query is inefficient on purpose, because the "correct" version caused a deadlock under real production load.

AI tools see the current state of the code. They reason about it as if every decision was made by a careless programmer. They'll suggest the clean version of that 200-line function, eliminate the race condition protection, and produce code that looks significantly better and fails in ways that are genuinely hard to trace back to the change.
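A minimal sketch of how that failure mode can look, assuming a hypothetical in-memory inventory. Every name here (`stock`, `roundTrip`, `reserveClean`, `reserveSerialized`) is invented for illustration. The short function is the one a tool would likely propose, and the awkward wrapper is the race protection it would remove.

```typescript
// Illustrative only: `stock`, `roundTrip`, `reserveClean`, and
// `reserveSerialized` are hypothetical names, not from any real codebase.
const stock = new Map<string, number>([["widget", 1]]);

// Stands in for an async boundary such as a database round trip.
const roundTrip = (): Promise<void> => Promise.resolve();

// The "clean" version: check, await, then decrement. Two concurrent callers
// can both pass the check before either one writes.
async function reserveClean(sku: string): Promise<boolean> {
  const available = stock.get(sku) ?? 0;
  if (available <= 0) return false;
  await roundTrip();
  stock.set(sku, (stock.get(sku) ?? 0) - 1);
  return true;
}

// The "ugly" version: every call waits for the previous one to settle, so
// the check and the decrement can never interleave across callers.
let queue: Promise<unknown> = Promise.resolve();
function reserveSerialized(sku: string): Promise<boolean> {
  const result = queue.then(() => reserveClean(sku));
  queue = result.catch(() => undefined); // keep the chain alive after a failure
  return result;
}
```

With one unit in stock, two `reserveClean("widget")` calls started together both report success and leave the count at -1. The same two calls through `reserveSerialized` let only the first succeed. Nothing in the file says why the queue exists, which is the kind of detail that disappears in a tidy refactor.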

The invisible contract: A codebase is full of implicit contracts, things that are true but nowhere written down. An AI tool will violate them confidently and helpfully. The developer who's been in the system for two years knows which corners not to touch. The AI doesn't know they exist.

This isn't a flaw you can engineer away. It's structural. The tool has no memory of the incident reports, the production hotfixes, the half-hour conversation in 2023 where two developers agreed to do it the ugly way because the clean way broke under a specific load pattern. That institutional knowledge lives in people, not in the current state of the files.

They can't own the architecture

AI is excellent at writing code for a problem you hand it clearly. Give it a well-described component, and it will produce something clean and working. What it cannot do reliably is decide how that component fits into the rest of the system.

Architecture decisions carry competing constraints that don't appear in any single file: team size, deployment environment, existing dependencies, how frequently this part of the codebase changes, what happens if it goes down. An AI tool optimizes for the code in front of it. An experienced developer optimizes for the system as a whole — including the parts that aren't visible right now.

What happens when AI owns the architecture? The code works. The tests pass. Then, six months later, every change to one component requires touching five others nobody expected. The data model has evolved into something nobody can fully explain. Dependencies make local sense but no global sense. It's a system assembled from individually reasonable pieces that never cohered into a whole.

Architecture is the set of decisions that are hard to change later. Those decisions need someone who understands the full context — not just the current file.

They generate confidence they haven't earned

This one is subtle, and it's where junior developers and non-technical stakeholders get hurt the most. AI coding tools deliver every answer with the same tone. Proposing to refactor a critical authentication flow sounds identical to proposing to rename a variable. The certainty is the same. The speed is the same. The presentation is the same.

I caught a tool recently suggesting that user authentication tokens be cached in localStorage "for performance." That's a textbook security mistake: anything in localStorage is readable by any script running on the page, so a single XSS hole is enough to steal the session. The suggestion appeared alongside genuinely useful code, delivered matter-of-factly, with no indication that this one recommendation was categorically different from the others. The only thing that caught it was someone who knew to look for it.
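For contrast, here is a rough sketch of the usual alternative: have the server hand the token to the browser in an `HttpOnly` cookie, which page scripts cannot read. The helper name `buildSessionCookie`, the cookie name `session`, and the attribute choices are assumptions for illustration, not a description of any particular framework.

```typescript
// The risky pattern, shown as a comment only:
//   localStorage.setItem("authToken", token); // readable by any script on the page

// Hypothetical helper: builds a Set-Cookie header value for a session token.
function buildSessionCookie(token: string, maxAgeSeconds: number): string {
  return [
    `session=${encodeURIComponent(token)}`, // cookie name "session" is an assumption
    "HttpOnly",        // not exposed to document.cookie or other page scripts
    "Secure",          // only sent over HTTPS
    "SameSite=Strict", // not sent on cross-site requests
    "Path=/",
    `Max-Age=${maxAgeSeconds}`,
  ].join("; ");
}
```

A server would send the returned string as the value of a `Set-Cookie` response header. This keeps the token out of reach of injected scripts. It does not fix the XSS hole itself, and cookie-based sessions still need their own CSRF review.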

Red flag: If the person reviewing AI-generated code doesn't have enough background to evaluate it critically, the review is cosmetic. The code looking right and the code being right are two different things, and AI tools can't tell you which one you have.

Speed without judgment is a risk multiplier. The faster you ship AI-generated code without the experience to audit it, the faster you accumulate problems that are genuinely hard to diagnose later — because the code looks intentional, even when it isn't.

They don't know when to stop

There's a category of decision for which the right answer is: don't build this. The feature that will take three months and add nothing to retention. The integration that's technically clean but creates a hard dependency on a vendor whose API breaks twice a year. The migration that sounds straightforward until you map the data and discover two systems have been disagreeing on a fundamental business rule for four years.

AI tools will help you build all of these. Enthusiastically. They have no stake in whether it ships, no exposure to the support costs when it breaks, and they won't be in the meeting when the client realizes this isn't what they needed. The AI's job ends at the code. The developer's job includes everything else — including telling a client to stop.

Knowing when not to write the code is one of the most valuable things a senior developer does. It shows up in every kickoff conversation, every scope review, every time someone asks "can we add X?" and the real answer is "you could, but here's why you shouldn't." That judgment isn't in any model.

The right model for using them

None of this means don't use them. I use them every day, and I'm faster for it. The question is where you put the tool and where you keep the human.

AI tools are genuinely fast at: generating boilerplate, writing tests for logic you've already designed, exploring unfamiliar APIs, speeding through implementation of a spec you've already validated. Let them handle the parts where the right answer is unambiguous and the consequences of being wrong are recoverable.

Keep humans in control of: understanding the actual problem, designing the architecture, reviewing anything that touches security or business-critical logic, and deciding what not to build. These are the parts where being wrong is expensive and the tool has no way to know it's wrong.

The combination works. A developer with 25 years of context using AI to move faster through implementation is genuinely more productive. The same AI, handed to someone without that context, is a fast way to build something that looks finished and isn't.

Practical boundary: Before accepting any AI-generated change to existing code, ask: "Does this tool know the history behind what it's modifying?" If the answer is no — and it usually is — review it at the architectural level, not just functionally.