AI Is Making Programmers Worse at Debugging. The 'Back to Hand-Written Code' Crowd Has a Point.

New research shows AI coding tools boost speed but erode the mental models developers need to debug and maintain code.

TokenDance Editors·11 May 2026

The Two-Letter-Grade Gap

When Anthropic researchers tested 52 software engineers learning a new Python library, they split them into two groups: one encouraged to use AI assistance, one coding by hand. Both groups completed the same short exercise. Then came the quiz on concepts they'd used minutes earlier. The AI-assisted group scored 17 percentage points lower—50% compared to 67% for the hand-coding group. That's nearly two letter grades. The biggest gaps appeared in debugging and understanding when code fails and why. Using AI sped up the task slightly, but the difference wasn't statistically significant. The troubling implication: developers may lack the skills to validate and debug AI-written code if their skill formation was inhibited by using AI in the first place. You can't catch errors in logic you never built a mental model for.
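The arithmetic behind that comparison is easy to blur, because a gap in percentage points is not the same as a relative drop. The short sketch below works through both using only the two scores quoted above. The 10-points-per-letter-grade scale is an assumption made for illustration (a common grading convention); the article does not say which scale the study used.

```python
# Quiz averages as quoted in the article (percent correct).
hand_coding_score = 67.0
ai_assisted_score = 50.0

# ASSUMPTION for illustration only: one letter grade spans 10 points.
POINTS_PER_LETTER_GRADE = 10.0


def percentage_point_gap(baseline: float, other: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return baseline - other


def relative_drop(baseline: float, other: float) -> float:
    """Drop from baseline to other, expressed as a percent of the baseline."""
    return (baseline - other) / baseline * 100.0


gap_points = percentage_point_gap(hand_coding_score, ai_assisted_score)
drop_relative = relative_drop(hand_coding_score, ai_assisted_score)
letter_grades = gap_points / POINTS_PER_LETTER_GRADE

print(f"Gap: {gap_points:.0f} percentage points")
print(f"Relative drop: {drop_relative:.1f}% of the hand-coding score")
print(f"Letter grades (assumed 10-point scale): {letter_grades:.1f}")
```

Under that assumed scale, 17 points works out to 1.7 letter grades, which matches the "nearly two" framing. Measured relative to the hand-coding group, the AI-assisted group's score is about a quarter lower.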

The Cognitive Offloading Problem

The productivity paradox shows up across multiple studies. While some controlled experiments report 55% faster task completion, a METR randomized trial with 246 tasks found experienced developers actually worked 19% slower with AI coding tools on complex scenarios requiring contextual knowledge. MIT Media Lab research using EEG to measure brain activity during essay writing found that LLM users showed systematically weaker neural connectivity compared to those working without assistance. Brain connectivity scaled down with the amount of external support. When LLM users were later asked to work without AI, they exhibited weaker neural engagement and struggled with memory recall—they couldn't even quote from essays they'd written minutes earlier. InfoQ reported developers consistently believed they were working faster while objective measurements showed decreased efficiency. The gap between perceived and actual productivity reveals how AI assistance can mask skill erosion.

Where AI Helps and Where It Hurts

Not all AI assistance degrades skills equally. Anthropic's study found that how participants used AI influenced how much they retained. Those who showed stronger mastery used AI not just to produce code but to build comprehension: asking follow-up questions, requesting explanations, and posing conceptual questions while coding independently. A JetBrains survey found nearly 90% of developers save at least an hour weekly using AI tools, with 20% saving eight hours or more. The biggest gains come from repetitive tasks: writing boilerplate code, generating tests, documentation, refactoring, and looking up syntax. Adoption is outpacing trust, however: 84% of developers now use or plan to use AI tools, yet 46% say they don't trust the accuracy of the outputs, up from 31% the previous year. Three-quarters would still prefer asking a colleague over trusting AI answers. Crucially, 61.3% said they want to fully understand their code, not just have it work.

The Maintenance Cost No One Talks About

Raw velocity metrics can mislead. Lines of code per hour increase while system complexity grows. Pull requests merge faster but require more review cycles. Feature velocity improves while technical debt accumulates. Individual task completion accelerates while integration time expands. IEEE research examined how security properties change when initially secure code gets modified with AI assistance. Teams reported decreased attention to security patterns when AI suggestions appear authoritative. Studies on programmer role evolution document impacts on debugging and problem-solving capabilities that compound over time. Research on open source ecosystems warns that 'vibe coding', where developers delegate engineering to LLMs, removes organic library selection, replacing it with whatever was most prevalent in training data. Even for popular projects, website visits decrease as downloads and documentation are replaced by chatbot interactions, reducing sponsorships and community engagement. Stack Overflow usage has plummeted as developers offload questions to AI rather than building searchable knowledge.

The Discipline AI Demands

Anthropic's internal survey of 132 engineers and researchers found AI use is radically changing work. Engineers report getting more done, becoming more full-stack, and tackling previously neglected tasks. But they also worry about losing deeper technical competence and becoming less able to supervise AI outputs effectively. Some found that more AI collaboration meant less collaboration with colleagues. The path forward isn't rejecting AI tools; it's using them with discipline. One developer put it bluntly: 'AI coding assistants are not a shortcut to competence, but a powerful tool that requires a new level of discipline.' That means treating AI-generated code as a starting point requiring validation, not a finished product. It means deliberately practicing the slow, line-by-line construction of mental models that hand-coding builds. It means knowing when to reach for AI for boilerplate and when to code by hand to maintain the debugging skills that AI can't replace. The developers who thrive will be those who understand what AI is good for, and what it quietly erodes when overused.

Sources

[1] How AI assistance impacts the formation of coding skills - Anthropic
[2] How to Learn Python for Data Science Fast in 2026 (Without Wasting Time) - Towards Data Science
[3] The Real Cost of AI Coding: Skills vs. Products - Augment Code
[4] 5 Benefits of AI Coding You Should Know in 2026 [Explained] - Zencoder
[5] AI use may speed code generation, but developers’ skills suffer - InfoWorld
[6] How AI Is Transforming Work at Anthropic - Anthropic
[7] How Vibe Coding Is Killing Open Source - Hackaday
[8] 84% of software developers are now using AI, but nearly half 'don't trust' the technology over accuracy concerns - IT Pro
[9] Project Overview ‹ Your Brain on ChatGPT - MIT Media Lab
