Anthropic published internal data that changes the conversation about engineering productivity: Claude writes over 80% of the code that reaches production. Engineers ship 8× more code per day than in 2024. The success rate on open tasks reached 76% in May 2026, up from 26% in November 2025. The acceleration is real. But Anthropic also acknowledges the other side: as more code flows through the organization, human code review has become the new bottleneck.
The numbers nobody wants to see
Faros AI instrumented 22,000 developers across 4,000 teams and quantified the other side of the coin. Median review time: +441.5%. Defect rate per developer: an increase that accelerated from 9% to 54%. Incidents-to-PR ratio: +242.7%. Code churn: +861%. PRs merged with zero review: +31.3%.
Nobody decided to stop reviewing. Reviewers simply could not keep up with the volume. The result is predictable: PRs pile up, reviewers get overwhelmed, and quality drops — not because AI writes bad code, but because the quality assurance process was not redesigned for the new volume.
GitClear provides the Rosetta Stone: daily AI users produce roughly 4× the raw output of non-users, but the real productivity gain is approximately 12%. The gap between 4× and 12% is where the review crisis lives. Code that enters fast but needs to be rewritten, reverted, or fixed soon after is not productivity — it is noise disguised as velocity.
The structural irony
The builder asks for brakes while accelerating. Anthropic publishes data proving AI builds itself — Claude already writes the majority of production code — then asks for a verifiable global pause mechanism. But only if competitors also pause. Meanwhile, they ship updates every two weeks. The same company that removed its own binding halt commitment in February now asks the world to build one.
This is not hypocrisy — it is the structural tension of the industry. Whoever slows down alone loses market share. Whoever accelerates without brakes loses quality. The result is that quality becomes the hidden cost of velocity, and that cost falls on senior engineers — exactly the people you can least afford to lose.
The evidence accumulates
CMU studied Cursor: +39% merged PRs, +25.1% code complexity. CodeRabbit: AI-generated code carries 1.7× more issues. Veracode: 45% of AI-generated code contains security flaws. And the most striking data point, according to Anthropic's internal data: developers using AI scored 17 percentage points lower on comprehension tests, 50% versus 67% for those who did not use AI.
The pattern is consistent: more code, faster, with less comprehension of what is being merged. The system is breaking at the seams.
Redesign the pipeline, do not slow down
The answer is not to go back to writing code manually. It is to redesign the quality assurance pipeline for the new volume. Three pieces do most of the work. Circuit breaker: triage PRs by blast radius before they reach human review. AI-assisted review as a first filter: according to Anthropic, this approach raised the substantive review rate from 16% to 54% with their internal tool. Tiered review: changes to critical modules require mandatory human review; changes to peripheral modules can follow AI-first review.
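As a rough illustration, the triage step could be sketched in Python along these lines. Everything here is an assumption made for the example: the path prefixes that count as critical, the size threshold used as a second proxy for blast radius, and the names PullRequest, ReviewTier, and triage are hypothetical. None of it describes Anthropic's internal tool or any specific product.

```python
from dataclasses import dataclass
from enum import Enum


class ReviewTier(Enum):
    AI_FIRST = "ai_first"              # AI review first, human on escalation
    HUMAN_REQUIRED = "human_required"  # mandatory human review


# Assumption: hypothetical path prefixes a team might treat as critical modules.
CRITICAL_PREFIXES = ("auth/", "billing/", "migrations/")

# Assumption: an arbitrary size limit used as a crude proxy for blast radius.
MAX_AI_FIRST_LINES = 400


@dataclass
class PullRequest:
    number: int
    changed_files: list[str]
    lines_changed: int


def triage(pr: PullRequest) -> ReviewTier:
    """Route a PR to a review tier before any human looks at it."""
    touches_critical = any(
        path.startswith(CRITICAL_PREFIXES) for path in pr.changed_files
    )
    # Circuit breaker: critical paths or very large changes always go to a human.
    if touches_critical or pr.lines_changed > MAX_AI_FIRST_LINES:
        return ReviewTier.HUMAN_REQUIRED
    # Peripheral, small changes can follow AI-first review.
    return ReviewTier.AI_FIRST


if __name__ == "__main__":
    queue = [
        PullRequest(101, ["docs/setup.md"], 12),
        PullRequest(102, ["auth/session.py"], 3),
        PullRequest(103, ["ui/button.py"], 900),
    ]
    for pr in queue:
        print(pr.number, triage(pr).value)
```

The design choice worth noting is that the routing is deterministic and runs before review starts, so senior reviewers only see the PRs whose blast radius justifies their time. A real pipeline would likely derive the critical paths from ownership or dependency data rather than a hard-coded tuple.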
At Tech86, we apply this reasoning in practice. When we help teams integrate AI into their development process, the first step is not choosing the model. It is redesigning the quality pipeline. Without a circuit breaker based on blast radius and AI-assisted review as a first filter, AI merely moves the bottleneck from writing code to trusting code. And the trust problem is harder than the writing problem ever was.
