Does AI actually increase software engineer productivity?

Yes and no. According to Anthropic, engineers ship 8× more code per day than in 2024, and the success rate on open tasks rose from 26% (November 2025) to 76% (May 2026). But according to GitClear, daily AI users produce roughly 4× the raw output of non-users while the real productivity gain is about 12%. The gap between 4× raw output and a 12% real gain is where the review crisis lives. The acceleration is real; quality is the hidden cost.

What is code churn and why does it matter?

Code churn is the percentage of code that is reverted, rewritten, or removed shortly after being merged. According to Faros AI, code churn rose 861% in teams with daily AI use. High code churn means code is being produced rapidly but discarded soon after — it is the clearest indicator that speed without quality is waste disguised as productivity.
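As a rough illustration of the definition above, churn can be computed as the share of merged lines that are undone within a short window. This is a minimal sketch, not how Faros AI or any other vendor measures it: the `MergedChange` record, its fields, and the 14-day window are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class MergedChange:
    """One merged change. All fields are hypothetical, for illustration only."""
    merged_on: date
    lines_added: int
    # Lines from this change that were later reverted, rewritten, or removed,
    # keyed by the date on which that happened.
    lines_undone: dict[date, int]


def churn_rate(changes: list[MergedChange], window_days: int = 14) -> float:
    """Percentage of merged lines undone within `window_days` of merging.

    The 14-day default is an assumption; pick the window your team cares about.
    """
    window = timedelta(days=window_days)
    total_added = sum(c.lines_added for c in changes)
    if total_added == 0:
        return 0.0
    undone_in_window = sum(
        count
        for c in changes
        for undone_on, count in c.lines_undone.items()
        if undone_on - c.merged_on <= window
    )
    return 100.0 * undone_in_window / total_added


# Made-up sample data: 400 lines merged, 80 of them undone inside the window.
changes = [
    MergedChange(date(2026, 5, 1), 200, {date(2026, 5, 6): 80}),
    MergedChange(date(2026, 5, 2), 100, {date(2026, 6, 20): 50}),  # outside the window
    MergedChange(date(2026, 5, 3), 100, {}),
]
print(f"churn: {churn_rate(changes):.1f}%")
```

Tracking this number per team over time, rather than as a single snapshot, is what makes a jump like the one Faros AI reports visible.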

How do you prevent review from becoming the new bottleneck?

Redesign the pipeline; do not slow down. Triage by blast radius before human review. Use AI-assisted review as a first filter: Anthropic raised the substantive review rate from 16% to 54% with this approach. Apply tiered review: critical modules require mandatory human review, while peripheral modules can follow AI-first review. The goal is not to review less but to review better, concentrating human effort where it matters.

AI Writes 8× More Code — and Review Time Jumps 441%

Is AI-generated code less secure than human-written code?

The data points in that direction. According to Veracode, 45% of AI-generated code contains security flaws. According to CodeRabbit, AI-generated code carries 1.7× more issues. And according to Anthropic's internal data, developers using AI scored 17 percentage points lower on comprehension tests (50% vs 67%). AI does not deliberately introduce vulnerabilities, but it generates code without the security context that an experienced engineer would apply naturally.

Anthropic published internal data that changes the conversation about engineering productivity: Claude writes over 80% of the code that reaches production. Engineers ship 8× more code per day than in 2024. Success rate on open tasks: 76% in May 2026, up from 26% in November 2025. The acceleration is real. But Anthropic itself acknowledges that as more code was pushed through the organization, human code review became a new bottleneck.

The numbers nobody wants to see

Faros AI instrumented 22,000 developers across 4,000 teams and quantified the other side of the coin. Median review time: +441.5%. Defect rate per developer: an increase that accelerated from 9% to 54%. Incidents-to-PR ratio: +242.7%. Code churn: +861%. PRs merged with zero review: +31.3%.

Nobody decided to stop reviewing. Reviewers simply could not keep up with the volume. The result is predictable: PRs pile up, reviewers get overwhelmed, and quality drops — not because AI writes bad code, but because the quality assurance process was not redesigned for the new volume.

GitClear provides the Rosetta Stone: daily AI users produce roughly 4× the raw output of non-users, but the real productivity gain is approximately 12%. The gap between 4× and 12% is where the review crisis lives. Code that enters fast but needs to be rewritten, reverted, or fixed soon after is not productivity — it is noise disguised as velocity.
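The size of that gap is easier to see when both figures are put on the same scale. The arithmetic below is only a sketch: it assumes the 4× and 12% figures are measured against the same baseline, and the "durable share" ratio is an illustrative quantity introduced here, not a metric GitClear reports.

```python
raw_output_multiplier = 4.0  # quoted above: roughly 4x raw output for daily AI users
real_gain = 0.12             # quoted above: approximately 12% real productivity gain

# 4x raw output is a +300% increase over the baseline.
raw_gain = raw_output_multiplier - 1.0

# A 12% real gain is a 1.12x multiplier over the same (assumed) baseline.
real_multiplier = 1.0 + real_gain

# Illustrative ratio: how much of the raw output shows up as real productivity.
durable_share = real_multiplier / raw_output_multiplier

print(f"raw gain: {raw_gain:.0%}, real gain: {real_gain:.0%}")
print(f"durable share of raw output: {durable_share:.0%}")
```

Under those assumptions, a +300% increase in raw output buys a +12% increase in real productivity, and only about 28% of the raw output carries through.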

The structural irony

The builder asks for brakes while accelerating. Anthropic publishes data proving AI builds itself — Claude already writes the majority of production code — then asks for a verifiable global pause mechanism. But only if competitors also pause. Meanwhile, they ship updates every two weeks. The same company that removed its own binding halt commitment in February now asks the world to build one.

This is not hypocrisy — it is the structural tension of the industry. Whoever slows down alone loses market share. Whoever accelerates without brakes loses quality. The result is that quality becomes the hidden cost of velocity, and that cost falls on senior engineers — exactly the people you can least afford to lose.

The evidence accumulates

CMU studied Cursor: +39% merged PRs, +25.1% code complexity. CodeRabbit: AI-generated code carries 1.7× more issues. Veracode: 45% of AI-generated code contains security flaws. And the most striking data point, according to Anthropic's internal data: developers using AI scored 17 percentage points lower on comprehension tests, 50% vs 67% for those who did not use AI.

The pattern is consistent: more code, faster, with less comprehension of what is being merged. The system is breaking at the seams.

Redesign the pipeline, do not slow down

The answer is not to go back to writing code manually. It is to redesign the quality assurance pipeline for the new volume. Start with a circuit breaker: triage PRs by blast radius before human review. Add AI-assisted review as a first filter; according to Anthropic, this approach raised its substantive review rate from 16% to 54% with an internal tool. Then tier the review: changes to critical modules require mandatory human review, while changes to peripheral modules can follow AI-first review.
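The triage and tiered-review steps described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Anthropic's or Tech86's implementation: the `PullRequest` record, the `CRITICAL_PREFIXES` paths, and the rule that any touched critical path forces human review are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class ReviewTier(Enum):
    HUMAN_REQUIRED = "mandatory human review"
    AI_FIRST = "AI-first review"


# Hypothetical path prefixes for modules where a bad change has a large blast radius.
CRITICAL_PREFIXES = ("billing/", "auth/", "migrations/")


@dataclass
class PullRequest:
    """Simplified PR record (hypothetical); a real one would come from your forge's API."""
    number: int
    changed_files: list[str]


def triage(pr: PullRequest) -> ReviewTier:
    """Route a PR to a review tier before any human looks at it."""
    touches_critical = any(
        path.startswith(CRITICAL_PREFIXES) for path in pr.changed_files
    )
    return ReviewTier.HUMAN_REQUIRED if touches_critical else ReviewTier.AI_FIRST


# Made-up sample PRs.
prs = [
    PullRequest(101, ["docs/README.md"]),
    PullRequest(102, ["auth/session.py", "docs/README.md"]),
]
for pr in prs:
    print(pr.number, triage(pr).value)
```

Path prefixes are the crudest possible proxy for blast radius. A real pipeline might also weigh diff size, ownership, or dependency fan-out, but the routing decision keeps the same shape: classify first, then spend human attention only where the classification demands it.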

At Tech86, we apply this reasoning in practice. When we help teams integrate AI into their development process, the first step is not choosing the model; it is redesigning the quality pipeline. Without a circuit breaker keyed to blast radius and AI-assisted review as a first filter, AI merely moves the bottleneck from writing code to trusting code. And the trust problem is harder than the writing problem ever was.
