Claude Code is looking to close the loop on the full coding workflow: not just writing code, but reviewing it.
Anthropic has launched Code Review, a new feature for its Claude Code platform that deploys a team of AI agents to automatically review pull requests (PRs) for bugs. The feature, now available in research preview for Team and Enterprise plan users, is designed to bring depth and thoroughness to code review — a process that has increasingly become a bottleneck as AI-assisted development accelerates.

The Problem: Code Output Is Outpacing Review
According to Anthropic, code output per engineer at the company has grown 200% in the last year. As developers ship more code faster — aided in no small part by AI coding tools — the human bandwidth required to review that code simply hasn’t kept pace. The result is that many PRs receive superficial skims rather than thorough examinations, increasing the risk of bugs slipping into production.
Code Review is Anthropic’s answer to this structural problem. Rather than a lightweight pass, it dispatches multiple agents to investigate a PR in parallel, verify findings to reduce false positives, and rank bugs by severity. The output is a single high-signal summary comment on the PR, supplemented by inline annotations for specific issues.
How the Multi-Agent System Works
When a developer opens a PR, Code Review automatically triggers a team of agents. These agents split up the work: some hunt for bugs, others verify whether those bugs are genuine, and the final output is a ranked list based on severity. Reviews scale with the PR — larger and more complex changes draw more agents and a deeper analysis, while small diffs get a lighter pass. The average review takes around 20 minutes, according to Anthropic’s testing.
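Anthropic hasn't published the system's internals, but the workflow it describes — parallel bug hunters, a verification pass to cut false positives, then severity ranking into a single summary — can be sketched in miniature. Everything below (the `Finding` type, the toy heuristics, the acceptance rule) is hypothetical; the real agents are model-driven, not pattern matchers:

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

@dataclass
class Finding:
    description: str
    severity: int        # higher = more severe
    verified: bool = False

def hunt_bugs(diff_chunk: str) -> list:
    # A hunter agent scans one slice of the PR. These string checks are
    # stand-ins for model-driven analysis.
    findings = []
    if "!=" in diff_chunk:
        findings.append(Finding("possible inverted comparison", severity=3))
    if "TODO" in diff_chunk:
        findings.append(Finding("unfinished code left in diff", severity=1))
    return findings

def verify(finding: Finding) -> Finding:
    # A second agent re-checks each candidate to reduce false positives
    # (toy acceptance rule here).
    finding.verified = finding.severity >= 2
    return finding

def review_pr(diff_chunks: list) -> list:
    # More chunks draw more hunter agents in parallel, mirroring
    # "reviews scale with the PR".
    with ThreadPoolExecutor(max_workers=len(diff_chunks)) as pool:
        raw = [f for chunk_findings in pool.map(hunt_bugs, diff_chunks)
               for f in chunk_findings]
    survivors = [f for f in map(verify, raw) if f.verified]
    # Rank what survives verification by severity for the summary comment.
    return sorted(survivors, key=lambda f: f.severity, reverse=True)
```

The fan-out/verify/rank shape is the interesting part: verification as a separate pass is what lets the system trade extra compute for the low false-positive rate Anthropic reports.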
Critically, Code Review does not approve PRs. That decision remains with human reviewers. The tool is intended to close the gap between what’s shipping and what’s actually being scrutinized — not to replace engineering judgment.
Numbers From Internal Use
Anthropic has been running Code Review on most of its own PRs for several months, and the internal data is compelling. Before the tool, only 16% of PRs received substantive review comments. That number has since risen to 54%. On large PRs — those with over 1,000 lines changed — 84% of reviews surface findings, averaging 7.5 issues each. On small PRs under 50 lines, 31% do, with an average of 0.5 issues. Fewer than 1% of findings are marked incorrect by engineers.
One incident illustrates the value well. A one-line change to a production service — the kind of diff that routinely gets a quick approval — was flagged by Code Review as critical. The change would have broken authentication for the service entirely, a failure mode easy to miss in a diff but obvious once pointed out. The bug was caught before merge.
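The incident's details aren't public, but a one-line diff that fully breaks authentication is easy to construct. A purely illustrative example (not the actual code), showing a check before and after a seemingly tidy edit:

```python
def is_authenticated(token: str, sessions: dict) -> bool:
    # Original check: a token is valid if it maps to a session.
    return sessions.get(token) is not None

def is_authenticated_after(token: str, sessions: dict) -> bool:
    # The one-line change: `is not None` became `is None`.
    # Every valid token is now rejected -- auth is fully broken.
    return sessions.get(token) is None
```

A reviewer skimming the diff sees a single clean line; the behavioral inversion only becomes obvious when someone actually reasons about the change, which is exactly the scrutiny short diffs tend not to get.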
Early-access customers have reported similar catches. On a ZFS encryption refactor in TrueNAS’s open-source middleware, Code Review surfaced a pre-existing bug in adjacent code: a type mismatch that was silently wiping the encryption key cache on every sync — a latent issue that a human reviewer scanning the diff wouldn’t have gone looking for.
Pricing and Controls
Code Review is positioned as a premium, depth-first option and is priced accordingly. Reviews are billed on token usage and generally average between $15 and $25, scaling with PR size and complexity. This makes it more expensive than Anthropic’s existing Claude Code GitHub Action, which remains open source and available for teams that want a lighter-weight solution.
For organizations concerned about cost control, admins have several levers. They can set monthly spend caps across all reviews, enable Code Review selectively on specific repositories, and access an analytics dashboard showing PRs reviewed, acceptance rate, and total costs.
Availability
Code Review is available now in research preview (beta) for Team and Enterprise plans. Admins can enable it through Claude Code settings, install the GitHub App, and select which repositories to run reviews on. For developers, no additional configuration is needed — once enabled by an admin, reviews run automatically on new PRs.
The launch is another sign that AI tools are moving deeper into the software development lifecycle, from writing code to reviewing it. For teams navigating the tradeoff between shipping velocity and code quality, it signals a future where the review bottleneck may no longer be purely a human constraint.