For twenty years, the pull request was where engineering quality was supposed to live. A change wasn't real until someone else read it line by line and stamped it. Code review at the PR stage was the checkpoint, the safety net, the rite of passage. It's also, I'd argue, no longer the place where defects actually hide.
I'm not saying review is dead. I'm saying it moved — and a lot of teams are still standing guard at the door it left through.
What the PR review was actually doing
Strip the PR review down to its load-bearing parts and you find four jobs stacked on top of each other:
- Catching bugs the author missed.
- Enforcing project conventions ("we don't do it like that here").
- Transferring knowledge, so the reviewer learns what changed.
- Accountability — a second human looked at this and is willing to put their name on it.
For most of those twenty years, the PR was the cheapest place to do all four. So we did. And because they happened in the same ritual, we stopped distinguishing them.
That bundle is coming apart.
What changed
Three things shifted at once, and the combination is what reshapes the funnel.
- Agents write the code. A capable agent, given a clear task, produces something that compiles, runs, and broadly does the thing. The variance in quality is real, but the floor has risen sharply — especially for the kind of work that used to dominate junior queues: CRUD endpoints, form screens, glue code, mechanical refactors. Reading that code line by line as a senior engineer is rarely the highest-leverage hour of your day anymore.
- Tests are cheap. Repositories that lived under-tested for years can be brought to serious coverage in a sprint. Unit, integration, end-to-end — the agent will write all three, and they catch the regressions that PR reviewers used to catch by squinting at diffs.
- Conventions are documentable. "We always validate at the boundary." "Migrations go through Flyway." "Don't put business logic in controllers." These used to live in the heads of senior reviewers and get applied during PR. They can now live in a project doc the agent reads before it writes a single line. Encoded once, applied every time, no senior engineer required.
That's three of the four jobs the PR review was doing — bug-catching, convention enforcement, and (frankly) most of the knowledge transfer — moving somewhere else. The fourth, accountability, is the one teams should worry about. It doesn't disappear; it has to be re-attached upstream.
Where review actually pays now
The defects that survive the new pipeline aren't the ones a careful reviewer would have spotted in a diff. They're the ones that were baked in long before any code was written:
- The wrong feature was built. The ticket was ambiguous, the agent picked a plausible interpretation, and now there are 400 lines of working code solving the wrong problem.
- The architecture was wrong. The plan looked fine until the third subtask hit something the planner didn't foresee — and now the foundation is poured.
- The premise was wrong. Including the premise of the tests. An agent will happily write a test suite that passes against its own buggy implementation. Coverage is real; correctness is a separate question.
These are upstream defects. They compound downstream. By the time they reach a PR, a reviewer reading the diff has almost no chance of catching them — the code looks fine, the tests pass, the conventions are followed. The bug is one level up the funnel.
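The third kind of defect is easy to see in miniature. What follows is a minimal sketch, not code from any real project: `apply_discount` and both checks are hypothetical names invented for illustration, and the requirement "10% off 1000 cents is 900 cents" is an assumed ticket. The first check derives its expected value the same way the buggy implementation does, so it passes. The second takes its expected value from the requirement, so it fails.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Hypothetical implementation with a bug: subtracts the percentage as if it were cents."""
    return price_cents - percent  # should be price_cents * (100 - percent) // 100


def check_mirrors_implementation() -> bool:
    # The expected value is computed with the same (wrong) formula as the code,
    # so this check only confirms that the code agrees with itself.
    price, percent = 1000, 10
    expected = price - percent
    return apply_discount(price, percent) == expected


def check_asserts_requirement() -> bool:
    # The expected value comes from the assumed requirement, not from the code:
    # 10% off 1000 cents is 900 cents.
    return apply_discount(1000, 10) == 900
```

Both checks execute the same line of code, so a coverage report cannot tell them apart. Only reading what each one asserts does.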
So that's where the review goes. Specifically:
- Review the ticket before it gets handed to whoever (human or agent) is going to plan it. Is this the right thing to build? Is the scope honest? Is the success criterion a real one or a wish?
- Review the plan before it gets handed to the agent that is going to build it. Does the breakdown hold? Are the dependencies real? Where's the part the planner hand-waved?
- Review what the tests are asserting, not just whether they pass. A green suite that pins the wrong invariants is worse than no suite — it manufactures confidence you haven't earned.
- Reserve PR-stage review for blast radius, not line count. Migrations, security boundaries, perf-critical paths, anything irreversible — those still want a human eye on the diff. A 600-line CRUD screen doesn't.
This is what shift-left has always meant. The agent era doesn't invent it; it just makes the old defaults visibly wrong.
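The last rule, routing by blast radius rather than line count, is simple enough to sketch. This is one possible shape under stated assumptions, not a recommended tool: the path patterns and the function name `needs_human_diff_review` are hypothetical, and a real repository would pick its own list of irreversible or security-sensitive areas.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for high-blast-radius areas; every repository would choose its own.
HIGH_RISK_PATTERNS = [
    "migrations/*",
    "auth/*",
    "billing/*",
]


def needs_human_diff_review(changed_paths: list[str]) -> bool:
    """Return True if any changed file falls in a high-risk area.

    The size of the change is deliberately ignored: one touched migration
    asks for a human eye, a large change confined to low-risk paths does not.
    """
    return any(
        fnmatchcase(path, pattern)
        for path in changed_paths
        for pattern in HIGH_RISK_PATTERNS
    )
```

For example, `needs_human_diff_review(["migrations/V7__add_index.sql"])` returns `True`, while a change touching only `ui/forms/customer_form.py` returns `False` no matter how many lines it adds.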
The honest tradeoffs
A few things I want to say plainly, because the optimistic version of this argument tends to skip them:
- You are trusting the test suite a lot more than you used to — and most teams aren't treating tests with the discipline that earns that trust. The same things that made agent-written code reliable — clear conventions, documented expectations, an explicit definition of done, examples of what "good" looks like — have to apply to the test suite too. Most teams skip this. They'll spend a week writing a careful project doc for how features should be implemented and then let the agent write tests with no guidance at all, no convention for what to assert, no rule about what's allowed to be mocked, no standard for what an integration test actually has to integrate. The result is a test suite that grows fast, runs green, and quietly pins the wrong invariants. Moving review upstream while the safety net has holes isn't shifting review — it's removing it. The shift only works if the net is real. But that's a story for another day.
- Accountability still has to live somewhere. "An agent shipped it" is not a release note anyone wants to read after an incident. The upstream checkpoints are where humans re-attach their name to the work. Skip the checkpoints, and you've outsourced the decisions, not just the typing.
- Senior judgment doesn't get cheaper, it gets concentrated. The hours saved on line-by-line review don't vanish — they get reinvested in deciding what to build, how to break it down, and where the sharp edges are. The job is still hard. It just happens earlier in the day.
The point
The PR was a great checkpoint when it was the only one we had. It isn't anymore. Treating it as the primary quality gate while letting tickets, plans, and test premises slip through unreviewed is solving last decade's problem with last decade's tool.
Move the gate. Keep it strict. Just put it where the defects actually are.