CASE STUDY 05 — CODE REVIEW

Code Review Pipeline

How a software team cut PR wait times from 4 hours to 20 minutes without sending their codebase to OpenAI.

The company

Atlas Freight Systems builds logistics and fleet management software. 35 developers, two engineering teams (platform and product), and a codebase that's been growing for eight years. They ship fast — two or three deployments per day. Or they did, until code review became the bottleneck.

The problem

Every pull request needs review before it merges. A developer finishes a feature at 2 PM. They open a PR. It sits in the queue. The senior engineer is in back-to-back meetings until 4:30. She picks up the PR at 5, reviews it, leaves three comments, and goes home. The developer sees the comments at 9 AM the next day. A one-day feature takes two days to ship because of review latency.

3-5h
Avg PR wait time
3-4h
Senior daily review time
2-3
Production bugs/month
£800-1,200
Monthly API cost

They tried GitHub Copilot (doesn't review PRs), ChatGPT for code review (sent their entire codebase to OpenAI — their biggest client's contract prohibits third-party AI processing), hiring a dedicated reviewer (he quit after four months).

What Foundry does

Foundry runs on a Mac Studio in the engineering team's office. Connected to their GitHub via a webhook — when a PR is opened or updated, Foundry gets notified.

It does a first-pass code review. Not a rubber stamp. A real review.

When a PR opens, Foundry reads the changes, checks against the team's review guidelines (logic errors, security concerns, test coverage, consistency, performance), and posts a structured review with must-fix issues, should-consider suggestions, and looks-good confirmations — including specific line references and suggested fixes.

The senior engineer still reviews every PR. But she's reviewing a PR that's already been through a thorough first pass. She's confirming, not discovering. And she's doing it in 5 minutes instead of 30.

What it looks like day to day

2:15 PM — Developer opens a PR. 340 lines across 4 files. New endpoint for delivery route optimisation based on traffic data.

2:17 PM — Foundry posts review. One SQL injection risk flagged with suggested fix (parameterised query). One missing test case noted (empty traffic data). Auth, validation, error handling, performance all confirmed good.

2:20 PM — Developer fixes the SQL injection, adds the empty-data test, pushes update.

2:22 PM — Foundry re-reviews, confirms both issues resolved, flags ready for human review.

2:35 PM — Senior engineer reviews, confirms fix, approves. Total time from PR to merge: 20 minutes.

The transformation

MetricBeforeAfterChange
PR wait time3-5 hours15-25 minutes-90%
Senior review time/day3-4 hours45-60 mins-75%
Merge-to-deploy time1-2 dayssame day-50%
Bugs caught in review60%92%+32 pts
Security issues to production1-2/month0-1/quarter-80%+
API cost£800-1,200/month£0 (local)-100%

Annual impact: 600-700 hours of senior engineer time recovered + £10,000+ in API costs + fewer production incidents (each P1 incident costs £5,000-15,000 in response, fix, and client impact).

Foundry cost: £999 setup + £99/month = £2,187 first year. Existing Mac Studio.

What the team says

"The first week, Foundry caught a SQL injection in a PR that I would have missed at 5 PM on a Friday. I've been reviewing code for twelve years. That stung — but it proved the point."David, CTO
"I used to wait half a day for someone to look at my code. Now it's reviewed before I've finished my coffee. The feedback is specific — line numbers, suggested fixes, not just 'looks fine.'"Sarah, developer
"The national retailer contract clause about third-party AI was the blocker for us using ChatGPT for review. Foundry runs on our hardware. Our code never leaves the building. Procurement is happy, legal is happy, and we're shipping faster."David, CTO

Technical details

Hardware: Mac Studio M3 Ultra, 512GB unified memory
Model: Qwen3-Coder-30B (Q5_K_M) via llama.cpp
Pipeline: GitHub webhook to fetch diff, analyse changes, post structured review
Review categories: Security (injection, auth, data exposure), logic, tests, consistency, performance
Throughput: 10-30 seconds per PR depending on diff size
False positive rate: ~5-8% on suggestions; near-zero on must-fix items
No-cloud posture: Code fetched locally, reviewed locally, review posted via GitHub API. No code content sent to any third-party AI service.

Is this right for your team?

Works for software teams of 10-100 developers doing regular PRs, companies with proprietary codebases that can't go through third-party AI APIs, teams where senior engineer review time is the deployment bottleneck.

Book a Foundry Fit Review →

← Back to all case studies