How a software team cut PR wait times from 4 hours to 20 minutes without sending their codebase to OpenAI.
Atlas Freight Systems builds logistics and fleet management software. 35 developers, two engineering teams (platform and product), and a codebase that's been growing for eight years. They ship fast — two or three deployments per day. Or they did, until code review became the bottleneck.
Every pull request needs review before it merges. A developer finishes a feature at 2 PM. They open a PR. It sits in the queue. The senior engineer is in back-to-back meetings until 4:30. She picks up the PR at 5, reviews it, leaves three comments, and goes home. The developer sees the comments at 9 AM the next day. A one-day feature takes two days to ship because of review latency.
They tried GitHub Copilot (doesn't review PRs), ChatGPT for code review (sent their entire codebase to OpenAI — their biggest client's contract prohibits third-party AI processing), hiring a dedicated reviewer (he quit after four months).
Foundry runs on a Mac Studio in the engineering team's office. Connected to their GitHub via a webhook — when a PR is opened or updated, Foundry gets notified.
It does a first-pass code review. Not a rubber stamp. A real review.
When a PR opens, Foundry reads the changes, checks against the team's review guidelines (logic errors, security concerns, test coverage, consistency, performance), and posts a structured review with must-fix issues, should-consider suggestions, and looks-good confirmations — including specific line references and suggested fixes.
The senior engineer still reviews every PR. But she's reviewing a PR that's already been through a thorough first pass. She's confirming, not discovering. And she's doing it in 5 minutes instead of 30.
2:15 PM — Developer opens a PR. 340 lines across 4 files. New endpoint for delivery route optimisation based on traffic data.
2:17 PM — Foundry posts review. One SQL injection risk flagged with suggested fix (parameterised query). One missing test case noted (empty traffic data). Auth, validation, error handling, performance all confirmed good.
2:20 PM — Developer fixes the SQL injection, adds the empty-data test, pushes update.
2:22 PM — Foundry re-reviews, confirms both issues resolved, flags ready for human review.
2:35 PM — Senior engineer reviews, confirms fix, approves. Total time from PR to merge: 20 minutes.
| Metric | Before | After | Change |
|---|---|---|---|
| PR wait time | 3-5 hours | 15-25 minutes | -90% |
| Senior review time/day | 3-4 hours | 45-60 mins | -75% |
| Merge-to-deploy time | 1-2 days | same day | -50% |
| Bugs caught in review | 60% | 92% | +32 pts |
| Security issues to production | 1-2/month | 0-1/quarter | -80%+ |
| API cost | £800-1,200/month | £0 (local) | -100% |
Annual impact: 600-700 hours of senior engineer time recovered + £10,000+ in API costs + fewer production incidents (each P1 incident costs £5,000-15,000 in response, fix, and client impact).
Foundry cost: £999 setup + £99/month = £2,187 first year. Existing Mac Studio.
Hardware: Mac Studio M3 Ultra, 512GB unified memory
Model: Qwen3-Coder-30B (Q5_K_M) via llama.cpp
Pipeline: GitHub webhook to fetch diff, analyse changes, post structured review
Review categories: Security (injection, auth, data exposure), logic, tests, consistency, performance
Throughput: 10-30 seconds per PR depending on diff size
False positive rate: ~5-8% on suggestions; near-zero on must-fix items
No-cloud posture: Code fetched locally, reviewed locally, review posted via GitHub API. No code content sent to any third-party AI service.
Works for software teams of 10-100 developers doing regular PRs, companies with proprietary codebases that can't go through third-party AI APIs, teams where senior engineer review time is the deployment bottleneck.