How a 30-person operations team replaced manual invoice handling with a local AI pipeline.
A growing operations team receives 50-200 business documents per day — invoices, purchase orders, renewal notices, contracts, and quotes. They arrive as PDF attachments via email, shared drives, and portal downloads.
At 100 documents/day, that's 10-14 hours of human time per day spent on document triage and data extraction. Not analysis. Not decision-making. Just reading, classifying, and re-keying information that's already written down.
What they tried first: the OpenAI API for document extraction. It worked technically, but then the company's information security policy changed: financial documents can't leave the building. API bills were also climbing past £1,800/month. Off-the-shelf OCR software read the text but didn't understand document structure.
Foundry was installed on a Mac Studio (M3 Ultra, 512GB RAM) already in the office. The machine was being used for video editing — it had the capacity but wasn't doing anything AI-related.
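What was configured: watched-folder intake, document classification, field extraction, and a human review queue, running on a 30B-parameter model via llama.cpp, with the llm_stats dashboard for monitoring.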
What was NOT configured: No outbound internet access for document processing. No automatic payments, approvals, or system-of-record updates. No cloud API calls — everything runs locally.
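One way to make the "no cloud API calls" rule checkable in code is to refuse any model endpoint that is not on the loopback interface. The following is a minimal sketch, not part of Foundry: the function name `is_local_endpoint` and the example URLs are hypothetical, and a check like this complements network-level blocking rather than replacing it.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_endpoint(url: str) -> bool:
    """Return True only if the URL's host is localhost or a loopback address."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other hostname would need DNS resolution; treat it as non-local.
        return False


# Hypothetical endpoints, for illustration only.
print(is_local_endpoint("http://127.0.0.1:8080/v1/chat/completions"))
print(is_local_endpoint("https://api.example.com/v1/chat/completions"))
```

Calling this once at startup, before any document is read, would make a misconfigured endpoint fail early instead of silently sending data out.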
| Metric | Before | After | Change |
|---|---|---|---|
| Time per document | 8-13 min | 20-30 sec | -95% |
| Capacity | 100-120 docs/day | 500+ docs/day | 5x |
| Processing lag | 24-48 hours | Under 1 minute | -99% |
| Error rate | 3-8% | <0.5% | -83% to -94% |
| FTE required | 1.5-2.0 | 0.3 (review only) | -80% to -85% |
| API cost | £1,800/month | £0 (local) | -100% |
Annual savings: £21,600 in API costs + £35,000-50,000 in freed staff time = £56,600-71,600/year.
Hardware cost: £0 (existing Mac Studio). Foundry setup: £999 + £99/month = £2,187 first year.
ROI: 25-32x in year one.
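The arithmetic behind these figures can be reproduced in a few lines. This sketch uses only the numbers quoted above. Reading "ROI" as net first-year savings divided by first-year cost is an assumption, as are the constant and function names. Under that reading the result rounds to the quoted range.

```python
# Figures taken from the case study; all amounts in GBP.
API_COST_PER_MONTH = 1_800
STAFF_SAVINGS_RANGE = (35_000, 50_000)  # freed staff time, low and high estimate
SETUP_FEE = 999
SUBSCRIPTION_PER_MONTH = 99


def first_year_summary():
    """Return first-year cost, the annual savings range, and net ROI multiples."""
    api_savings = API_COST_PER_MONTH * 12
    cost = SETUP_FEE + SUBSCRIPTION_PER_MONTH * 12
    savings = tuple(api_savings + s for s in STAFF_SAVINGS_RANGE)
    # Assumed definition: savings left after paying for the setup, over its cost.
    roi = tuple((s - cost) / cost for s in savings)
    return cost, savings, roi


cost, savings, roi = first_year_summary()
print(f"First-year cost: £{cost:,}")
print(f"Annual savings: £{savings[0]:,} to £{savings[1]:,}")
print(f"Net ROI: {roi[0]:.1f}x to {roi[1]:.1f}x")
```

Because the hardware was already owned, the only cost in the denominator is the Foundry fee. A team buying a new machine would need to add that purchase to `cost`.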
The point isn't "everything local." It's "the right workloads local, with a clear line between what stays cloud and what doesn't."
Hardware: Mac Studio M3 Ultra, 512GB unified memory
Model: 30B-parameter model via llama.cpp
Pipeline: Watched folder intake, document classification, field extraction, human review queue
Throughput: 20-30 seconds per document
No-cloud posture: All processing local. No outbound API calls.
Observability: llm_stats dashboard — model health, memory, documents processed/queued/flagged
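The pipeline stages listed above can be sketched as a small polling loop. This is an illustrative outline under stated assumptions, not Foundry's implementation. The class names, the `REVIEW_THRESHOLD` value, the confidence score, and the filename-based `stub_extract` (which stands in for the local model call) are all hypothetical.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable

# Hypothetical threshold: results scoring below it are flagged for attention.
REVIEW_THRESHOLD = 0.9


@dataclass
class Extraction:
    path: Path
    doc_type: str
    fields: dict
    confidence: float
    flagged: bool = False


@dataclass
class Pipeline:
    # extract() stands in for the local model call; it is not Foundry's API.
    extract: Callable[[Path], Extraction]
    seen: set = field(default_factory=set)
    review_queue: list = field(default_factory=list)

    def scan(self, inbox: Path) -> None:
        """Process each new PDF in the watched folder exactly once."""
        for pdf in sorted(inbox.glob("*.pdf")):
            if pdf in self.seen:
                continue
            self.seen.add(pdf)
            result = self.extract(pdf)
            result.flagged = result.confidence < REVIEW_THRESHOLD
            # Nothing is approved or written to a system of record here;
            # every result waits in the queue for a person to review.
            self.review_queue.append(result)


def stub_extract(pdf: Path) -> Extraction:
    """Placeholder classifier keyed on the filename, for demonstration only."""
    if "invoice" in pdf.name.lower():
        return Extraction(pdf, "invoice", {"source": pdf.name}, 0.95)
    return Extraction(pdf, "unknown", {}, 0.4)


def demo() -> tuple:
    """Run two scans over a temporary inbox; return (queued, flagged) counts."""
    with tempfile.TemporaryDirectory() as tmp:
        inbox = Path(tmp)
        (inbox / "invoice_001.pdf").touch()
        (inbox / "scan_17.pdf").touch()
        pipeline = Pipeline(extract=stub_extract)
        pipeline.scan(inbox)
        pipeline.scan(inbox)  # the second pass finds nothing new
        flagged = sum(r.flagged for r in pipeline.review_queue)
        return len(pipeline.review_queue), flagged


print(demo())
```

Passing the extraction step in as a callable keeps the intake and review logic testable without a model loaded. The queued and flagged counts are the kind of figures a dashboard could report.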
This setup works well for teams that process 50+ structured documents per day, have data sovereignty requirements, and want to reduce data-entry overhead without replacing their systems stack.