CASE STUDY 01 — DOCUMENT PROCESSING

Document Processing Pipeline

How a 30-person operations team replaced manual invoice handling with a local AI pipeline.

The problem

A growing operations team receives 50-200 business documents per day — invoices, purchase orders, renewal notices, contracts, and quotes. They arrive as PDF attachments via email, shared drives, and portal downloads.

At 100 documents/day, manual handling takes 10-14 hours of human time per day, all of it spent on document triage and data extraction. Not analysis. Not decision-making. Just reading, classifying, and re-keying information that's already written down.

What they tried first: the OpenAI API for document extraction. It worked technically, but the company's information security policy changed so that financial documents can't leave the building, and API bills were climbing past £1,800/month. Off-the-shelf OCR software read the text but didn't understand document structure.

FTE just for processing: 1.5-2
Processing lag: 24-48h
Error rate: 3-8%
Monthly API cost: £1,800

The Foundry setup

Foundry was installed on a Mac Studio (M3 Ultra, 512GB RAM) already in the office. The machine was being used for video editing — it had the capacity but wasn't doing anything AI-related.

What was configured:

Local model: a 30B-parameter model running via llama.cpp, optimised for document understanding. Runs entirely on-device. No document ever leaves the Mac Studio.
Hermes document pipeline: a watched-folder workflow. Documents are dropped into a secure intake folder. The system classifies each document, extracts structured fields, flags missing or inconsistent information, and preserves the original PDF untouched alongside the extracted data. All outputs are marked "requires human review" before any action is taken.
Observability dashboard: llm_stats shows model health and memory usage, documents processed/queued/flagged, processing time per document, and any errors or anomalies.
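
The watched-folder workflow above can be sketched in a few lines of Python. This is a minimal illustration, not the Hermes implementation: the function names, the JSON sidecar layout, and the keyword classifier are all assumptions, and `read_text` stands in for whatever PDF text extraction and local model call the real pipeline uses.

```python
import hashlib
import json
from pathlib import Path

REVIEW_STATUS = "requires human review"
# Document types named in the case study; keyword matching is only a stand-in
# for the local model's classification step.
KNOWN_TYPES = ("invoice", "purchase order", "renewal", "contract", "quote")


def classify(text: str) -> str:
    """Hypothetical placeholder classifier: first known type found in the text."""
    lowered = text.lower()
    for label in KNOWN_TYPES:
        if label in lowered:
            return label
    return "unknown"


def process_document(pdf_path: Path, text: str, out_dir: Path) -> dict:
    """Write a JSON record next to the outputs; the PDF is only read, never modified."""
    record = {
        "source_file": pdf_path.name,
        # Hash of the untouched original, so the record can be traced back to it.
        "source_sha256": hashlib.sha256(pdf_path.read_bytes()).hexdigest(),
        "document_type": classify(text),
        "fields": {},  # a real extraction step would fill this in
        "flags": [],
        "status": REVIEW_STATUS,  # nothing is actioned without a human
    }
    if record["document_type"] == "unknown":
        record["flags"].append("unclassified document")
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / f"{pdf_path.stem}.json").write_text(json.dumps(record, indent=2))
    return record


def scan_intake(intake_dir: Path, out_dir: Path, read_text) -> list:
    """Process every PDF in the intake folder that has no record yet."""
    results = []
    for pdf_path in sorted(intake_dir.glob("*.pdf")):
        if (out_dir / f"{pdf_path.stem}.json").exists():
            continue  # already processed on an earlier scan
        results.append(process_document(pdf_path, read_text(pdf_path), out_dir))
    return results
```

Calling `scan_intake` on a timer is the simplest form of folder watching. Every record starts in the "requires human review" state, which mirrors the rule that no output is acted on automatically.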

What was NOT configured: No outbound internet access for document processing. No automatic payments, approvals, or system-of-record updates. No cloud API calls in the document pipeline: everything in it runs locally.

The transformation

Metric	Before	After	Change
Time per document	8-13 min	20-30 sec	-95%
Capacity	100-120 docs/day	500+ docs/day	5x
Processing lag	24-48 hours	Under 1 minute	-99%
Error rate	3-8%	<0.5%	-83% to -94%
FTE required	1.5-2.0	0.3 (review only)	-80% to -85%
API cost	£1,800/month	£0 (local)	-100%
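
For readers who want to check the Change column, the sketch below recomputes the percentage change from the Before and After ranges in the table. The helper names are illustrative, and treating "<0.5%" as an upper bound of 0.5 is an assumption.

```python
def pct_change(before: float, after: float) -> float:
    """Percentage change from before to after (negative means a reduction)."""
    return (after - before) / before * 100.0


def change_range(before_lo, before_hi, after_lo, after_hi):
    """Largest and smallest reduction across every combination of range ends."""
    candidates = [
        pct_change(b, a)
        for b in (before_lo, before_hi)
        for a in (after_lo, after_hi)
    ]
    return min(candidates), max(candidates)


# Time per document: 8-13 min (480-780 s) before, 20-30 s after.
time_change = change_range(480, 780, 20, 30)   # roughly -97% to -94%

# Error rate: 3-8% before, at most 0.5% after (assumed upper bound).
error_change = change_range(3, 8, 0.5, 0.5)    # roughly -94% to -83%

# FTE required: 1.5-2.0 before, 0.3 after.
fte_change = change_range(1.5, 2.0, 0.3, 0.3)  # -85% to -80%
```

Because both sides are ranges, each improvement is a band rather than a single number. A single headline figure for a row sits somewhere inside that band.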

Annual savings: £21,600 in API costs + £35,000-50,000 in freed staff time = £56,600-71,600/year.

Hardware cost: £0 (existing Mac Studio). Foundry setup: £999 + £99/month = £2,187 first year.

ROI: 25-32x in year one.
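
The year-one arithmetic can be reproduced as follows. All inputs come from the figures above. Reading ROI as net savings divided by first-year cost is an assumption about how the multiple was derived.

```python
API_COST_PER_MONTH = 1_800          # previous cloud API bill, GBP
STAFF_SAVINGS = (35_000, 50_000)    # freed staff time per year, GBP (low, high)
SETUP_FEE = 999                     # one-off Foundry setup, GBP
SUBSCRIPTION_PER_MONTH = 99         # Foundry monthly fee, GBP

api_savings = API_COST_PER_MONTH * 12                       # 21,600
first_year_cost = SETUP_FEE + SUBSCRIPTION_PER_MONTH * 12   # 2,187

# Total annual savings at the low and high ends of the staff-time estimate.
total_savings = tuple(api_savings + s for s in STAFF_SAVINGS)  # (56,600, 71,600)

# Assumed ROI definition: (savings - cost) / cost, as a multiple of the spend.
roi_multiples = tuple((s - first_year_cost) / first_year_cost for s in total_savings)

if __name__ == "__main__":
    print(f"Savings: £{total_savings[0]:,} to £{total_savings[1]:,}")
    print(f"ROI: {roi_multiples[0]:.1f}x to {roi_multiples[1]:.1f}x")
```

Under that definition the multiples round to 25x and 32x, and the hardware line stays at zero only because the Mac Studio was already owned.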

What stayed cloud

Email delivery — documents arrive via email, processed locally after download
Cloud storage backup — encrypted backups of processed data
Web search and research — still goes to cloud APIs
Large model inference for complex reasoning — occasional tasks still use OpenAI, volume dropped 90%+

The point isn't "everything local." It's "the right workloads local, with a clear line between what stays cloud and what doesn't."

What it doesn't do

Does not make decisions. It extracts, classifies, and flags. A human approves every action.
Does not send emails or update systems automatically. All outputs are drafts for human review.
Does not handle every document type perfectly. Complex contracts may need manual review. The system flags these rather than guessing.
Does not replace the operations team. It removes the data-entry grind so they can focus on exceptions, relationships, and actual operations work.
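
"Flags rather than guessing" can be illustrated with a small validation step. This is a sketch only: the field names and the net + tax = total check are assumptions chosen for an invoice, not the rules Foundry actually applies.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical required fields for an invoice record.
REQUIRED_FIELDS = ("supplier", "invoice_number", "net", "tax", "total")


def find_flags(fields: dict) -> list:
    """Return human-readable flags; an empty list means nothing looked wrong."""
    flags = [
        f"missing field: {name}"
        for name in REQUIRED_FIELDS
        if fields.get(name) in (None, "")
    ]
    if flags:
        return flags  # no point checking totals when fields are missing
    try:
        net, tax, total = (
            Decimal(str(fields[key])) for key in ("net", "tax", "total")
        )
    except InvalidOperation:
        # The amount could not be parsed: flag it instead of guessing a value.
        return ["unreadable amount"]
    if net + tax != total:
        flags.append("inconsistent totals: net + tax != total")
    return flags
```

A record with any flag goes to the reviewer with the reason attached, so the person approving it knows exactly what to check against the original PDF.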

What the team says

"Before Foundry, I spent my morning opening invoices. Now I spend my morning reviewing extracted data that's already 95% correct, and I have time to actually chase the late payers and talk to suppliers."
Operations admin, 6 weeks after deployment

"We were going to hire another admin person. We didn't need to. The pipeline handles the volume we had and the growth we're planning for."
Operations lead

"The audit trail alone justified it. When finance asked 'where did this number come from,' we could show them the original PDF, the extraction, and who approved it. That used to take an hour of folder-hunting."
Team lead

Technical details

Hardware: Mac Studio M3 Ultra, 512GB unified memory
Model: 30B-parameter model via llama.cpp
Pipeline: Watched folder intake, document classification, field extraction, human review queue
Processing time: 20-30 seconds per document
No-cloud posture: All document processing is local. No outbound API calls from the pipeline.
Observability: llm_stats dashboard — model health, memory, documents processed/queued/flagged
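
As a rough picture of the document counters on the dashboard, the sketch below summarises a list of per-document records. It does not use or represent the llm_stats interface. The record shape (`state`, `seconds`) and the function name are assumptions made for illustration.

```python
from collections import Counter
from statistics import mean


def summarise(records: list) -> dict:
    """Count documents by state and average the processing time where known."""
    counts = Counter(record["state"] for record in records)
    durations = [
        record["seconds"] for record in records if record.get("seconds") is not None
    ]
    return {
        "processed": counts.get("processed", 0),
        "queued": counts.get("queued", 0),
        "flagged": counts.get("flagged", 0),
        # None when no document has finished yet, to avoid a misleading zero.
        "mean_seconds": mean(durations) if durations else None,
    }


# Example with made-up records:
sample = [
    {"state": "processed", "seconds": 20},
    {"state": "processed", "seconds": 30},
    {"state": "flagged", "seconds": 25},
    {"state": "queued"},
]
summary = summarise(sample)
```

Keeping queued and flagged counts separate matters in practice: a growing queue points to a capacity problem, while a growing flagged count points to document quality or extraction problems.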

Is this right for you?

This setup works well for teams that process 50+ structured documents per day, have data sovereignty requirements, and want to reduce data-entry overhead without replacing their systems stack.

