Senior Software Engineer - Document AI / OCR / ML Inference
We're hiring a Senior Software Engineer to join us for a long-term engagement with a fast-growing UK travel tech client. The role focuses on owning and evolving their production document AI pipeline (OCR plus ML inference on passport and ID documents, CPU-only deployment, no GPU).
This is hands-on production engineering, not research. You'll be improving OCR accuracy, optimizing CPU inference (INT8 quantization, ONNX Runtime, batching), reducing pipeline complexity, and improving the observability of a system serving enterprise clients at scale.
What you'll do:
- Take ownership of the existing OCR and document understanding pipeline in Python
- Optimize ML inference for CPU production (ONNX Runtime, quantization, batching, kernel tuning)
- Improve OCR accuracy, latency, throughput, and memory efficiency
- Simplify the architecture and reduce technical debt
- Work across ML systems, backend services, APIs, and data pipelines
- Help shape architectural decisions as the platform scales
Required:
- 5+ years of professional Python software engineering
- Strong experience with OCR and document processing pipelines
- Familiarity with LayoutLMv3, docTR, PaddleOCR, EasyOCR, Donut, or DocFormer
- Production ML systems experience (not just notebook experimentation)
- Experience with ONNX Runtime, quantization, TorchScript, or related CPU inference optimization techniques
- Strong understanding of PDFs, image preprocessing, bounding boxes, and page segmentation
- Experience with systems serving external customers at scale
Nice to have:
- Hugging Face Transformers production experience
- Enterprise SaaS or aviation/travel tech background
- MLflow, Vertex AI, or similar ML operations tooling
- Distributed systems and backend API design experience
Engagement:
- Fully remote
- Long-term (6+ months) with potential to expand engineering ownership over time
- Part-time (25-30 hrs/week) or full-time (35-40 hrs/week), depending on candidate availability
- Flexible timezone (some overlap with UK working hours is required)
How to apply:
Send a short note covering relevant OCR / document AI work, production ML systems you've shipped, CPU inference optimization experience, GitHub or LinkedIn, current availability, and timezone.
Required languages:
- English: B2 (Upper Intermediate)