AI Pipeline Engineer (Python) – Document Intelligence
About the job
Inhubber is an AI- and blockchain-based Contract Lifecycle Management (CLM) platform built for organizations with high security and compliance requirements.
Our platform helps teams create, negotiate, sign, and manage contracts while protecting sensitive data through end-to-end encryption and a security-first architecture. Our AI extracts and analyzes critical contract data to support smarter decision-making for organizations worldwide.
The Role
We are looking for a hands-on AI Pipeline Engineer (Python) to own and extend our production AI pipelines and deliver new document-analysis capabilities.
This is a delivery-driven role: you will maintain and improve existing extraction pipelines while building new pipelines for document intelligence and contract analysis.
You will collaborate with product engineers via well-defined APIs and typed schemas, while owning the AI kernel: prompts, evaluation, model logic, and quality gates.
What You'll Do
Own and extend production AI pipelines
- Maintain and optimize our document extraction pipelines (Python / AWS)
- Operate Dockerized components and serverless processing flows
- Ensure reliable processing and correct outputs across the pipeline
- Improve observability (logging, metrics, alerts, traceability, cost monitoring)
Build new document intelligence pipelines
- Design and implement end-to-end pipelines for new document families
- Build evaluation datasets and regression tests
- Prevent silent quality degradation with automated checks
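To give a flavor of the kind of automated check this involves, here is a minimal, hypothetical sketch of a regression gate against a small gold dataset (the dataset, field names, and function names are illustrative, not our production code):

```python
# Hypothetical regression gate: compare an extraction function's output
# against a frozen gold set and fail if accuracy drops below a baseline.
# GOLD_SET and evaluate_extraction are illustrative names, not real APIs.

GOLD_SET = [
    {"doc": "contract_a", "expected": {"party": "Acme GmbH", "term_months": 12}},
    {"doc": "contract_b", "expected": {"party": "Beta AG", "term_months": 24}},
]

def evaluate_extraction(extract_fn, gold_set, baseline=0.95):
    """Return (accuracy, passed): field-level accuracy vs. a frozen baseline."""
    correct = total = 0
    for case in gold_set:
        predicted = extract_fn(case["doc"])
        for field, expected in case["expected"].items():
            total += 1
            if predicted.get(field) == expected:
                correct += 1
    accuracy = correct / total if total else 0.0
    return accuracy, accuracy >= baseline
```

A gate like this runs in CI on every prompt or model change, so a quality regression fails the build instead of reaching production silently.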
Improve LLM-based document interpretation
- Develop structured extraction, Q&A, and risk signal detection
- Implement structured outputs and retrieval approaches (RAG)
- Add deterministic validation and post-processing
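As a concrete illustration of pairing LLM output with deterministic validation, consider a sketch like the following (the schema and field names are hypothetical, not the platform's real contract model):

```python
import json
from datetime import date

# Hypothetical post-processing step for an LLM's structured output:
# parse the JSON, enforce required fields, and normalize values
# deterministically so downstream code never sees malformed data.
REQUIRED_FIELDS = {"party", "start_date", "notice_period_days"}

def validate_extraction(raw: str) -> dict:
    """Parse an LLM's JSON answer and apply deterministic checks."""
    data = json.loads(raw)  # fail loudly on malformed JSON
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    # Deterministic normalization: ISO dates and non-negative integers.
    data["start_date"] = date.fromisoformat(data["start_date"]).isoformat()
    if int(data["notice_period_days"]) < 0:
        raise ValueError("notice_period_days must be non-negative")
    data["notice_period_days"] = int(data["notice_period_days"])
    return data
```

The point is that the model proposes and deterministic code disposes: anything that fails validation is rejected or retried rather than silently stored.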
Production readiness
- Implement retries, fallbacks, and safe defaults
- Ensure pipelines remain secure, scalable, and cost-efficient
- Contribute to deployment, versioning, and rollback strategies
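For a sense of what "retries, fallbacks, and safe defaults" means in practice, here is a minimal, hypothetical wrapper (names and defaults are illustrative only):

```python
import time

# Illustrative retry helper: exponential backoff on transient failures,
# and a safe default instead of a crash when all attempts are exhausted.
def with_retries(fn, attempts=3, base_delay=0.5, default=None):
    """Run fn with exponential backoff; return a safe default on final failure."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                return default  # safe default instead of propagating the error
            time.sleep(base_delay * (2 ** attempt))
```

In a serverless pipeline a wrapper like this sits around flaky external calls (OCR services, LLM APIs) so a single transient failure never poisons a whole batch.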
What We're Looking For
Must-have
- Strong Python in production systems (services/APIs, testing, clean architecture)
- Experience operating AWS serverless pipelines (Lambda + S3)
- Docker and containerized workloads
- Experience debugging automated pipelines in production
- Hands-on experience with LLMs (OpenAI / Azure OpenAI / Anthropic)
- Experience with structured outputs, prompt iteration, RAG, and evaluation methods
Nice-to-have
- Document AI experience (OCR, layout extraction, noisy PDFs)
- Evaluation-driven development (test sets, regression checks, quality metrics)
- Familiarity with TypeScript / Node
- Experience integrating REST APIs or orchestration frameworks
Tech Stack
Frontend: React (TypeScript)
Backend: Java (JEE)
AI Pipelines: Python (AWS Lambda)
Document Processing: Dockerized OCR/NLP components
Storage: AWS S3
LLMs: Azure-hosted ChatGPT / LLM APIs
Infrastructure: AWS + Azure (hybrid)
First 90 Days (What Success Looks Like)
Weeks 1–2: Pipeline takeover
- Set up local and staging runs
- Document architecture and operational playbooks
- Establish baseline metrics and observability
Weeks 3β6: New pipeline delivery
- Implement a new document analysis pipeline
- Create evaluation datasets and regression tests
- Deploy with monitoring and rollback capability
Weeks 7–12: Quality & GenAI foundation
- Improve extraction and Q&A accuracy
- Introduce evaluation-driven iteration
- Build the first GenAI kernel API for contract drafting and risk analysis
- Harden operations (cost controls, retries, runbooks)
If you enjoy building production-grade AI systems for complex documents, we'd love to hear from you.
Required languages
- English: B2 (Upper Intermediate)