Senior AI Engineer


Role in one line

Build the agent runtime (the brains of the platform) in Python: stateful LangGraph workflows, RAG, tool integrations, model routing, human-in-the-loop gates, and audit-grade logging.

Context

We are building a multi-agent AI platform for a regulated banking client. The agent runtime is a Python / FastAPI service exposing OpenAI-compatible endpoints, with agents implemented as LangGraph graphs and tools exposed over MCP. Sensitive workloads (KYC, PII) run on self-hosted open-weight models on EU infrastructure (Hetzner); less sensitive workloads route to frontier models via AWS Bedrock EU. The AI engineer owns the agent logic, retrieval, and the model / tool layer.

What you will work on

  • Design and build stateful agent graphs in LangGraph: multi-stage workflows, tool calling, and interrupt-based human-in-the-loop gates (e.g. the six-stage KYC assistant).
  • Build RAG pipelines: ingestion, chunking, embeddings, retrieval, and grounded Q&A with source citations (e.g. meeting analysis, search lens).
  • Implement structured-output patterns where the LLM emits validated JSON only and deterministic engines do the rest, keeping the model out of the render path for document generation.
  • Integrate tools over MCP (streamable-http) and wire agents to bank data sources behind the tool layer.
  • Work across hosted and self-hosted models (a per-task model router selects between Bedrock EU and on-prem open-weight models), with prompt design, evaluation, and cost / latency awareness.
  • Build audit-grade logging and provenance for every AI operation, with PII handled by hashing rather than plaintext.

Must-have

  • Strong Python: production-grade, typed, tested, not notebook-only.
  • LangChain and LangGraph: hands-on building stateful graphs, tool / function calling, and human-in-the-loop interrupts.
  • LLM application patterns: RAG, prompt design, structured output / JSON-schema validation, and evaluation of agent behaviour.
  • Retrieval stack: vector stores (pgvector and / or Qdrant) and embeddings.
  • Serving: FastAPI for agent endpoints; comfort with async Python.
  • Model access: working with hosted APIs (AWS Bedrock) and self-hosted open-weight models (Llama, Mistral, Qwen).

Nice-to-have

  • Banking / regulated experience; awareness of audit logging, PII handling, and EU data residency.
  • MCP (Model Context Protocol) tool integration.
  • ASR / speech-to-text pipelines (for meeting analysis).
  • MLOps for self-hosted inference (vLLM, Ollama, or similar) on EU infrastructure (Hetzner).
  • Workflow orchestration (e.g. Camunda) and deterministic document rendering (docxtpl, openpyxl).

Tech stack you will touch

Python, LangChain, LangGraph, FastAPI · pgvector / Qdrant, embeddings · AWS Bedrock EU and self-hosted open-weight LLMs on Hetzner · MCP servers (streamable-http) · Docker, Git, CI/CD.

Ways of working

  • Remote, distributed delivery team; English working language; scrum-light cadence.
  • Banking-grade rigor: every AI operation logged and auditable, human-in-the-loop by design, compliance built into the architecture, not bolted on.
  • Agents prepare, retrieve, draft, and propose; the trigger and the final action always stay with a human.

Important
As this is a Germany-based project, we are primarily seeking candidates based in Western Ukraine, with Vinnytsia and Lviv being our preferred locations.

Required languages

English B2 - Upper Intermediate
Ukrainian Native
Published 12 June