Senior AI Engineer (Voice / Multi-Agent / Low-Latency Systems)
Location: Remote-first, Europe (Spain or UK preferred)
Office: Barcelona available
Engagement: Full-time, long-term
Employment: In-house
About the Company
Our client is a European health technology company building an AI-driven virtual assistant for clinicians — a real-time system that helps healthcare professionals perform tasks faster and more accurately.
Their mission is to build intelligent, low-latency AI agents that communicate naturally through text and voice, supporting healthcare providers, payers, and pharmaceutical companies.
Core Stack: Python, FastAPI, Docker, Kubernetes, Argo CD, gRPC, Redis, Postgres, WebRTC
Focus Areas: Real-time AI agents, voice systems, event-driven architectures, multi-agent orchestration
Who We’re Looking For
This is a Senior-level (6–7+ years) AI Engineer role.
We’re looking for someone with deep experience building AI/ML systems: an engineer who ships real production systems rather than just experimenting with the latest APIs.
You are a software-first AI engineer: you understand architectures, reliability, observability, SLAs, and production constraints, not just prompts.
Mandatory background:
- Strong AI/ML expertise + solid software engineering fundamentals
- Experience that predates the LLM boom, demonstrating a foundational understanding of ML and software engineering beyond prompt engineering
- Proven experience with production AI/ML systems
- Previous startup experience (required)
Deep expertise in at least one of:
- Voice AI systems
- Multi-agent architectures
- Low-latency software systems
What You’ll Own
You will take ownership of critical AI infrastructure components, including:
AI Brain Service (End-to-End)
- Architecture design, SLAs, latency budgets
- Failure modes & reliability engineering
- Production rollouts and system evolution
Real-Time & Voice Systems
- Streaming text/voice
- VAD, barge-in, turn-taking, interruption handling
- WebRTC, SIP, LiveKit (strong plus)
- Latency optimization under real-world constraints
Multi-Agent Orchestration
- Planner–executor–critic patterns
- Role routing & shared memory
- Tool routers & coordination protocols
- Evaluation-driven iteration
RAG & Retrieval Engineering
- Hybrid retrieval & re-ranking
- Query rewriting, compression, caching
- Freshness & grounding
- Faithfulness and relevancy evaluation
Evaluation & Observability
- Pre-call validation & safety enforcement
- In-call tracing (prompts, tools, tokens, latency, cost)
- Post-call automated evals (hallucination, safety, regressions)
- OpenTelemetry instrumentation
- Shadow evals & drift detection
Minimum Qualifications
- 6–7+ years of ML/backend engineering in product environments
- Strong Python expertise (FastAPI, asyncio, pydantic)
- Experience shipping production AI/ML systems
- Experience in startup environments
- Real-time system experience
- Strong understanding of architecture, observability & reliability
Nice to Have
- DSPy, MIPRO, GEPA
- LLM evaluation frameworks / LLM-as-judge setups
- WebRTC/SRTP, SIP, TURN/SFU tuning
- GCP (Cloud Run, GKE, Pub/Sub, Vertex AI)
- Healthcare domain experience
Example Problems You’ll Solve
- Push median voice round-trip latency below 2 seconds while preserving turn-taking.
- Implement OpenTelemetry-first tracing for agent graphs with automated evaluation triggers.
- Improve RAG with hybrid retrieval + measurable faithfulness gains.
- Turn EHR integrations into reliable LLM tools.
Working Style
- Remote-first (EU time zones preferred)
- Barcelona office available
- Direct collaboration with Head of AI and Engineering
- Fast iteration, high ownership
- Strong architectural influence
Required Skills and Experience
- Python: 5 years
- FastAPI: 3 years
Required Languages
- English: B2 (Upper Intermediate)