Senior AI Engineer
We’re looking for a Senior AI Engineer for one of our clients — an AI consulting group that builds serious, production-grade systems (no gimmicks) for companies ready to operationalize artificial intelligence across their business. Their focus is custom AI solutions that integrate directly into operations and drive measurable results.
This is a hands-on role for someone who doesn’t just experiment with AI, but has built, deployed, and optimized real AI systems in complex production environments.
This role is for someone who takes full ownership of AI systems — from architecture to performance. You’ll work at the intersection of LLMs, distributed data processing, and backend systems, building scalable AI pipelines and knowledge systems in a high-impact environment where performance, reliability, and flexibility (including air-gapped deployments) matter.
Your responsibilities will include:
Design and build distributed data pipelines (Apache Beam, Python) for large-scale document and model processing;
Implement horizontally scalable LLM-powered extraction workflows across thousands of entities;
Build runner-agnostic pipelines that run across multiple environments (local, cloud, on-prem);
Design and implement entity resolution and cross-referencing systems with confidence scoring;
Architect and build graph-based Knowledge Bases (Dgraph or similar), including schema design and query optimization;
Develop pluggable LLM extractor frameworks (classification, relationships, completeness, document linking);
Build abstraction layers for LLM providers (Anthropic Claude, Google Gemini, local models via Ollama/vLLM);
Optimize LLM usage for cost, latency, and reliability in production environments;
Design and implement backend service layers (Go-based APIs) connecting AI pipelines with agent systems;
Ensure observability, debugging, and performance monitoring across distributed AI systems;
Package and deploy systems using Docker (docker-compose, containerized environments);
Collaborate directly with client teams on integrations, validation, and production rollout.
What we expect from you:
5–8+ years of experience in AI Engineering, Data Engineering, or Applied ML Infrastructure;
Expert-level Python;
Strong experience with distributed processing frameworks (Apache Beam preferred; Spark/Flink acceptable);
Hands-on experience integrating LLMs in production (prompting, orchestration, provider abstraction);
Experience with graph databases (Dgraph, Neo4j, JanusGraph, or similar);
Strong understanding of distributed systems, parallel processing, and large-scale data pipelines;
Solid SQL and data modeling fundamentals;
Experience with Docker and containerized deployments;
Experience with cloud platforms (GCP preferred, AWS/Azure acceptable);
Strong backend/system design mindset (performance, scalability, reliability);
Ability to work independently and take ownership without micromanagement;
Strong problem-solving skills in complex, ambiguous environments;
English level: B2+ (C1 preferred);
Availability to overlap with US business hours.
Nice to have:
Working experience with Go (for API/service layer contributions);
Experience with local LLM inference (Ollama, vLLM, TensorRT-LLM);
Experience deploying systems in air-gapped or restricted environments;
Background in defense, aerospace, or complex enterprise systems;
Familiarity with MBSE / SysML tooling and model-based data (XMI, Cameo, etc.);
Experience with document parsing pipelines (Docling, Unstructured, etc.);
Experience with orchestration tools (Airflow, Dagster, Prefect).
We offer:
Four-month full-time contract engagement;
Remote work with overlap in US business hours;
High-impact project with direct influence on production AI systems;
Close collaboration with client’s product and engineering teams;
Opportunity to build core infrastructure for real-world AI applications — not prototypes.