Machine Learning Engineer (Real-Time Inference Systems)

We are looking for a highly skilled and independent Machine Learning Engineer to lead the design and development of next-generation real-time inference services. This role involves building the core engine that powers large-scale algorithmic decision-making, serving billions of daily requests with extremely tight latency and performance requirements.

You will work at the intersection of machine learning, large-scale backend engineering, and business logic, creating robust, high-availability services capable of supporting massive traffic and dynamic real-time decisions. This is an opportunity to own mission-critical systems in a high-performance environment.

Responsibilities

  • Lead the design and development of low-latency algorithmic inference services handling billions of requests per day.
  • Build and scale real-time decision-making engines, combining ML models with business logic under strict SLAs.
  • Work closely with data science teams to deploy models seamlessly into production.
  • Develop systems for model versioning, shadow deployments, and A/B testing in real time.
  • Ensure high availability, scalability, and observability across production services.
  • Continuously optimize latency, throughput, and cost-efficiency using modern tooling and performance techniques.
  • Collaborate cross-functionally with teams across Algo, Infrastructure, Product, Engineering, and Business.
  • Work independently and take ownership of solutions end-to-end.

Requirements

  • B.Sc. or M.Sc. in Computer Science, Software Engineering, or a related field.
  • 5+ years building high-performance backend or ML inference systems.
  • Strong expertise in Python and experience with low-latency serving frameworks such as FastAPI, Triton, TorchServe, or BentoML.
  • Experience with scalable architectures, message queues (Kafka, Pub/Sub), and asynchronous processing.
  • Deep understanding of ML model deployment, feature parity, and real-time monitoring.
  • Strong experience with cloud environments (AWS, GCP, OCI) and Kubernetes.
  • Hands-on experience with in-memory & NoSQL databases such as Redis, Aerospike, or Bigtable.
  • Familiarity with observability stacks (Prometheus, Grafana, OpenTelemetry), alerting, and system diagnostics.
  • Strong ownership mindset and the ability to lead solutions independently.
  • Passion for performance, clean architecture, and building impactful production systems.

Required languages

English: C2 (Proficient)