Machine Learning Engineer (Real-Time Inference Systems)
We are looking for a highly skilled and independent Machine Learning Engineer to lead the design and development of next-generation real-time inference services. This role involves building the core engine that powers large-scale algorithmic decision-making, serving billions of daily requests with extremely tight latency and performance requirements.
You will work at the intersection of machine learning, large-scale backend engineering, and business logic, creating robust, high-availability services capable of supporting massive traffic and dynamic real-time decisions. This is an opportunity to own mission-critical systems in a high-performance environment.
Responsibilities
- Lead the design and development of low-latency algorithmic inference services handling billions of requests per day.
- Build and scale real-time decision-making engines, combining ML models with business logic under strict SLAs.
- Work closely with data science teams to deploy models seamlessly into production.
- Develop systems for model versioning, shadow deployments, and A/B testing in real time.
- Ensure high availability, scalability, and observability across production services.
- Continuously optimize latency, throughput, and cost-efficiency using modern tooling and performance techniques.
- Collaborate with the Algo, Infrastructure, Product, Engineering, and Business teams.
- Work independently and take ownership of solutions end-to-end.
Requirements
- B.Sc. or M.Sc. in Computer Science, Software Engineering, or a related field.
- 5+ years building high-performance backend or ML inference systems.
- Strong expertise in Python and experience with low-latency serving frameworks (FastAPI, Triton, TorchServe, BentoML).
- Experience with scalable architectures, message queues (Kafka, Pub/Sub), and asynchronous processing.
- Deep understanding of ML model deployment, feature parity, and real-time monitoring.
- Strong experience with cloud environments (AWS, GCP, OCI) and Kubernetes.
- Hands-on experience with in-memory & NoSQL databases such as Redis, Aerospike, or Bigtable.
- Familiarity with observability stacks (Prometheus, Grafana, OpenTelemetry), alerting, and system diagnostics.
- Strong ownership mindset and the ability to lead solutions independently.
- Passion for performance, clean architecture, and building impactful production systems.
Required languages
- English: C2 (Proficient)
Published 17 November