Senior MLOps Engineer
Job Description
Technical Skills:
- Proficiency in Python for ML development; familiarity with additional languages like Clojure is a plus.
- Expertise in cloud platforms (AWS, GCP) and data warehouses like Snowflake or BigQuery.
- Strong knowledge of MLOps frameworks (e.g., Kubeflow, MLflow) and DevOps tools (e.g., Jenkins, GitLab, Flux).
- Experience with containerization (Docker) and orchestration (Kubernetes).
- Experience with infrastructure-as-code tools like Terraform.
Machine Learning Knowledge:
- Solid understanding of machine learning principles, including model evaluation, explainability, and retraining workflows.
- Hands-on experience with ML frameworks such as TensorFlow or PyTorch.
Big Data Handling:
- Proficiency in SQL/NoSQL databases and distributed computing systems like Dataproc, EMR, Spark, and Hadoop.
Soft Skills:
- Strong communication skills to collaborate across multidisciplinary teams.
- Problem-solving mindset with the ability to work in agile environments.
Experience:
- At least 5 years in platform, software, or MLOps engineering roles.
- Proven track record of deploying scalable ML solutions in production environments.
Job Responsibilities
Model Deployment and Operations:
- Deploy, monitor, and maintain machine learning models in production environments.
- Automate model training, retraining, versioning, and governance processes.
- Monitor model performance, detect drift, and ensure scalability and reliability of ML workflows.
Infrastructure and Pipeline Management:
- Design and implement scalable MLOps pipelines for data ingestion, transformation, and model deployment.
- Build infrastructure-as-code solutions using tools like Terraform to manage cloud environments (AWS, GCP).
Collaboration with Teams:
- Work closely with data scientists to operationalize machine learning models.
- Collaborate with software engineers to integrate ML systems into broader platforms.
Cloud and Big Data Expertise:
- Utilize cloud services from AWS, GCP, and Snowflake for scalable data storage and processing.
DevOps Integration:
- Implement CI/CD pipelines and automations to streamline ML model deployment.
- Use containerization tools like Docker and orchestration platforms like Kubernetes for scalable deployments.
- Use observability platforms to monitor pipeline health and the operational health of model production, delivery, and execution.
Department/Project Description
Our team consists of 100+ engineers, designers, data scientists, implementation specialists, and product people. We work in small interdisciplinary teams, closely with creative agencies, media agencies, and our customers, to develop and scale our leading digital advertising optimization suite, which delivers amazing outcomes for brands and audiences.
Our platforms are built with Python, React, and Clojure, are deployed using CI/CD, rely heavily on automation, and run on AWS, GCP, k8s, Snowflake, BigQuery, and more. We serve 9 petabytes and 77 billion objects annually, optimize thousands of campaigns to maximise ROI, and deliver 20 billion ad impressions across the globe. You’ll play a leading role in significantly scaling this further.
As our first Machine Learning Operations (MLOps) Engineer, you will play a pivotal role in bridging the gap between platform engineering, data science, and software engineering, building systems that drive the deployment, monitoring, and scalability of machine learning models. You will design and implement pipelines, automate workflows, and optimise model performance in training and production environments. You’ll lead the creation of processes, the implementation of tools, and the development of solutions that mature how we integrate machine learning into our production systems, while maintaining reliability, security, and efficiency. You’ll also play a leading role in driving continuous improvement in model lifecycle management, from development to deployment and monitoring.
Required languages
| English | B2 - Upper Intermediate |