MLOps Engineer

We are seeking an experienced MLOps Engineer for one of our projects to drive the optimization and scalability of its machine learning infrastructure. You will play a critical role in building and maintaining robust ML pipelines, ensuring seamless model deployment, and implementing best practices for ML operations at scale.

Key Responsibilities

Model Deployment & Infrastructure

  • Optimize Amazon SageMaker model deployments for performance, cost-efficiency, and reliability
  • Design and implement auto-scaling strategies for ML inference endpoints to handle varying traffic loads
  • Build and maintain AWS-based ML infrastructure using best practices for security, monitoring, and cost optimization

CI/CD & Automation

  • Enhance and maintain GitHub Actions workflows for automated ML model building, testing, and deployment
  • Implement automated quality gates and validation checks in the ML deployment pipeline
  • Develop infrastructure-as-code solutions for consistent and reproducible ML environments

Inference Optimization

  • Research, experiment with, and implement various inference methods including:
    • Asynchronous inference for high-throughput scenarios
    • Batch inference for large-scale data processing
    • Real-time inference optimization
  • Analyze and optimize inference latency, throughput, and cost trade-offs

Training & Model Lifecycle Management

  • Design and implement automated training and re-training pipelines
  • Build systems for continuous model improvement based on new data and performance metrics
  • Manage model versioning, artifact storage, and deployment rollback strategies

Model Monitoring & Quality Assurance

  • Implement comprehensive model evaluation frameworks and automated testing
  • Set up model drift detection systems to monitor data and concept drift
  • Build alerting and monitoring systems for model performance degradation
  • Develop automated model validation and A/B testing capabilities

Required Qualifications

  • Experience: 3+ years of hands-on experience in MLOps, DevOps, or related fields
  • Cloud Platforms: Strong expertise with AWS services, particularly Amazon SageMaker, EC2, S3, Lambda, and CloudWatch
  • Programming: Proficiency in Python and experience with ML frameworks (TensorFlow, PyTorch, scikit-learn)
  • CI/CD: Hands-on experience with GitHub Actions, Docker, and containerization technologies
  • Infrastructure: Experience with Infrastructure as Code tools (Terraform, CloudFormation, or CDK)
  • ML Lifecycle: Understanding of the complete ML lifecycle from data ingestion to model deployment and monitoring

Technical Skills

  • Cloud Services: AWS SageMaker, EC2, S3, Lambda, CloudWatch, IAM, VPC
  • Containerization: Docker, Kubernetes, Amazon EKS
  • Programming Languages: Python, Bash/Shell scripting
  • ML Tools: MLflow, Weights & Biases, Amazon SageMaker Studio
  • CI/CD Tools: GitHub Actions, Jenkins, GitLab CI
  • Infrastructure: Terraform, AWS CloudFormation, AWS CDK
  • Monitoring: CloudWatch, Prometheus, Grafana
  • Databases: Experience with both SQL and NoSQL databases

Required languages

English B2 - Upper Intermediate
Published 16 September