MLOps Engineer
We are seeking an experienced MLOps Engineer for one of our projects to drive the optimization and scalability of machine learning infrastructure. You will play a critical role in building and maintaining robust ML pipelines, ensuring seamless model deployment, and implementing best practices for ML operations at scale.
Key Responsibilities
Model Deployment & Infrastructure
- Optimize Amazon SageMaker model deployments for performance, cost-efficiency, and reliability
- Design and implement auto-scaling strategies for ML inference endpoints to handle varying traffic loads
- Build and maintain AWS-based ML infrastructure using best practices for security, monitoring, and cost optimization
CI/CD & Automation
- Enhance and maintain GitHub Actions workflows for automated ML model building, testing, and deployment
- Implement automated quality gates and validation checks in the ML deployment pipeline
- Develop infrastructure-as-code solutions for consistent and reproducible ML environments
Inference Optimization
- Research, experiment with, and implement various inference methods, including:
  - Asynchronous inference for high-throughput scenarios
  - Batch inference for large-scale data processing
  - Real-time inference optimization
- Analyze and optimize inference latency, throughput, and cost trade-offs
Training & Model Lifecycle Management
- Design and implement automated training and re-training pipelines
- Build systems for continuous model improvement based on new data and performance metrics
- Manage model versioning, artifact storage, and deployment rollback strategies
Model Monitoring & Quality Assurance
- Implement comprehensive model evaluation frameworks and automated testing
- Set up model drift detection systems to monitor data and concept drift
- Build alerting and monitoring systems for model performance degradation
- Develop automated model validation and A/B testing capabilities
Required Qualifications
- Experience: 3+ years hands-on in MLOps, DevOps, or related fields
- Cloud Platforms: Strong expertise with AWS services, particularly Amazon SageMaker, EC2, S3, Lambda, and CloudWatch
- Programming: Proficiency in Python and experience with ML frameworks (TensorFlow, PyTorch, scikit-learn)
- CI/CD: Hands-on experience with GitHub Actions, Docker, and containerization technologies
- Infrastructure: Experience with Infrastructure as Code tools (Terraform, CloudFormation, or CDK)
- ML Lifecycle: Understanding of the complete ML lifecycle from data ingestion to model deployment and monitoring
Technical Skills
- Cloud Services: AWS SageMaker, EC2, S3, Lambda, CloudWatch, IAM, VPC
- Containerization: Docker, Kubernetes, Amazon EKS
- Programming Languages: Python, Bash/Shell scripting
- ML Tools: MLflow, Weights & Biases, Amazon SageMaker Studio
- CI/CD Tools: GitHub Actions, Jenkins, GitLab CI
- Infrastructure: Terraform, AWS CloudFormation, AWS CDK
- Monitoring: CloudWatch, Prometheus, Grafana
- Databases: Experience with both SQL and NoSQL databases
Required languages
- English: B2 (Upper Intermediate)
Average salary range of similar jobs in analytics: $2000-5000