DevOps Engineer - AWS / ML Infrastructure

We are seeking a skilled and proactive DevOps Engineer to join our team. This role is focused on developing and managing scalable infrastructure and deployment workflows in AWS to support data-driven and machine learning applications. 

You will play a key role in building cloud-native systems with a strong emphasis on infrastructure as code, containerization, and CI/CD pipelines. 

 

A solid understanding of AWS services and Python is essential, particularly for authoring infrastructure using AWS CDK. Experience with SageMaker and knowledge of ML systems are strong advantages.

 

Qualifications:

  • 3–5 years of experience in DevOps, cloud infrastructure, or SRE roles.
  • Proficient in AWS services, especially CDK, Lambda, EC2, S3, SageMaker, and CloudWatch.
  • Experience with Infrastructure as Code (IaC) tools like Terraform or CloudFormation.
  • Strong experience with Python for scripting and infrastructure automation.
  • Hands-on experience with containerization (Docker).
  • Experience building and maintaining CI/CD pipelines.

 

Preferred Qualifications:

  • AWS Certifications (e.g., DevOps Engineer, Solutions Architect, or Machine Learning Specialty).
  • Background in software engineering or ML/AI infrastructure.

 

Key Responsibilities:

Infrastructure Development & Automation:

  • Design, provision, and manage AWS infrastructure using AWS CDK and CloudFormation.
  • Develop secure, scalable, and cost-efficient infrastructure to support machine learning and analytics workloads.
  • Implement and manage cloud-native services such as EC2, ECS, Lambda, S3, RDS, SageMaker, and Bedrock.
  • Apply and enforce best practices for security, compliance, and disaster recovery.

CI/CD & Deployment Automation:

  • Design and maintain CI/CD pipelines for application and model deployment using tools like CodePipeline, CodeBuild, GitHub Actions, or similar.
  • Automate testing, deployment, and rollback procedures to support continuous integration and delivery.

Containerization & Orchestration:

  • Build and manage Docker containers for microservices and ML applications.
  • Support deployment on ECS or Lambda with container-based runtimes.
  • Implement image build, versioning, and artifact management workflows.

Machine Learning & Model Operations Support:

  • Collaborate with ML engineers to deploy, monitor, and maintain models in SageMaker.
  • Integrate infrastructure for pre-processing, inference, and retraining pipelines.
  • Support model performance monitoring, logging, and metrics collection.

Monitoring, Observability & Logging:

  • Set up monitoring and alerting using CloudWatch, Datadog, and other observability tools.
  • Troubleshoot and resolve infrastructure, deployment, and performance issues proactively.

Collaboration & Documentation:

  • Work closely with software, ML, and data teams to support DevOps best practices across the ML lifecycle.
  • Maintain clear documentation for infrastructure, deployments, and operations processes.
  • Participate in code reviews and architectural discussions.
Published 16 June