Senior Site Reliability Engineer (SRE) – AWS and GCP

Client

Our client is revolutionizing the retail direct store delivery model by addressing key challenges like communication gaps, out-of-stocks, invoicing errors, and price inconsistencies. Through innovative technology and strong partnerships, they help boost sales, increase profits, and enhance customer loyalty.

 

Position overview

We are seeking a skilled mid-level to senior Site Reliability Engineer (SRE) with hands-on experience in both AWS and Google Cloud Platform (GCP) to join a fast-paced, innovative project team. This role requires proactive monitoring, automation, and optimization of cloud infrastructure to ensure high availability, scalability, and security of mission-critical retail solutions.

The candidate should be available for at least four hours of overlapping work time with the New York time zone to ensure smooth collaboration and participation in team activities.

 

Responsibilities

  • Design, build, and operate scalable and reliable systems on AWS and GCP cloud platforms
  • Develop and maintain automation scripts to improve deployment, monitoring, and incident response
  • Ensure that system availability, latency, and overall reliability meet service level objectives (SLOs)
  • Collaborate with development and operations teams to implement best practices for security, monitoring, and infrastructure management
  • Proactively troubleshoot and resolve infrastructure incidents and performance bottlenecks
  • Participate in on-call rotations and incident management processes
  • Continuously improve system architecture and automation to reduce manual intervention and improve efficiency
  • Support CI/CD pipelines and infrastructure as code (IaC) initiatives

 

Requirements

  • 4+ years of experience in Site Reliability Engineering, DevOps, or Cloud Engineering roles
  • Strong hands-on experience with AWS services (EC2, S3, VPC, Lambda, CloudWatch, IAM, etc.)
  • Proven expertise with Google Cloud Platform (Compute Engine, GKE, Cloud Storage, IAM, Cloud Monitoring and Cloud Logging (formerly Stackdriver), etc.)
  • Skilled in scripting and automation tools (Python, Bash, Terraform, Ansible, or similar)
  • Experience managing container orchestration platforms such as Kubernetes (for example, GKE)
  • Familiarity with CI/CD tools such as Jenkins, GitLab CI, or CircleCI
  • Solid understanding of networking, security best practices, and cloud infrastructure design
  • Comfortable working in agile, collaborative team environments
  • Excellent communication skills and ability to work with distributed teams
  • Availability for a minimum of 4 hours overlap with New York time zone for meetings and collaboration

Required skills and experience

SRE: 4 years
AWS: 4 years
GCP (Google Cloud Platform): 4 years

Required languages

English B2 - Upper Intermediate
Published 5 November