Site Reliability Engineer

We are currently seeking an experienced Site Reliability Engineer to join our dynamic Platform Tribe.

 

Key Responsibilities:

  • Manage day-to-day alerts, system checks, and issue escalation as necessary.
  • Provide 24x7 on-call support for critical SaaS events.
  • Document issues and remediation steps.
  • Proactively create monitors within the EKS/K8s ecosystem.
  • Deploy to EKS/K8s clusters using Terraform and Helm/Flux.
  • Enhance infrastructure health by implementing checks and scripts to address known issues.
  • Maintain and develop deployment code.
  • Implement/integrate new technologies into our Cloud Infrastructure.
  • Collaborate with other teams to provide top-notch support and assistance.
  • Plan deployments and updates with the customer in mind, keeping impact to a minimum.
  • Conduct root cause analysis (RCA) and take corrective actions to prevent issues from recurring.
  • Assign alert-related actions to the appropriate team after investigation.
  • Handle support requests for environment-specific actions.

 

Requirements:

  • Strong experience with incident analysis (RCA, postmortems).
  • Proficiency in Kubernetes (deployment, scaling, troubleshooting).
  • Familiarity with AWS, Terraform, Docker, CI/CD.
  • Experience with monitoring tools like Datadog, Prometheus, and Grafana, and with logging solutions like the ELK Stack (Elasticsearch, Logstash, Kibana) or AWS CloudWatch.
  • Strong understanding of networking concepts and protocols.
  • Proficiency in at least one scripting or programming language (e.g., Python, JavaScript/Node.js, Go).
  • Experience with GitOps deployment tools like FluxCD/ArgoCD.
  • Proficiency in Git or other version control systems.
  • Familiarity with incident response and management tools like PagerDuty, Opsgenie, or VictorOps.
  • Ownership, proactiveness, persistence, and passion for maintaining a high-traffic online platform.

 

What We Offer:

  • Quarterly Bonuses based on transparent and systematic evaluation.
  • Flexible Work Schedule.
  • Remote Work Option for Enhanced Flexibility.
  • Comprehensive Medical Insurance for you and your significant other.
  • Financial Support for Life Events.
  • Unlimited Paid Vacation.
  • Unlimited Paid Sick Leave.
  • Reimbursement for professional development courses and training.

 


 

Published 21 May