Site Reliability Engineer for the SaaS project Offline
Required Qualifications
3 to 5 years of strong, demonstrable experience in DevOps
Scripting and programming skills in Ruby or Python
Production experience with Kubernetes, Istio and Terraform
Experience deploying to and managing services on AWS and/or Azure
Experience with ELK is a plus
Experience with service and application provisioning, deployment, and orchestration
Capable of embracing and following guidance from team members
Detail-oriented with excellent written and verbal communication skills in English
Strong software and cloud computing security skills
Primary Responsibilities
Create, execute, and manage SRE (Site Reliability Engineering) processes and procedures
Work closely with multiple teams to ensure that every software release meets security, availability, and performance requirements
Ensure all SaaS services (public and private) are up and running to maintain the SLA
Create and optimize response plans for incidents that may occur across different platforms
Monitor platforms and help engineers build error-free, performant services
Debug production issues across all platforms, document them, and deliver appropriate solutions
Look for ways to simplify and automate cloud operations
Work closely with the internal development team to continually build and refine platforms
Quickly troubleshoot complex requests and issues
Identify opportunities for improved monitoring and alerting to detect issues more rapidly
The job ad is no longer active
Job unpublished on 19 April 2021