Site Reliability Engineer (Azure)

ELEKS Verified Employer

ELEKS is looking for a Site Reliability Engineer (Azure).

 

ABOUT CLIENT

The Shared Support team is a specialized, advanced group that provides high-quality support services to various small and midsize organizations worldwide. Our department’s culture prioritizes proactiveness, professionalism, transparency, continuous learning, flexibility, and respect. We aim to establish long-term relationships with our customers, offering dependable support and development practices to improve system reliability and adapt to evolving business needs.

 

REQUIREMENTS

  • 4+ years of relevant experience
  • Strong knowledge of Azure
  • Hands-on experience with maintaining SQL databases
  • Experience with Kubernetes (deployment, scaling, maintenance)
  • Experience with Linux administration
  • Experience with monitoring tools (CloudWatch, Datadog)
  • Web server configuration skills (Nginx, Apache)
  • Experience with CI/CD
  • Strong network troubleshooting skills
  • Experience with Terraform or CloudFormation (as a plus)
  • Experience with CI/CD pipeline tools such as Jenkins, GitLab CI, or similar (as a plus)
  • Upper-Intermediate English, both written and spoken

 

RESPONSIBILITIES

  • Manage complex tickets for timely resolution in accordance with SLA
  • Investigate and replicate customer-reported issues, collaborating with cross-functional teams for resolution
  • Proactively install, configure, and monitor IT systems
  • Perform daily maintenance: backup and restore, new version deployment and rollback, periodic system cleanup, OS and component upgrades, security patching, etc.
  • Carry out task automation and continuous improvement, ensuring software system reliability
  • Detect, diagnose, and resolve incidents promptly, and conduct post-incident analysis for continuous improvement
  • Implement and maintain robust monitoring and alerting systems (APM, DB monitoring, and K8s), setting up alerts for proactive response
  • Continually analyze system performance in EKS for efficiency, optimizing application code, database queries, and infrastructure settings
  • Use Infrastructure as Code (IaC) tools for defining and provisioning infrastructure, ensuring consistency and reproducibility
  • Create and maintain the solution documentation, templates, runbooks, and DRPs
  • Be available for on-call shifts and ready to help restore the system when needed (compensated with a day off or overtime pay for the dispatch)
  • Conduct knowledge-sharing sessions for junior staff


 

Required skills experience

Azure 2 years
Kubernetes 2 years
MySQL 2 years
Linux 2 years
CI/CD 2 years
Terraform 2 years

Required languages

English B2 - Upper Intermediate
Ukrainian Native
Published 1 April