Site Reliability Engineer (SRE)

We are looking for a skilled Site Reliability Engineer (SRE) to join our team and help build and maintain reliable, scalable, and efficient infrastructure solutions. This is an exciting opportunity to work in a fast-paced environment where your expertise will directly impact the performance and stability of our applications and services.

Responsibilities:

  • Collaborate with R&D engineers to coordinate and execute production-related operations.
  • Design, implement, and maintain scalable and reliable infrastructure solutions.
  • Develop and deploy monitoring, alerting, and logging systems to proactively identify and mitigate operational issues.
  • Build an SRE dashboard with KPIs to measure application reliability.
  • Conduct capacity planning and performance tuning to optimize system performance and resource utilization.
  • Automate repetitive tasks and processes to improve efficiency and reduce operational overhead.
  • Participate in incident response and resolution, including root cause analysis and post-mortem reviews.
  • Continuously evaluate and adopt new technologies to enhance infrastructure and operations.
  • Create and maintain documentation, runbooks, and knowledge base articles covering system configurations, procedures, and best practices.

Requirements:

  • 5+ years of experience as an SRE, with a strong passion for technology and highly reliable solutions.
  • Proven experience managing large-scale distributed systems, with a strong understanding of scalability and reliability principles.
  • In-depth knowledge of operating systems, networking, and cloud services.
  • Experience with observability and monitoring tools (mandatory).
  • Hands-on experience with Git, virtualization, containers, Docker, and Kubernetes.
  • Experience with cloud providers (GCP preferred; AWS or Azure is a plus).
  • Proficiency in programming languages such as Python, Go, or Java.
  • Strong communication skills, both verbal and written, with the ability to adapt messaging for different audiences.
  • Ability to grasp new technologies quickly, prioritize tasks, and work on multiple responsibilities simultaneously.
  • Excellent problem-solving skills and the ability to thrive in a fast-paced, dynamic environment.

Nice to Have:

  • Experience with Infrastructure as Code (IaC) solutions.

If you’re passionate about building high-reliability systems and working in a dynamic environment, we’d love to hear from you!

Published 26 March