SRE (Site Reliability Engineer)
We work with iGaming operators of all sizes to help them expand into new markets or strengthen their existing brands. As a casino software company, we provide solutions that help build outstanding brands and achieve business goals. We are now looking for an SRE to join our team.
Your responsibilities:
• Ensure reliability, availability, and performance of systems and services.
• Define and maintain SLOs and SLIs, ensuring system reliability meets business needs.
• Develop and improve observability (monitoring, logging, tracing).
• Optimize system performance and scalability in collaboration with DevOps and development teams.
• Manage and improve CI/CD pipelines for deployment and infrastructure.
• Conduct root cause analysis and implement proactive solutions to minimize downtime.
• Implement incident management processes and ensure fast resolution of critical issues.
• Perform capacity planning and performance tuning for infrastructure scalability.
• Optimize infrastructure costs while maintaining performance and reliability.
• Participate in post-incident reviews and contribute to a blameless culture.
• Educate and mentor teams on best practices in reliability engineering.
What we expect from you:
• 1-2+ years of experience in a similar SRE, DevOps, or Linux System Administrator role (Junior+ or Middle level).
• Experience with cloud technologies: AWS.
• Experience with bare-metal environments: Linux servers.
• Proficiency in monitoring and logging tools (e.g., Prometheus, Grafana, ELK Stack, New Relic, Zabbix, server logs).
• Basic coding skills in PHP, Bash, or Python.
• Deep understanding of Linux systems and networking, or of AWS/cloud environments.
• Experience with databases such as MySQL, PostgreSQL, and Aurora.
• Strong troubleshooting and problem-solving skills.
• Excellent communication and collaboration abilities.
• Experience with SRE methodologies: SLO, SLI, SLA, error budgets.
Nice to have:
• Understanding of incident management (e.g., PagerDuty, Opsgenie, VictorOps).
• Experience with distributed tracing (e.g., OpenTelemetry, Jaeger, Zipkin).
• Understanding of CI/CD processes (e.g., Jenkins, GitLab, GitHub Actions).
• Knowledge of self-healing mechanisms and auto-remediation strategies.
• Understanding of security best practices (e.g., HashiCorp Vault, AWS Secrets Manager).
• Experience with automation of infrastructure management and deployment processes.
What we offer:
- Working hours: 09:00/10:00 to 17:00/18:00 (Kyiv time) (Mon-Fri);
- Remote work format;
- Timely payment of wages and official employment;
- A friendly team and a pleasant atmosphere free of pressure and stress.
We believe in the importance of unlocking the inner potential of each team member, and we have an open and democratic system of work organization.
We look forward to welcoming you to our team!
The job ad is no longer active