Lead Site Reliability Engineer
Description
As an SRE Lead, you will design, implement, and maintain our cloud observability infrastructure and platform. You will also work closely with development teams to streamline deployment processes and ensure system reliability.
We are looking forward to your resume!
Requirements
- 5+ years of experience in DevOps/SRE roles;
- Strong experience with AWS cloud services;
- Advanced knowledge of Kubernetes and container orchestration;
- Solid understanding of DevOps principles and practices;
- Experience with Helm chart development and maintenance;
- Strong proficiency with monitoring, logging, alerting, cloud, platform, OS, CI/CD, repository storage, and management tools;
- Strong scripting skills (Bash, Python, or similar);
- Excellent problem-solving and communication skills.
Responsibilities
- Manage SRE teams;
- Foster the technical excellence of teammates;
- Implement and maintain monitoring solutions using Prometheus, VictoriaMetrics, and Grafana to identify and address performance issues proactively;
- Manage logging infrastructure using Fluentd, Fluent Bit, Elasticsearch, and Kibana, ensuring efficient log collection, analysis, and visualization;
- Configure and manage alerting systems like Alertmanager and Opsgenie to respond promptly to critical incidents and minimize downtime;
- Control utilization of AWS cloud services, and design, deploy, and manage scalable and highly available infrastructure;
- Apply expertise in AWS services such as EC2, VPC, CloudWatch, and IAM to ensure optimal performance and security of our cloud-based applications;
- Deploy and manage containerized applications using Kubernetes, Docker, and Helm, ensuring smooth orchestration and scalability;
- Administer systems running the Debian operating system, troubleshooting and optimizing system performance;
- Implement and manage CI/CD pipelines using Jenkins and ArgoCD for seamless software delivery and infrastructure automation;
- Manage code repositories using GitLab and Git, ensuring version control and collaboration among team members;
- Collaborate with cross-functional teams using Jira and Confluence for effective project management and knowledge sharing.
Nice to Have
- Experience with other cloud providers (GCP, Azure);
- Security certifications (AWS, CKS, etc.);
- Experience with service mesh technologies.
Benefits
Why Join Us?
- Be part of the international iGaming industry: Work with a top European solution provider and shape the future of online gaming;
- A Collaborative Culture: Join a supportive and understanding team;
- Competitive salary and bonus system: Enjoy additional rewards on top of your base salary;
- Unlimited vacation & sick leave: Because we prioritize your well-being;
- Professional Development: Access a dedicated budget for self-development and learning;
- Healthcare coverage: Available for employees in Ukraine and compensation across the EU;
- Mental health support: Free consultations with a corporate psychologist;
- Language learning support: We cover the cost of foreign language courses;
- Celebrating Your Milestones: Special gifts for life's important moments;
- Flexible working hours: Start your day anytime between 9:00-11:00 AM;
- Flexible Work Arrangements: Choose between remote, office, or hybrid work;
- Modern Tech Setup: Get the tools you need to perform at your best;
- Relocation support: Assistance provided if you move to one of our hubs.
Required domain experience
| Gambling | 1.5 years |
| Gamedev | 1.5 years |
| Security | 1.5 years |
Required languages
| English | B2 - Upper Intermediate |
| Ukrainian | Native |