Site Reliability Engineer/DevOps (offline)

We are looking for a Site Reliability Engineer to join the FireX Squad (Platform Tribe, SRE Stream), which is responsible for automation and for maintaining high-load infrastructure.

To succeed in the advertised role, you have:
― Strong experience with issue handling (RCA and postmortem practices).
― Strong understanding of Kubernetes (K8s), including deployment, scaling, troubleshooting, and managing containerized applications.
― Proficiency in AWS services: specifically, expertise in Amazon Elastic Kubernetes Service (EKS), EC2, RDS, CloudFront, and other relevant services.
― Infrastructure as Code (IaC): Terraform is a must-have.
― Containerization technologies: knowledge of Docker, including creating and managing Docker images and containers.
― CI/CD: familiarity with continuous integration and continuous deployment tools like Jenkins, GitLab CI/CD, or GitHub Actions.
― Monitoring and observability: experience with monitoring tools like DataDog, Prometheus, and Grafana, and with logging solutions like Elasticsearch, Logstash, and Kibana (ELK Stack) or AWS CloudWatch.
― Networking: strong understanding of network concepts like DNS, load balancing, and firewalls, as well as network protocols like TCP/IP, HTTP, and HTTPS; gRPC is a big plus.
― Scripting and programming languages: proficiency in at least one language (e.g., Python, Node.js, Go).
― Configuration management: experience with tools like FluxCD/ArgoCD.
― Version control systems: proficiency in using Git or other version control systems.
― Incident management: familiarity with incident response and management tools like PagerDuty, Opsgenie, or VictorOps.
― Strong problem-solving and troubleshooting skills: the ability to diagnose and resolve complex technical issues.
― Strong ownership, proactiveness, persistence, and passion for maintaining one of the biggest online gambling platforms.

Would be beneficial to know:
― FluxCD/ArgoCD
― Ticket systems: Jira
― Understanding of event-driven architecture
― Understanding of ITIL frameworks
― Security best practices: knowledge of security principles, including securing applications, infrastructure, and data
― Cilium
― Terragrunt experience

Key responsibilities of the role:
― Manage alerts day to day, check systems, and escalate issues as necessary.
― Be part of a team that provides 24×7 on-call support for critical SaaS events.
― Be available in emergencies when other team members are unavailable or need help.
― Document issues and remediation steps.
― Proactively create appropriate monitors in the EKS/K8s ecosystem.
― Deploy to EKS/K8s clusters using Terraform and Helm/Flux.
― Improve existing infrastructure health by implementing checks and scripts to correct known issues.
― Maintain and develop deployment code.
― Implement and integrate new technologies in our cloud infrastructure.
― Collaborate with other teams and departments to provide the highest level of support and assistance.
― Apply a real customer focus when planning deployments and updates: keep the customer at the forefront of your mind and consider the impact on them before making changes.
― Work closely on solutions with the Support, Customer Success, Migration, and Professional Services teams to provide a best-in-class SaaS service to our customers.
― Perform RCA and take the necessary corrective actions to prevent issues from recurring.
― Create and assign alert-related actions to the appropriate team after the investigation.
― Handle support requests for environment-specific actions.

What you get in return:
πŸ† Competitive market salary
πŸ† Performance-based bonus system
πŸ† Paid vacation leave
πŸ† Paid sick leave
πŸ† Special Life Event financial support
πŸ† Autonomy in terms of flexible schedule and absence of micromanagement
πŸ† Remote work
πŸ† Health insurance
πŸ† Team of enthusiastic and ambitious professionals
πŸ† Sponsored professional trainings when applicable
πŸ† Employee Referral bonus program
πŸ† Dynamic company with a massive plan for further growth
Enough has been said: now let's talk about how you can contribute to the success of the Platform Tribe and Playson. Apply now!

About Playson

🤘 Playson is a B2B game provider with 10+ years of experience on the market.
Since 2012, we have been building worldwide recognition in the industry. Today, our main focus is on European markets, and we operate in 20+ jurisdictions.

Playsoners reach their targets from Ukraine, Bratislava, and Malta, as well as from other EU countries on a remote basis. We do not limit ourselves and always seek better solutions.

🇺🇦 Playson vs the russian military invasion of sovereign Ukraine
We have always been supportive through various political and socio-economic crises. In response to the unprecedented military invasion of Ukraine by the russian federation, Playson is on a mission to help the Ukrainian Military Forces, local volunteers, and the cyber forces community fight back and protect Ukraine's sovereignty by all possible means.

In the meantime, the safety of our employees and their families remains a high priority for us. We have launched a special social package program that aims to:
➟ Relocate employees and their families to safe places in western Ukraine
➟ Financially support these employees in Ukraine
➟ Launch several Playson-sponsored locations where our employees and their families can stay in a safe place with all the amenities
➟ Establish a new hub in Slovakia, EU
➟ Relocate employees and/or their families, if they wish, to our new hub in Slovakia with local legal and financial guidance
➟ Help those willing to volunteer to combine it with work with no loss of income
➟ Support mental health by having a member of the Ukrainian Association of Psychoanalysis available for online one-on-one consultations

We stand by Ukraine!
Everything will be Ukraine 💙💛

Company website:
https://playson.com/

DOU company page:
https://jobs.dou.ua/companies/playson/

The job ad is no longer active