Junior+ Site Reliability Engineer

NextChallenge Top Employer

Role Overview:

Our client is expanding the engineering team responsible for ensuring the stability and predictable behaviour of their distributed services and platforms. This role involves hands-on production work, including monitoring, incident response, troubleshooting, and continuous improvements that increase platform reliability over time.

You will work as part of an SRE shift rotation covering late-evening and night hours, ensuring end-to-end ownership of incidents — from identifying user impact to post-incident follow-ups and preventive improvements.

Key Responsibilities:

— Working in shift-based operations: monitoring, alert response, incident handling, escalation when needed;

— Participating in incident handling: initial classification, technical investigation, coordination with engineering teams, and follow-up improvements;

— Developing and refining observability across platforms (metrics/alerts, dashboards, logs);

— Reducing operational toil: small automation, runbooks, and repeatable processes (the “make it easier next time” mindset);

— Collaborating with development teams to improve production readiness (basic reliability practices, cleaner incident follow-ups).

Ideal profile for the position:

Core skills:

— Good Linux skills in production environments (debugging basics, system services, logs, performance basics);

— Solid understanding of networking fundamentals (TCP/IP, DNS, HTTP, load balancing basics, TLS fundamentals);

— Experience with containers and image lifecycle basics (Docker or compatible runtimes);

— Ability to troubleshoot across application, network, and infrastructure layers using logs/metrics and simple tools (curl, basic traffic/log analysis; scripting is a plus);

— Basic familiarity with observability: metrics and alerting, dashboards, logging (any modern stack is fine).

Experience:

— 1+ year in a production-focused role (Ops / Support L2+ / DevOps / Junior SRE — what matters is real production exposure);

— Participation in production incidents (triage, investigation, escalation, basic follow-ups);

— Availability to cover late-evening and night shifts, in rotation.

SRE fundamentals (basic understanding):

— You understand the difference between “just running infra” and SRE as a discipline: reliability targets, fast detection, clear escalation, and consistent follow-up;

— You’re familiar with SLI/SLO and can explain them in simple terms (high-level understanding is enough).

What will be an advantage:

— Familiarity with Kubernetes (deep production ownership is not required yet);

— Exposure to AWS services such as EC2, ALB/NLB, RDS, S3, and IAM basics;

— Exposure to Terraform and/or Ansible (small changes, basic understanding of principles);

— Experience working in high-availability environments where downtime actually matters.

The company guarantees you the following benefits:

— Global Collaboration: Join an international team where everyone treats each other with respect and moves towards the same goal;

— Autonomy and Responsibility: Enjoy the freedom and responsibility to make decisions without the need for constant supervision;

— Competitive Compensation: Receive a competitive salary that reflects your expertise and knowledge, as our partner seeks top performers;

— Remote Work Opportunities: Embrace the flexibility of fully remote work, with the option to visit company offices near your current location;

— Paid Time Off: Prioritise work-life balance with paid vacation and sick leave days to prevent burnout;

— Career Development: Access continuous learning and career development opportunities to enhance your professional growth;

— Corporate Culture: Experience a vibrant corporate atmosphere with exciting parties and team-building events throughout the year;

— Referral Bonuses: Refer talented friends and receive a bonus after they successfully complete their probation period;

— Medical Insurance Support: Choose the right private medical insurance and receive compensation (full or partial) based on the cost;

— Flexible Benefits: Customise your compensation by selecting activities or expenses you'd like the company to cover, such as a gym subscription, language courses, Netflix subscription, spa days, and more;

— Education Foundation: Participate in a biannual raffle for a chance to learn something new unrelated to your job as part of your commitment to ongoing education.

Interview process:

— A 30-minute interview with a Recruiter to get to know you and your experience;

— 1st stage of technical interview (1 h) with the DevOps team to assess your theoretical knowledge;

— 2nd stage of technical interview (1 h) with the DevOps team to assess your hard skills;

— A final interview to gauge your fit with the company culture and working style.

If this opportunity sounds right for you, don't hesitate to apply, or get in touch with us with any questions!

Required languages

English

B2 - Upper Intermediate

Published 28 January · Updated 2 March

296 views

39 applications

Response activity: High

Last responded yesterday


Only from 1 year of experience
Full Remote
Countries where we consider candidates: Worldwide

DevOps

Employment: Full-time
Domain: Gambling
Product
