Chaos Theory

Senior DevOps Engineer

We are looking for a Senior DevOps Engineer to own and evolve our infrastructure, deployment pipelines, and reliability practices. This is a hands-on engineering role guided by Site Reliability Engineering (SRE) principles, focused on automation, operational excellence, and proactive risk reduction, within the governance and compliance expectations of the industry.


Key Responsibilities

 

Infrastructure and Reliability

  • Design, operate, and improve AWS-based infrastructure across multiple regions and environments
  • Manage infrastructure using Infrastructure as Code (Pulumi with TypeScript)
  • Apply SRE principles to reduce operational toil and enhance observability and alerting
  • Improve system resilience and failure isolation per partner environment
  • Support a rapidly increasing number of isolated deployments as each partner integration results in multiple independently deployed backend services

 

CI/CD and Deployment Engineering

  • Own and evolve CI/CD pipelines using GitHub Actions
  • Build scalable, multi-tenant deployment workflows that support automated promotion, failover, and rollbacks
  • Maintain blue-green deployments and improve rollback automation
  • Support high-availability deployments for internal and partner-facing tools and dashboards
  • Work within a peer-reviewed change process and long-term production approval requirements

 

Observability and Incident Management

  • Maintain and improve monitoring, logging, and alerting systems
  • Ensure alerts are actionable and aligned with real player impact
  • Design and formalise incident response processes
  • Lead and participate in blameless postmortems with shared ownership of follow-ups

 

Security, Compliance and Cost

  • Support secure system design, secrets management, and access control
  • Enable and integrate security scanning and secure SDLC practices into CI/CD
  • Prepare systems for external security audits and compliance requirements
  • Own cloud cost awareness and optimisation; FinOps is part of this role

 

Collaboration

  • Help establish shared responsibility for operational quality across the engineering team
  • Collaborate with technical leadership on prioritisation and long-term platform improvements
  • Contribute to long-term plans for scalability and operational robustness of core platform components

 

Must Have

  • 5+ years of experience in DevOps, Infrastructure, Platform, or SRE-adjacent roles
  • Strong hands-on experience with AWS in production environments
  • Practical, hands-on experience with Infrastructure as Code tools (this is non-negotiable)
  • Experience designing and operating CI/CD pipelines from scratch
  • Solid understanding and application of SRE concepts: alerting strategy, incident response, reliability vs velocity trade-offs, automation to reduce operational toil
  • Background supporting consumer-facing, revenue-generating platforms at real scale; you understand what reliability truly means when money is on the line
  • Working knowledge of TypeScript or JavaScript
  • Some experience with distributed systems or message brokers (NATS, Kafka, etc.)

You Are

  • Action-oriented: you get things done through effective collaboration, not endless planning
  • An owner: you take responsibility for complex systems with minimal oversight and engage the team proactively
  • A clear communicator: you speak up, flag issues early, and translate technical problems for non-technical stakeholders without being asked
  • Decisive: you exercise good judgment on when to refactor vs when to ship
  • Comfortable in a fast-moving startup environment where the roadmap evolves and autonomy is expected

Required languages

English C2 - Proficient
Published 17 March