Principal Site Reliability Engineer

Are you ready to lead infrastructure strategy for a cutting‑edge AI‑driven SaaS platform? We are looking for a Principal Site Reliability Engineer with a proven track record in scaling, optimizing, and securing cloud‑based systems. This senior role offers the opportunity to shape the reliability and performance of a platform used by finance teams worldwide.

In this role, you will be part of a dynamic engineering environment where your expertise will directly influence product stability and growth. You will work with advanced cloud technologies, automation tools, and AI-driven solutions, contributing to projects that push the boundaries of innovation.

If you are ready to take on strategic responsibility and make a tangible impact, apply now and join us in building the future of reliable, scalable systems.

Customer

Sigma Software is partnering with a fast‑growing AI‑driven SaaS platform serving finance and accounting teams in high‑growth businesses. The platform automates critical workflows, from billing and collections to revenue recognition and reporting, helping customers stay compliant and accelerate cash flow. Leveraging advanced AI, it reduces manual work, increases operational efficiency, and supports scalability for customers worldwide.

Project

The project focuses on building and scaling an AI-powered SaaS solution for finance automation. It integrates advanced machine learning models with robust cloud infrastructure to deliver secure, compliant, and high‑performance services. The engineering culture emphasizes automation, resilience, and operational excellence.

Requirements

At least 8 years of experience in Site Reliability Engineering or DevOps roles, including 2+ years in a Principal or Lead position
Proven experience in infrastructure modernization and scaling initiatives for high‑growth environments
Strong proficiency in Python
Deep expertise in cloud platforms and container orchestration tools such as AWS ECS and EKS
Solid experience in CI/CD pipeline design and optimization using tools like GitHub Actions and Buildkite
Proficiency in infrastructure‑as‑code tools such as Terraform
Strong knowledge of monitoring, observability, and performance optimization practices
Upper-Intermediate level of spoken and written English

Would be a plus:

Experience with monorepos (Turborepo, pnpm)
Familiarity with modern TypeScript tools (swc, biome, oxc)
Knowledge of NestJS, Next.js, and testing frameworks (Jest, Vitest)

Personal Profile

Excellent leadership, communication, and decision‑making abilities
Ability to work independently and make pragmatic build‑vs‑buy decisions in fast‑paced environments

Responsibilities

Define and lead infrastructure and reliability strategy across the platform
Design scalable, resilient systems in collaboration with engineering teams
Optimize build, testing, and deployment processes for speed and stability
Establish and uphold best practices for CI/CD, monitoring, and observability
Lead incident response and drive continuous improvement post‑incident
Automate workflows to reduce operational toil and risk
Mentor engineers and foster a culture of operational excellence
Make strategic build‑vs‑buy decisions balancing speed, quality, and sustainability

Required languages

English

B2 - Upper Intermediate

AWS, Terraform, Docker, Kubernetes, Python, DevOps, CI/CD, ArgoCD, npm, TypeScript

Published 6 May

Experience: 8+ years
Work format: Full remote
Countries where we consider candidates: Europe or Ukraine
English: B2 - Upper Intermediate

DevOps

Employment: Fulltime
Domain: Other
Outsource
