Senior Lead Sysadmin / NOC Tech Lead
About the Role
We are looking for a technical lead to manage a team of six L1 engineers, a multi-server architecture with a large hardware fleet and interconnected services, and the full cycle, from deployment to monitoring and incident resolution.
Complex, rare, and non-standard incidents are yours to handle personally. But the strategic mission is broader: to build a system where typical problems are resolved at the L1 level via runbook, without your involvement, through documentation, mentorship, and continuous process improvement.
You maintain the full picture of the architecture: you understand service dependencies, participate in releases, and assess the risks of changes.
Responsibilities
- Handling complex incidents that exceed runbook scope (Manual Cases)
- Writing, updating, and reviewing runbooks for the L1 team
- Mentorship: helping on-call engineers work through incidents and grow their skills
- Participating in Change & Release processes: risk assessment, deployment support
- Maintaining and updating the Service List: service descriptions, dependencies, criticality
- Preparing Root Cause Analysis for significant incidents
- Interfacing with Development and Product teams during escalations
Requirements
- Linux: deep knowledge of the network stack, performance diagnostics, and system tuning
- Docker / Docker Compose: confident configuration, debugging, optimization
- NGINX, HAProxy: setup, load balancing, SSL/TLS, upstream management
- MySQL: replication, cluster configurations, backup/restore, query and schema optimization
- Redis: architecture, diagnostics, failover and persistence configuration
- RabbitMQ: understanding of the queue model, diagnostics, recovery from failures
- Memcached: configuration, diagnostics, optimization under load
- ClickHouse: basic operations, diagnostics, reading query profiles
- PHP: operations-level understanding of the interpreter, configuration (php-fpm, php.ini), logs, and basic debugging
- Monitoring and alerting: configuring Nagios (NRPE/NCPA), Loki, Sentry; writing checks and alerting rules
- Git / GitLab / SVN: understanding of VCS, working with pipelines, participation in release processes
- RAID: understanding of software and hardware RAID, degraded array diagnostics
- LLM assistants (Claude, Cursor, etc.): confident use for complex problem analysis, runbook writing, and documentation automation
- Experience writing technical documentation and runbooks
- English at a level sufficient for reading technical documentation and alerts
- 5+ years of experience as a sysadmin or DevOps engineer
Required languages
- Russian: native