On-Duty Engineer (Sysadmin)

Product

We need an on-call engineer to monitor and support our service. This is a front-line position: you work strictly according to runbooks, document everything that happens, and escalate non-standard situations to a Senior Operations Engineer. The key requirements for this role are attentiveness, discipline, and the ability to describe a problem clearly.

Tasks

  • Continuous monitoring of services and infrastructure
  • Responding to alerts strictly according to runbooks
  • Initial incident diagnostics: checking availability, logs, and service status
  • Escalation to a Senior Operations Engineer if runbook scope is exceeded
  • Maintaining event and incident logs; providing timely status updates


Requirements

  • Linux, command line: SSH, log navigation (journalctl, tail, grep), service management (systemctl), basic load and disk space diagnostics (top/htop, df, du)
  • Network, basic: host and port availability checks (ping, curl, nc/telnet), understanding DNS, assessing whether a service is alive or not
  • Infrastructure, basic: understanding the difference between a physical host and a VM; understanding out-of-band access (IPMI/BMC); basic familiarity with the cloud console (instance status, metrics)
  • Monitoring and dashboards: reading metrics and graphs (Grafana or similar); understanding alerts, severity, and thresholds; ability to distinguish a real incident from a false positive
  • NGINX: reading configs, working with logs, restarting
  • MySQL: basic read-only queries, checking replication, reading slow logs
  • Docker / Docker Compose: container status, reading logs, restarting, basic reading of compose files
  • Working with LLM assistants (Claude, Cursor, etc.): using them for diagnostics, finding solutions, and documentation
  • English for reading technical documentation and alerts
  • Ability to clearly and concisely describe a problem in writing
  • At least 1 year of experience in a sysadmin, support, or operations role


Nice to have

  • Physical server administration: IPMI / iDRAC / iLO (remote reset, console access, hardware testing)
  • Hypervisors: KVM / Proxmox / VMware or similar, including VM lifecycle management
  • Clouds: GCP, AWS, Azure, Yandex Cloud (instances, disks, networks, metrics, and logs in the console)
  • On-call systems: PagerDuty, OpsGenie, or similar
  • Understanding of Prometheus-style monitoring (probes, metrics, alert rules)

Required languages

Russian Native
Published 2 June