Monitoring Team Lead (offline)

We combine our established expertise in creating comfortable, adjustable beds with the latest in sleep science, cutting-edge sensor technology, and data processing algorithms.

The SRE Support Engineer will support a DevOps pipeline for developers, QA, and Ops teams. This position will provide DevOps and cloud support, primarily for AWS.

Requirements:

- 5+ years of work experience in a DevOps/SRE role for an AWS-based SaaS or IoT product
- Experience in supporting developers of a cloud-based SaaS or IoT product
- Cloud operation experience with AWS (required) and Azure
- Automation of Java software builds using Scala, Maven, GitHub, and Jenkins
- Deployment and configuration of services such as ECS / Docker, Aurora MySQL Server, Kafka/MSK, and ECS / Spark
- Automating software builds and deployments using Jenkins, Terraform, Ansible, Chef, CloudFormation, and other similar technologies
- Scripting languages such as Bash Shell and Python
- Experience in network management of both cloud (AWS VPC) and physical networks including subnetting, routing, ACLs, and VPN
- Working knowledge and experience in supporting security standards such as Cloud Security Alliance STAR and HIPAA for infrastructure security and auditing
- Monitoring using CloudWatch, Datadog, Sumo Logic, custom scripts, or similar services
- Strong experience in troubleshooting application and system issues and in root cause analysis
- Strong interpersonal and communication skills
- English level – Upper-Intermediate

Responsibilities:

- Collaborate with DevOps, developers, and QA to automate build, test, and deployment using AWS ECS, Docker, Jenkins, and scripting languages
- Share responsibility for managing security and access control for infrastructure, applications, and sensitive data
- Ensure high availability and perform disaster recovery when needed
- Own incidents and issues, and respond to incidents and alerts
- Troubleshoot issues, provide root cause analysis, and build a knowledge base of known issues and fixes
- Handle planned maintenance, outage management, and problem management
- Monitor infrastructure and applications for critical environments
- Continually help to improve DevOps tools, processes, and procedures
- Communicate clearly with teams via Slack and email
- Share responsibility for production operations and other critical environments

The job ad is no longer active
Job unpublished on 14 December 2020
