Site Reliability Engineer for OTT (IRC96461) (offline)

Job Description
Qualifications
An ideal candidate would have:
Experience working with cloud computing, virtualization, and containers - Docker, K8s, and more
Excellent problem-solving skills with a desire to take on responsibility
Excellent English, both written and verbal
Advanced networking knowledge - load balancers, firewalls, VPNs, TCP/IP - troubleshooting, performance tuning
Web/Application servers - Apache, Nginx, and so on
Monitoring systems and SLA tracking
Hands-on experience administering and supporting high-scale Production workloads
Everything-as-code approach - at least 5 years of relevant work experience, including Linux systems and programming in a language such as PowerShell, Python, or Bash
Willingness to participate in 24/7 on-call shifts
Experience with OTT Cloud TV - an advantage

Desired:
5+ years of experience managing a fast-paced production team, supporting cloud infrastructure and application issues, deployments, and maintenance
Proven experience with monitoring production workloads using cloud and open source tools (Grafana, Prometheus, Kibana)
Experience and understanding of security and networking of production environments
Experience supporting open source tools such as RabbitMQ, Elasticsearch, and Couchbase
Experience using and administering software version control systems (SVN, Git, etc.)
Experience with automating systems maintenance at scale
Solid understanding of current web and internet technologies like Apache, Tomcat, Nginx, CDN, DNS, databases, and so on
Experience managing large-scale infrastructure as code with tools like Ansible, Terraform, and CloudFormation
Ability to read, understand, and debug code (.NET, Lua) - a big advantage

Job Responsibilities
In this role you will:
Be part of the tectonic shift of the TV industry to over-the-top Cloud TV.
Work with cutting edge technology in the cloud.
Oversee and own the overall Production deployment, maintenance, and enhancement processes and procedures, as well as availability, scalability, and operability, while ensuring top-notch SLA tracking.
Be part of an SRE team focused on introducing new technologies and systems, deploying services to multiple cloud environments and regions, and pushing our Production excellence and offering to the next level.
Solve technical problems, provide guidance to various teams (internal & external), and continually improve our systems, deployments, operations, and overall cloud activities and costs.
Work closely with DevOps, R&D, support, and product teams.

Department/Project Description
Customer is a recognized leader in the OTT TV (Over the Top TV), OVP (Online Video Platform), EdVP (Education Video Platform) and EVP (Enterprise Video Platform) markets.

We're looking for an experienced Site Reliability Engineer to join our growing Cloud Operations group. The ideal candidate will have hands-on experience developing operations tools and managing and supporting highly available, large-scale web applications in production.

Most importantly, the right individual will be highly motivated, with a passion for delivering technical solutions in a fast-paced environment and automating anything possible.

About GlobalLogic

Headquartered in Silicon Valley, GlobalLogic employs over 11,000 designers and engineers across the globe. Analysts like NASSCOM and Zinnov have recognized us as a top company in our field, and we are consistently nominated as a preferred employer by both global HR consulting firms and local job boards. By creating a work environment that is exciting and flexible, and by fostering growth through ongoing learning and development programs, we empower our employees to achieve both their professional and personal goals.

Company website:
https://www.globallogic.com/careers/

The job ad is no longer active
Job unpublished on 6 September 2020
