Our client is in the data and application integration domain. We help them enable data-driven companies to be more effective at what they do by providing them with the tools, infrastructure, and guidance to make fast, informed business decisions.
Ensure high reliability and availability for Talend development and prototyping infrastructure, including upgrade and release processes and incident handling;
Provide support, including occasional on-call duty activities (during office hours), to a team maintaining Talend Cloud production infrastructure;
Maintain technical operations for our Talend Cloud development/prototyping infrastructure: administer Linux systems (including configuration, troubleshooting, and automation), AWS/Azure cloud infrastructure, and CI/CD services;
Be responsible for troubleshooting Talend development and prototyping infrastructure, systems, network, and application stacks;
Collaborate with other team members to investigate and resolve technical or performance issues and perform root cause analysis;
Implement tactical and long-term product improvements (code, scripts, or documentation);
Develop effective alerts and responses to both identify and address reliability risks;
Work on SLI/SLOs and dashboards for error budget together with developers;
Implement automated testing solutions for non-functional requirements such as HA/DR, security, and performance;
Participate in architecture discussions with R&D;
Design and develop cloud infrastructure blueprints;
Define and evangelize cloud-related optimizations and best practices to improve reliability and performance;
Push DevOps / SRE methodology across the R&D organization and operations;
Provide cross-training to other team members and participate in code reviews.
At least 5 years of practical experience in software development;
Cloud engineering experience;
Bachelor's degree in Computer Science or a related field;
Strong working knowledge of Linux (RedHat/CentOS) systems and applications including Tomcat, Java, Apache, Elasticsearch, ActiveMQ, and Nginx proxy;
Experience with administering AWS. Knowledge of Microsoft Azure or other IaaS/PaaS Infrastructures is a plus;
Experience with IaC / configuration management / systems automation tools at scale (e.g. Terraform, Ansible);
Experience with network management systems and network monitoring tools such as Prometheus, Grafana, Kibana, and Logstash;
Experience with Jenkins (configuration, implementing CI pipelines via code, maintenance);
Proficiency with scripting languages (Python, Bash, Groovy, etc.);
Experience with messaging tools (Kafka, ActiveMQ);
Experience with containers (Docker) and Kubernetes (Helm, Flux, Traefik, AKS/EKS);
Experience with big data systems and/or database administration (e.g. PostgreSQL, Elasticsearch, MongoDB, Prometheus) would be a plus;
Ability to work independently, with strong interpersonal and communication skills;
Strong sense of ownership, proactive communication, and initiative in identifying topics to work on.
GlobalLogic is a full-lifecycle product development services leader that combines chip-to-cloud software engineering expertise and vertical industry experience to help our customers design, build, and deliver their next-generation products and digital experiences. By leveraging Agile / Lean MVP methods, cutting-edge technologies, and an integrated approach to experience design and complex engineering, we empower global brands such as Microsoft, BMC, Coca Cola, Samsung, Physio Control, and Roku to develop the “next big thing” in their markets. GlobalLogic is headquartered in Silicon Valley and operates design and engineering centers around the world, where we are continuously recognized as a top innovator and employer by organizations like Zinnov and Glassdoor.
Job posted on 9 October 2020