Middle Site Reliability Engineer, Online Retailer (Poland) (offline)

About the vacancy
Our client is one of the biggest online retailers worldwide, with annual revenue of £1 billion. Over the years we have helped the client develop web portals, mobile apps, delivery control systems, staff management tools, data storage and much more. The systems we’ve built together are in operation 24/7, contributing to the client’s success.

Site Reliability Engineering is a new role, first introduced by Google, that combines the skills of developers and operations engineers to deliver more reliable, scalable software. The goal is to analyze a diverse set of applications (primarily built using Java, Oracle, AWS, Google Cloud services and a number of other technologies) and bind them into a reliable, self-healing suite that meets defined reliability requirements. This requires proactive work to ensure observability, analyze potential bottlenecks and suggest fixes before they cause production incidents.

This position may be of interest to DevOps engineers who would like to get closer to the code or gain a valuable specialization with a focus on the JVM stack. The position may also appeal to developers who are interested in how large-scale systems operate and what happens to the code after it goes live.

Responsibilities
Analyze and improve the availability, latency, performance, and efficiency of the applications
Provide proactive support for production applications (both during office hours and out of hours) across a range of domains; these applications are mainly written in Java and use Oracle databases
Improve the monitoring and alerting of the applications
Plan and provision capacity
Improve and standardize build pipelines; identify and reduce areas of manual toil through automation
Consult on reliability and scalability during the development of new applications
Work together with teams in other departments to find solutions
Conduct periodic on-call duties

Must have
Experience in analyzing and troubleshooting production systems
Experience with modern software development, preferably in Java
Deep understanding of Linux and UNIX-based systems
Familiarity with Agile software development practices
Understanding of TDD principles
Solid knowledge of SQL and modern databases
Experience with CI/CD systems
Experience with networking (TCP/UDP, ICMP, DNS, etc.), OSI layers, infrastructure services, and security
Experience with software monitoring and alerting systems
Good English communication and problem-solving skills

Would be a plus
Familiarity with cloud technologies
Experience with Docker and Kubernetes
Experience with NoSQL databases

About Cloudmore

Cloudmore is on a mission to forever change how products and services are sold, bought, and consumed. Simply put, we help B2B companies create, market, sell, and manage their offers online. We are a SaaS company in the scale-up stage, and this is your chance to be part of our incredible journey in reaching new heights.

Cloudmore is an equal opportunities employer that supports flexible work practices. Most of our roles are targeted to a specific hub but, subject to agreement, can be based in any of our offices in Stockholm, Tallinn or Belfast. Equally, for the right candidate, the role can be location-independent or home-based.

Working for Cloudmore brings many benefits. We want our team members to have a healthy work-life balance, and we understand the importance of personal and family commitments. We are outcome-focused and feel that our staff should have the autonomy and freedom to achieve their best.

If that sounds good to you, apply! 😊

Company website:
https://web.cloudmore.com/about-cloudmore/

The job ad is no longer active
