Mid-Level Site Reliability Engineer (SRE)
About Corvex
Corvex delivers unparalleled cloud-based AI infrastructure, featuring cutting-edge NVIDIA GPUs that combine exceptional reliability, security, performance, and value. We're building a world-class experience for developers and data scientists across enterprise and AI-native organizations, empowering professionals to focus exclusively on training, fine-tuning, and inference of their AI models while we manage the nuts and bolts of our premium infrastructure.
Company and Position Description
Corvex is hiring a Mid-Level Site Reliability Engineer with 3-5 years of experience to support the development and operation of our AI-focused cloud platform. You will work across automation, infrastructure-as-code, Kubernetes, and private cloud environments, while helping ensure high reliability and performance. Strong written communication is essential, as you will interact with distributed engineering teams and occasionally with clients.
This role requires 5-6 hours of overlap with US Eastern Time and participation in a rotating on-call schedule (1 week every 6 weeks).
What You’ll Do
- Support infrastructure-as-code workflows using Terraform and Ansible
- Assist in building and operating Kubernetes clusters
- Troubleshoot system, network, and infrastructure issues
- Contribute to automation, monitoring, and CI/CD pipeline enhancements
- Work with private cloud environments (OpenStack strongly preferred)
- Produce clear, professional internal and client-facing written communication
- Participate in the on-call rotation and help improve incident response
What We’re Looking For
- 3-5 years of experience in SRE, DevOps, Systems Engineering, or similar roles
- Hands-on experience with Terraform, Ansible, and Kubernetes
- Solid troubleshooting skills across Linux and networks
- Familiarity with OpenStack or other private cloud technologies (non-commercial experience welcome)
- Strong written English and professional communication skills
- Ability to maintain required overlap with US Eastern Time
- Willingness to participate in on-call rotation (1 week every 6 weeks)
What We Offer
- Competitive salary
- A chance to help define a new category of AI infrastructure
- Greenfield architecture: build the product you’ve always wanted to use
- High trust and autonomy, with deep impact on platform direction
- Remote-first culture with the option to collaborate in person as we scale
- Small, highly skilled team and zero bureaucracy
Required skills and experience
| SRE | 3 years |
| DevOps | 3 years |
| Terraform | 2 years |
| Ansible | 2 years |
| Kubernetes | 2 years |
| OpenStack | 2 years |
Required languages
| English | B1 - Intermediate |