Math PhD + NVIDIA NIM/CUDA Engineer

GlobalLogic Top Employer
$$$$

Our team is tackling high-complexity challenges in computational modeling and GenAI-driven analytics. We are currently building a next-generation acceleration layer for predictive simulations, which requires a blend of deep theoretical knowledge and low-level GPU optimization. You will be responsible for ensuring our mathematical models don’t just work; they run at the theoretical limits of the available hardware.


Requirements

  • PhD in Computer Science, Physics, Mathematics, or a related quantitative field with a focus on high-performance computing or numerical methods
  • 3+ years of experience in Python Engineering (Middle/Senior level) with a deep understanding of asynchronous programming and system architecture
  • 1+ year of hands-on experience with NVIDIA NIM and Triton Inference Server for deploying optimized LLMs or specialized AI models
  • Strong proficiency in CUDA C++ and CuPy for developing and accelerating custom GPU kernels and parallel algorithms
  • Proven track record of translating complex theoretical papers/models into production-ready, GPU-accelerated code


Job responsibilities

  • Design and Architect high-performance inference pipelines using NVIDIA NIM to serve LLMs and custom generative models at scale
  • Develop and Optimize custom GPU-accelerated operators using CuPy and raw CUDA kernels to bypass CPU bottlenecks in mathematical computations
  • Profile and Debug GPU memory utilization and compute kernels using NVIDIA Nsight Systems/Compute to hit aggressive latency targets
  • Bridge the Gap between research-grade prototypes and production systems, ensuring code is modular, tested, and scalable
  • Implement GPU-efficient data structures for real-time processing of large-scale industrial or scientific datasets
  • Collaborate with cross-functional teams to integrate specialized AI microservices into broader cloud-native architectures (Kubernetes/Azure)
  • Stay at the forefront of GPU computing, evaluating new NVIDIA hardware features (such as the H100 Transformer Engine) for project applicability


Tools

  • NVIDIA: NIM, CUDA, CuPy, TensorRT, Triton Inference Server, Nsight, DCGM
  • Backend/AI: Python (Expert), FastAPI, PyTorch, NumPy/SciPy, Numba
  • Platform: Docker, Kubernetes, Helm, gRPC/REST, Prometheus/Grafana

Required languages

English C2 - Proficient
Ukrainian Native
Published 3 April