NVIDIA NIM/CUDA Developer (IRC278439)

GlobalLogic Top Employer

Description

Since 2023, GlobalLogic has been exploring opportunities to implement ML/AI/GenAI-powered applications for a multinational industrial conglomerate, with the aim of improving business efficiency.

You will work on projects involving complex data processing pipelines for predictive maintenance analytics.

Project tech stack: Kubernetes, Terraform, Helm, LLM, ML, AI, Asyncio, Python, Pandas, Haystack, Azure Blob Storage, Azure DevOps, SQLAlchemy, Docker, Docker Compose, PySpark, PostgreSQL, FastAPI

Requirements

  • 4+ years in backend/performance engineering (Python and/or C++) with production services.
  • 1+ year hands-on with GPU-accelerated AI/ML inference (Triton/TensorRT/CUDA) in production.
  • Proven delivery of NIM- or Triton-based microservices behind REST/gRPC (autoscaling, rollout strategies, monitoring).
  • Practical experience profiling and removing GPU bottlenecks (memory, kernels, launch configs, batching, concurrency).
  • Strong system design skills (throughput/latency/SLA) and high code quality (tests, reviews, docs).

Job responsibilities

  • Design, build, and operate NIM-powered microservices (LLM, Embeddings, Reranker, ASR/TTS, VLM) for product features.
  • Optimize inference end-to-end – TensorRT-LLM engines, quantization, batching, concurrency, KV-cache, CUDA kernels where needed.
  • Package & deploy via Triton + Kubernetes (Helm), set up GPU scheduling, MIG/MPS, canary/blue-green strategies.
  • Expose stable APIs (REST/gRPC), versioning, auth/rate-limits; deliver SDK/client wrappers for internal teams.
  • Instrument – metrics/logs/traces (Prometheus/Grafana/OTel), DCGM, alerts, SLOs, runbooks, cost/capacity planning.
  • Collaborate with DS/Platform/Product on priorities, model choices, and integration paths; ship incrementally.
  • Maintain quality & security: tests, CI/CD, IaC, dependency hygiene, secrets management, network policies.
  • Document designs, benchmarks, and operational playbooks.

Tools

  • NVIDIA – NIM, Triton, TensorRT / TensorRT-LLM, CUDA, cuDNN, cuBLAS, NCCL, Nsight, DCGM
  • Backend/AI – Python, C++, FastAPI, AsyncIO, PyTorch/ONNX, vector search/RAG
  • Platform – Docker, Kubernetes, Terraform, Helm, Azure Blob/DevOps, PostgreSQL, Redis, Kafka, Grafana/Prometheus/OTel

Required languages

English B2 - Upper Intermediate
Ukrainian Native
Published 17 March