Senior Backend Engineer (Python / LLM / AI Systems)

About the Role

We are looking for a Senior Python Backend Engineer to join a client's SaaS team and build production-grade backend services and large-scale LLM-powered systems.

This role requires strong backend engineering skills, deep service design experience, and hands-on work with LLM inference in production, including self-hosted models. You will build scalable AI-driven services, optimize inference pipelines, and ensure the reliability, performance, and cost-efficiency of LLM workloads in real-world production environments.


Location: Remote
Cooperation Type: Full-time, long-term
Experience Level: 7+ years
Start: ASAP


Responsibilities

• Design, develop, and maintain scalable backend services using Python and FastAPI
• Build and operate production-grade LLM inference pipelines
• Deploy, operate, and optimize self-hosted LLMs (not only external APIs)
• Design systems for LLM request routing, batching, caching, and latency optimization
• Build high-performance APIs for AI-driven features and SaaS product workflows
• Process large-scale structured and unstructured data (text, embeddings, events, documents)
• Design and optimize PostgreSQL schemas and queries for performance and concurrency
• Build asynchronous and event-driven systems (RabbitMQ or similar)
• Monitor, debug, and optimize production systems under real traffic and load
• Collaborate with ML, Product, and Infrastructure teams on AI system architecture
• Ensure reliability, observability, scalability, and cost-efficiency of AI services
• Write clean, maintainable, and well-documented code


Requirements

• 7+ years of backend engineering experience with Python
• Strong hands-on experience with FastAPI (or Flask / Django REST)
• Hands-on experience running LLM inference in production
• Experience deploying and operating self-hosted LLMs
• Strong service design and distributed systems experience
• Experience building LLM pipelines (orchestration, routing, prompt pipelines, RAG, etc.)
• Strong experience with PostgreSQL (indexes, optimization, concurrency)
• Experience with message queues (RabbitMQ or similar)
• Experience with Docker and CI/CD
• Experience optimizing latency, throughput, and cost in high-load services
• Ability to start quickly, take ownership, and work independently
• English: Upper-Intermediate or higher


Nice to Have

• Experience with self-hosted LLM serving (vLLM, TGI, Triton, TensorRT, etc.)
• Experience with vector databases and embeddings pipelines
• Experience with GPU workloads / inference optimization
• Kubernetes or container orchestration
• Experience with RAG pipelines
• MLOps / ML infrastructure experience
• Experience with ETL / ELT pipelines


What We Offer

• Competitive compensation based on experience (gross salary)
• Fully remote cooperation
• Fast hiring process and quick start
• Opportunity to work on a real AI product with immediate impact
• Direct collaboration with the client's engineering team


What happens after you apply

• Quick CV review
• Short recruiter call
• Technical interview with the LITSLINK team & a technical meeting with the client
• Fast decision & offer

Required skills (years of experience)

• Python: 7 years
• FastAPI: 3 years
• REST API: 5 years
• PostgreSQL: 5 years
• Async: 3 years
• asyncio: 3 years
• RabbitMQ: 2 years
• Docker: 2 years
• System design: 4 years
• AI/ML: 2 years
• LLM inference in production: 2 years
• LLM pipelines / orchestration: 2 years
• Model serving (self-hosted LLMs): 2 years

Required domain experience

• Machine Learning / Big Data: 2 years
• SaaS: 3 years

Required languages

• English: B2 (Upper-Intermediate)

Additional skills

Kubernetes, CI/CD, Cloud Platforms (AWS, GCP, Azure), Vector Databases, ElasticSearch, Pandas, NumPy, ETL