Senior Backend Engineer (Python / LLM / AI Systems)
About the Role
We are looking for a Senior Python Backend Engineer to join our Client’s SaaS team and build production-grade backend services and large-scale LLM-powered systems.
This role requires strong backend engineering, deep service design experience, and hands-on work with LLM inference in production, including self-hosted models. You will work on scalable AI-driven services, optimize inference pipelines, and ensure reliability, performance, and cost-efficiency of LLM workloads in real-world production environments.
Location: Remote
Cooperation Type: Full-time, long-term
Experience Level: 7+ years
Start: ASAP
Responsibilities
• Design, develop, and maintain scalable backend services using Python and FastAPI
• Build and operate production-grade LLM inference pipelines
• Deploy, host, and optimize self-hosted LLMs (not only external APIs)
• Design systems for LLM request routing, batching, caching, and latency optimization
• Build high-performance APIs for AI-driven features and SaaS product workflows
• Process large-scale structured and unstructured data (text, embeddings, events, documents)
• Design and optimize PostgreSQL schemas and queries for performance and concurrency
• Build asynchronous and event-driven systems (RabbitMQ or similar)
• Monitor, debug, and optimize production systems under real traffic and load
• Collaborate with ML, Product, and Infrastructure teams on AI system architecture
• Ensure reliability, observability, scalability, and cost-efficiency of AI services
• Write clean, maintainable, and well-documented code
Requirements
• 7+ years of backend engineering experience with Python
• Strong hands-on experience with FastAPI (or Flask / Django REST)
• Hands-on experience running LLM inference in production
• Experience hosting and operating self-hosted LLMs
• Strong service design and distributed systems experience
• Experience building LLM pipelines (orchestration, routing, prompt pipelines, RAG, etc.)
• Strong experience with PostgreSQL (indexes, optimization, concurrency)
• Experience with message queues (RabbitMQ or similar)
• Experience with Docker and CI/CD
• Experience optimizing latency, throughput, and cost in high-load services
• Ability to start quickly, take ownership, and work independently
• English: Upper-Intermediate or higher
Nice to Have
• Experience with self-hosted LLM serving (vLLM, TGI, Triton, TensorRT, etc.)
• Experience with vector databases and embeddings pipelines
• Experience with GPU workloads / inference optimization
• Kubernetes or container orchestration
• Experience with RAG pipelines
• MLOps / ML infrastructure experience
• Experience with ETL / ELT pipelines
What We Offer
• Competitive compensation based on experience (gross system)
• Fully remote cooperation
• Fast hiring process and quick start
• Opportunity to work on a real AI product with immediate impact
• Direct collaboration with the Client’s engineering team
What happens after you apply
• Quick CV review
• Short recruiter call
• Technical interview with the LITSLINK team & technical meeting with the Client
• Fast decision & offer
Required skills & experience
| Python | 7 years |
| FastAPI | 3 years |
| REST API | 5 years |
| PostgreSQL | 5 years |
| Async / asyncio | 3 years |
| RabbitMQ | 2 years |
| Docker | 2 years |
| System design | 4 years |
| AI/ML | 2 years |
| LLM inference in production | 2 years |
| LLM pipelines / orchestration | 2 years |
| Model serving (self-hosted LLMs) | 2 years |
Required domain experience
| Machine Learning / Big Data | 2 years |
| SaaS | 3 years |
Required languages
| English | B2 - Upper Intermediate |