Senior AI/ML Backend Engineer (Python, Computer Vision, LLMs, Architect-level)
We are currently looking for a Senior/Architect-level AI/ML Backend Engineer to join a long-term project focused on large-scale video analysis and intelligent metadata extraction. The ideal candidate will have strong experience in Computer Vision, LLMs, and backend architecture for high-performance systems.
Requirements:
- 3+ years of experience in backend or AI/ML development
- Strong Python skills with a focus on backend systems and AI pipelines
- Solid expertise in computer vision, including facial recognition, image embeddings, and video frame analysis
- Practical experience with LLMs, Whisper, OCR pipelines, and NLP
- Proficiency with MongoDB, REST APIs, and microservice architectures
- Comfortable designing and deploying scalable systems on AWS, GCP, and on-prem GPU infrastructure
- Experience in DevOps, high-load system design, and performance optimization
- Ability to translate business logic into backend algorithms and rules
Nice to have:
- Experience with hybrid cloud/on-prem deployments
- Familiarity with video codecs and encoding/decoding pipelines
- Knowledge of multimodal AI systems (vision + text)
Responsibilities:
- Design and implement scalable AI-powered backend services using Python
- Work on computer vision tasks: object detection, OCR, image classification, facial recognition
- Integrate and optimize LLM/NLP models for multilingual transcription and video content understanding
- Architect and deploy solutions that process terabytes of video data frame by frame
- Collaborate with frontend teams (React-based UIs) to deliver end-to-end features
- Address DevOps and infrastructure challenges for high performance and fault tolerance
- Build rule-based backend logic to support metadata-driven automation
Tech Stack:
Backend & Data: Python, MongoDB, REST APIs
AI/ML: Computer Vision, Whisper, LLMs, OCR, facial recognition
Architecture: Microservices, scalable backends, cloud + on-prem GPU infrastructure
Frontend (integrations): React
Cloud & Infra: AWS, GCP, on-prem GPU clusters
Project:
A platform that processes and analyzes massive volumes of video data by extracting frames and applying AI techniques (computer vision, facial recognition, OCR, LLMs) to derive metadata. This metadata is used to automate business processes and drive decisions. The system combines backend rule engines with AI-driven pipelines and supports both cloud and on-prem GPU-based deployments.