Senior Machine Learning Engineer

MilTech 🪖

We’re expanding and looking for a Senior Machine Learning Engineer to lead and own our ML direction.


You’ll be responsible for designing and scaling the entire ML stack – from research to production – and for driving innovation across NLP, CV, and multimodal pipelines.

We work primarily with open-source models deployed locally (not on managed cloud platforms), so you should be comfortable running, profiling, and optimizing everything on-premises.

 

We expect you to deeply understand how things work, not just how to run them. You’ll have the autonomy to define the architecture, choose the models, and ensure high performance in local environments.

 

What You Will Do

  • Design and build APIs and pipelines for tasks such as summarization, classification, NER, OCR, image captioning, face detection/recognition, speech-to-text, and (soon) video-to-text and RAG chat systems.
  • Own RAG end-to-end: chunking and normalization, index construction, hybrid retrieval (BM25 + vector), reranking (BGE, ColBERT, etc.), context policies, caching, latency budgeting, and offline evaluation (RAGAS/TruLens).
  • Run and serve models locally using vLLM, TensorRT, ONNX Runtime, or OpenVINO — ensuring efficient inference on our own GPU servers.
  • Select, fine-tune, and optimize transformer models (LLaMA, Falcon, Mistral, DeepSeek, Gemma, etc.) for specific domains and modalities.
  • Develop scalable data pipelines for model training and evaluation: annotation, augmentation, class balancing, and dataset curation.
  • Collaborate with Data Engineering on reliable message passing (Kafka / RabbitMQ / MCP) and real-time data flow.
  • Set up observability for models and infrastructure: metrics (Prometheus), dashboards (Grafana), logging (ELK Stack).
  • Automate model lifecycle: CI/CD for training, validation, and deployment via GitHub Actions or GitLab CI.
  • Continuously explore and evaluate new models and research, staying up to date with the latest open-source releases and applying them to real-world use cases.

 

What You Need to Join Us

  • Strong expertise in NLP and Multimodal ML (text, image, audio, video)
  • Strong expertise in OCR processing and document layout recovery — ability to extract both text and structural information (tables, headers, coordinates, reading order) from scanned and digital documents using open-source tools like PaddleOCR, Tesseract, Docling, etc.
  • Deep understanding of transformer architectures and practical optimization techniques
  • Proven experience in fine-tuning and serving models locally (no managed ML cloud services) 
  • Hands-on experience with vLLM and high-performance inference optimization
  • Strong Python skills, including clean, modular service design (FastAPI, Flask, or similar)
  • Familiarity with Docker, Kubernetes, DVC, and CI/CD pipelines
  • Understanding of distributed systems (Kafka, RabbitMQ, MCP)
  • Comfortable working with databases: relational (PostgreSQL/MySQL), NoSQL (MongoDB/Cassandra), and vector stores (Qdrant, Milvus, Elasticsearch)
  • Solid foundation in system performance and observability (Prometheus, Grafana, ELK)
  • Proactive mindset: you track new model releases, benchmark them, and know what’s relevant
  • English proficiency (technical reading; conversational level is a plus)

 

Nice to Have

  • Experience building custom NER or QA models from scratch
  • Familiarity with on-device inference (Edge AI) and optimization for limited resources (ARM, CPU-only)
  • Understanding of Active Learning, Continual Learning, or Retrieval-Augmented Generation (RAG)
  • Experience with Ray / Ray Serve for distributed inference and training

 

Why Join Us

  • Make Real-World Impact – Our tech is proven in national defence. Now we’re scaling globally.
  • Built for This Moment – We’re at the intersection of AI, cybersecurity, and autonomy.
  • Growth-Oriented Culture – Work with a smart, driven, collaborative team.
  • Military deferment – For full-time employees.
  • Flexible Schedule – Remote-friendly and results-driven.
  • Competitive Compensation – We reward top talent.

Required languages

Ukrainian C1 - Advanced

The job ad is no longer active
