cyberharbor.tech

Joined in 2025
8% answers

Cyber Harbor is a fast-growing Ukraine-based company founded by elite engineers and researchers who defended critical cyber infrastructure during the war. We build AI-powered systems shaped by real-world defense experience - ready for real-world complexity.

  • · 101 views · 10 applications · 13d

    Senior Machine Learning Engineer

    Full Remote · Ukraine · Product · 3 years of experience MilTech 🪖

    We’re expanding and looking for a Senior Machine Learning Engineer to lead and own our ML direction.


    You’ll be responsible for designing and scaling the entire ML stack – from research to production – driving innovation across NLP, CV, and multimodal pipelines.

    We work primarily with open-source models deployed locally (not on managed cloud platforms), so you should be comfortable running, profiling, and optimizing everything on-premise.

     

    We expect you to deeply understand how things work, not just how to run them. You’ll have the autonomy to define the architecture, choose the models, and ensure high performance in local environments.

     

    What You Will Do

    • Design and build APIs and pipelines for tasks such as summarization, classification, NER, OCR, image captioning, face detection/recognition, speech-to-text, and (soon) video-to-text and RAG chat systems.
    • RAG end-to-end: chunking/normalization, index construction; hybrid retrieval (BM25 + vector), reranking (BGE/ColBERT, etc.), context policies, caching, latency budgeting, and offline evaluation (RAGAS/TruLens).
    • Run and serve models locally using vLLM, TensorRT, ONNX Runtime, or OpenVINO — ensuring efficient inference on our own GPU servers.
    • Select, fine-tune, and optimize transformer models (LLaMA, Falcon, Mistral, DeepSeek, Gemma, etc.) for specific domains and modalities.
    • Develop scalable data pipelines for model training and evaluation: annotation, augmentation, class balancing, and dataset curation.
    • Collaborate with Data Engineering on reliable message passing (Kafka / RabbitMQ / MCP) and real-time data flow.
    • Set up observability for models and infrastructure: metrics (Prometheus), dashboards (Grafana), logging (ELK Stack).
    • Automate model lifecycle: CI/CD for training, validation, and deployment via GitHub Actions or GitLab CI.
    • Continuously explore and evaluate new models and research, staying up to date with the latest open-source releases and applying them to real-world use cases.

     

    What You Need to Join Us

    • Strong expertise in NLP and Multimodal ML (text, image, audio, video)
    • Strong expertise in OCR processing and document layout recovery — ability to extract both text and structural information (tables, headers, coordinates, reading order) from scanned and digital documents using open-source tools like PaddleOCR, Tesseract, Docling, etc.
    • Deep understanding of transformer architectures and practical optimization techniques
    • Proven experience in fine-tuning and serving models locally (no managed ML cloud services) 
    • Hands-on experience with vLLM and high-performance inference optimization
    • Strong Python skills, including clean, modular service design (FastAPI, Flask, or similar)
    • Familiarity with Docker, Kubernetes, DVC, and CI/CD pipelines
    • Understanding of distributed systems (Kafka, RabbitMQ, MCP)
    • Comfortable working with databases: relational (PostgreSQL/MySQL), NoSQL (MongoDB/Cassandra), and vector stores (Qdrant, Milvus, Elasticsearch)
    • Solid foundation in system performance and observability (Prometheus, Grafana, ELK)
    • Proactive mindset: you track new model releases, benchmark them, and know what’s relevant
    • English proficiency (technical reading; conversational level is a plus)

     

    Nice to Have

    • Experience building custom NER or QA models from scratch
    • Familiarity with on-device inference (Edge AI) and optimization for limited resources (ARM, CPU-only)
    • Understanding of Active Learning, Continual Learning, or Retrieval-Augmented Generation (RAG)
    • Experience with Ray / Ray Serve for distributed inference and training

     

    Why Join Us

    • Make Real-World Impact – Our tech is proven in national defence. Now we’re scaling globally.
    • Built for This Moment – We’re at the intersection of AI, cybersecurity, and autonomy.
    • Growth-Oriented Culture – Work with a smart, driven, collaborative team.
    • Military deferment – For full-time employees.
    • Flexible Schedule – Remote-friendly and results-driven.
    • Competitive Compensation – We reward top talent.
  • · 42 views · 2 applications · 14d

    Senior NLP / IR (RAG) Engineer

    Full Remote · Ukraine · Product · 2 years of experience · B1 - Intermediate MilTech 🪖

    We’re expanding and looking for a Senior NLP / IR (RAG) Engineer to lead and own our text-focused ML direction.
     

    You’ll be responsible for designing and scaling the text stack – from research to production – driving innovation across retrieval, reranking, and generative pipelines.
     

    We work primarily with open-source models deployed locally (not on managed cloud platforms), so it would be perfect if you're comfortable running, profiling, and optimizing everything on-premise.
     

    We expect you to deeply understand how things work, not just how to run them. You’ll have the autonomy to define the architecture, choose the models, and ensure high performance in local environments.

     

    What You Will Do

    • Design & build APIs and pipelines for summarization, classification, NER, QA, and RAG chat systems.
    • End-to-end RAG:
      • chunking/normalization, index construction; 
      • hybrid retrieval (BM25 + vector), reranking (BGE/ColBERT, etc.), context policies, caching, latency budgeting, and offline evaluation (RAGAS/TruLens).
    • Run and serve models locally using vLLM, TensorRT, ONNX Runtime – ensuring efficient inference on our own GPU servers.
    • Select, fine-tune, and optimize transformer models (LLaMA, Mistral, Falcon, DeepSeek, Gemma, etc.) for specific domains.
    • Develop scalable data pipelines for model training and evaluation: annotation, augmentation, class balancing, and dataset curation.
    • Collaborate with Data Engineering on reliable message passing (Kafka / RabbitMQ / MCP) and real-time data flow.
    • Set up observability for models and infrastructure: metrics (Prometheus), dashboards (Grafana), logging (ELK Stack).
    • Automate model lifecycle: CI/CD for training, validation, and deployment via GitHub Actions or GitLab CI.
    • Continuously explore and evaluate new models and research, staying up to date with the latest open-source releases and applying them to real-world use cases.
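
The hybrid retrieval item above (BM25 + vector) implies merging two ranked result lists before reranking. A minimal sketch of one common way to do that, assuming reciprocal rank fusion (RRF); the listing does not say which fusion rule the team uses, and the function name, document IDs, and the constant `k=60` are illustrative only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one list, best first.

    Each document receives 1 / (k + rank) from every list it appears in;
    `k` dampens the influence of top ranks (60 is a conventional default).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of a lexical (BM25) and a vector retriever.
bm25_hits = ["d3", "d1", "d7"]
vector_hits = ["d1", "d9", "d3"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Documents found by both retrievers (`d1`, `d3`) rise above those found by only one, which is the usual motivation for fusing before a heavier reranker such as BGE or ColBERT.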
       

    What You Need to Join Us

    • Strong expertise in NLP and Information Retrieval (RAG, hybrid retrieval, reranking).
    • Deep understanding of transformer architectures and practical optimization techniques.
    • Experience in fine-tuning and serving models locally (no managed ML cloud services).
    • Hands-on experience with vLLM and high-performance inference optimization.
    • Strong Python skills, including clean, modular service design (FastAPI, Flask, or similar).
    • Understanding of distributed systems (Kafka, RabbitMQ, MCP).
    • English proficiency (technical reading; conversational level is a plus).
       

    Nice to Have

    • Experience building custom NER or QA models from scratch.
    • Familiarity with on-device inference (Edge AI) and optimization for limited resources (ARM, CPU-only).
    • Understanding of Active Learning, Continual Learning, and RAG evaluation frameworks.
    • Experience with Ray / Ray Serve for distributed inference and training.
    • MLOps & Databases (plus): 
      • Docker/Kubernetes, DVC, CI/CD for models; 
      • Experience with relational, NoSQL, and vector DBs.
  • · 31 views · 1 application · 14d

    Senior Multimodal ML Engineer (OCR / Layout / ASR / Vision)

    Full Remote · Ukraine · Product · 2 years of experience · B1 - Intermediate MilTech 🪖

    We’re expanding and looking for a Senior Multimodal ML Engineer to lead and own our multimodal direction across documents, images, and audio (with video on the way).
     

    You’ll be responsible for designing and scaling pipelines for OCR & document layout recovery, image understanding/captioning, face detection/recognition, and speech-to-text.
     

    We work primarily with open-source models deployed locally (not on managed cloud platforms), so you should be comfortable running, profiling, and optimizing everything on-premise.
    We expect you to deeply understand how things work, not just how to run them. You’ll have the autonomy to define the architecture, choose the models and ensure high performance in local environments.

     

    What You Will Do

    • Design & build APIs and pipelines for OCR, document layout recovery, image captioning, face detection/recognition, and speech-to-text; prepare foundations for video-to-text.
    • Implement robust OCR + structure extraction (tables, headers, coordinates, reading order) for scanned and digital documents.
    • Run & serve components locally using vLLM (for LLM parts), TensorRT, ONNX Runtime, or OpenVINO – ensuring efficient on-prem inference.
    • Select, fine-tune, and optimize models across CV/ASR stacks (Whisper/faster-whisper, PaddleOCR/Tesseract/Docling, YOLO/Detectron, BLIP/CLIP or similar).
    • Develop scalable data pipelines for training and evaluation; build regression tests for structure accuracy, WER/CER, and latency.
    • Collaborate with Data Engineering on reliable message passing (Kafka / RabbitMQ / MCP) and real-time data flow.
    • Set up observability for models and infrastructure: metrics (Prometheus), dashboards (Grafana), logging (ELK Stack).
    • Automate model lifecycle: CI/CD for training, validation, and deployment via GitHub Actions or GitLab CI.
    • Continuously explore and evaluate new models and research, staying up to date with the latest open-source releases and applying them to real-world use cases.
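
The regression tests for WER/CER mentioned above rest on an edit-distance computation. A minimal sketch, assuming the standard definitions (word or character edit distance divided by reference length); the function names are illustrative, and a production pipeline would typically normalize text (case, punctuation) before scoring.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute/insert/delete = 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: char-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


print(wer("the cat sat on the mat", "the cat sat on mat"))
print(cer("abcd", "abxd"))
```

A regression test can then assert that WER or CER on a fixed evaluation set does not rise above a stored baseline after a model or pipeline change.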

     

    What You Need to Join Us

    • Strong expertise in OCR processing and document layout recovery – ability to extract both text and structural information using open-source tools like PaddleOCR, Tesseract, Docling, etc.
    • Solid experience with ASR pipelines (Whisper stack), timestamps/diarization, and post-processing.
    • Deep understanding of transformer/CV architectures and practical optimization techniques.
    • Proven experience in fine-tuning and serving models locally (no managed ML cloud services).
    • Hands-on experience with high-performance inference optimization (TensorRT/ONNX/OpenVINO; vLLM for LLM-backed pieces).
    • Strong Python skills, including clean, modular service design (FastAPI, Flask, or similar).
    • Understanding of distributed systems (Kafka, RabbitMQ, MCP).
    • English proficiency (technical reading; conversational level is a plus).

     

    Nice to Have

    • Experience with document layout detection (YOLO/Detectron), table structure recovery, and image captioning.
    • Familiarity with on-device inference (Edge AI) and optimization for limited resources (ARM, CPU-only).
    • Understanding of Active/Continual Learning or multimodal RAG.
    • Experience with Ray / Ray Serve for distributed inference and training.
    • MLOps & Databases: 
      • Docker/Kubernetes, DVC, CI/CD for models; 
      • Experience with relational, NoSQL, and vector DBs.
         
  • · 161 views · 54 applications · 13d

    Product Designer

    Full Remote · Ukraine · Product · 4 years of experience · B1 - Intermediate MilTech 🪖

    As part of the product team, you’ll lead and own the Human Interface Design direction of new features and experiences across 5+ SaaS products. 

     

    From early concept explorations to high-fidelity execution, you’ll work closely with the engineering teams and the current designer to turn complex technical concepts into clear, meaningful interactions with high precision in the details. Attention to detail is mandatory for this role, as your decisions will directly influence Ukraine's national defence. 

     

    What You Will Do

    • Create sketches, wireframes, prototypes, and polished designs that bring clarity to the product experience.
    • Collaborate with the engineering team to create simple, intuitive, human-centered products that help users save time and defend our country more effectively.
    • Defend your ideas before the engineering and management teams to create the best possible solution, setting the standard in the market.
    • Quickly translate complex requirements and technical data into clear and engaging interfaces.
    • Contribute to a well-thought-out design system that helps the engineering team create or change interfaces quickly without avoidable mistakes.
    • Create presentations to showcase our current and future products, individual features, unique capabilities, and plans for clients and end users.
       

    What You Need to Join Us

    • 4+ years of work experience with a focus on product design and UI across various domains. 
    • An eye for visual detail in proportion, type, motion, and interaction.
    • Experience shipping digital products from start to finish, including documentation and presentation of the developed work.
    • Ability to clearly communicate concepts and designs through prototypes, sketches, high-fidelity comps, and presentations.
    • Demonstrated creative and innovative problem-solving and a willingness to take the lead where needed.

     

    Why Join Us

    • Make Real-World Impact – Our tech is proven in national defence. Now we’re scaling globally.
    • Built for This Moment – We’re at the intersection of AI, cybersecurity, and autonomy.
    • Growth-Oriented Culture – Work with a smart, driven, collaborative team.
    • Military deferment – For full-time employees.
    • Flexible Schedule – Remote-friendly and results-driven.
    • Competitive Compensation – We reward top talent.
  • · 122 views · 20 applications · 13d

    Middle Data Annotator / Data Labeler

    Full Remote · Ukraine · Product · 1 year of experience · B1 - Intermediate MilTech 🪖

    We are looking for a specialist to prepare datasets used for quality evaluation and training of our in-house AI/ML models.

    The main task is annotating text extracted from scanned documents (OCR) and identifying named entities (NER) such as phone numbers, addresses, organization names, and so on.

    You will receive sets of scanned or photographed documents and convert them into structured text using Markdown, additionally marking key identifiers.

    In addition to text, the role involves processing audio and video: creating transcripts with timecodes and speaker diarization, NER in transcripts, normalization, and quality verification; for video, segmentation into scenes/episodes, frame-accurate timecodes, event descriptions, and labeling of on-screen text, logos, faces, and objects.

    Experience with Label Studio, CVAT, or similar tools is desirable, as are basic programming skills and experience participating in training or testing AI/ML models.

    The work is remote, as part of developing our own product, which is at the cutting edge of modern technology and targets the public sector.

    Our company is critically important to the Armed Forces of Ukraine; where needed, we provide military deferment and office equipment.

  • · 108 views · 9 applications · 13d

    Data Annotator / Data Labeling Team Lead

    Full Remote · Ukraine · Product · 2 years of experience · B1 - Intermediate MilTech 🪖

    We’re expanding and looking for a dataset preparation Team Lead who will manage the in-house annotation team while staying hands-on in labeling. You will plan and coordinate the labeling of multimodal data (OCR/NER text, images, audio, video), formalize guidelines and QA, and maintain the stability, quality, and delivery deadlines of datasets.

     

    What You Will Do

    • Plan, coordinate, and oversee the work of the in-house team.
    • Build the workflow pipeline from scratch: develop guidelines, run onboarding and training.
    • Introduce and maintain quality standards: single-pass, double-blind, golden set, spot-check.
    • Conduct QA and review of annotations, and track annotation consistency.
    • Do the hands-on part of the work: OCR/NER annotation of text; classification of images, audio, and video; structured output in Markdown and agreed data formats.


    What You Need to Join Us

    • At least 1 year of experience managing annotation and dataset preparation teams.
    • Confident command of Label Studio, CVAT, or alternatives.
    • Understanding of approaches to annotation quality control and agreement metrics.
    • Experience developing annotation guidelines and instructions.
    • Attention to detail, a systematic approach, and the ability to plan and meet deadlines.
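
The agreement metrics named in the requirements above can be illustrated with Cohen's kappa for two annotators labeling the same items. This is a sketch under that assumption; the listing does not name a specific metric, and the labels and function name below are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two annotators, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Share of items on which both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected if each annotator labeled independently at random
    # according to their own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical entity labels from a double-blind pass over four spans.
annotator_1 = ["ORG", "ORG", "PHONE", "ADDR"]
annotator_2 = ["ORG", "PHONE", "PHONE", "ADDR"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Values near 1 indicate strong agreement and values near 0 indicate agreement no better than chance, which makes the score useful for spot-checking a guideline change against a golden set.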

     

    Why Join Us

    • Real impact – the opportunity to work on our own product at the cutting edge of modern technology in the public sector.
    • Military deferment – our company is critically important to the Armed Forces of Ukraine.
    • Growth culture – work with the best specialists on the market.
    • Focus on results – we support remote work and a flexible schedule.