Computer Vision Specialist / Detection Model Training Engineer (Python, Deep Learning)
Responsibilities:
- Design and implement scalable training pipelines for image and video detection models using large-scale real-world datasets.
- Fine-tune and evaluate transformer-based and CNN-based detection models (e.g., YOLOv8, DINO, SAM, DETR, Mask R-CNN) for tasks such as object detection, segmentation, and visual tracking.
- Collaborate with data engineers to preprocess, clean, and structure large volumes of image/video data.
- Optimize model training for speed and accuracy on multi-GPU clusters (e.g., using PyTorch DDP, DeepSpeed, or Hugging Face Accelerate).
- Continuously benchmark model performance using established metrics (mAP, IoU, AR, etc.) and custom evaluation scripts.
- Experiment with techniques such as multi-scale training, data augmentation, semi-supervised learning, and prompt-based vision models (e.g., SAM or DINO-based pipelines).
Requirements:
- Strong proficiency in Python and deep learning frameworks (PyTorch preferred).
- Hands-on experience with training or fine-tuning computer vision models on large-scale image/video datasets (COCO, LVIS, ImageNet, custom video datasets, etc.).
- Familiarity with modern detection and segmentation frameworks such as YOLO (v8-v12), DINO, SAM, DETR, and other transformer-based vision models.
- Experience with distributed training setups (e.g., DDP, DeepSpeed, FSDP), mixed-precision training, and memory optimization.
- Solid grasp of computer vision fundamentals: object detection, instance segmentation, image augmentations, and feature pyramids.
- Experience managing training workflows: checkpoints, hyperparameter tuning, logging (e.g., TensorBoard, Weights & Biases), and model versioning.
Bonus:
- Knowledge of efficient model deployment techniques: ONNX, TensorRT, quantization, pruning, or distillation.
- Experience integrating detection models into real-time pipelines or edge environments.
- Contributions to open-source vision libraries or published research in vision conferences (e.g., CVPR, ICCV, ECCV).
Preferred Qualifications:
- Familiarity with large-scale annotation workflows or synthetic data generation for visual tasks.
- Experience evaluating vision models with custom benchmarks or using tools like FiftyOne, CVAT, or Roboflow.
Required languages:
- English: B1 (Intermediate)