Senior Data and ML Platform Engineer
We are looking for a Senior Data & ML Platform Engineer. The role is full-time for about 6 months to start. We are flexible on location and rate for the right person.
Mission
• Build and run a scalable, traceable, FDA-ready data platform bridging on-prem DGX and AWS for our 2026 submission.
Key responsibilities
• Implement ingestion + QC automation starting with CT DICOM, expanding to video/C-Arm and radiology reports.
• Implement/operate distributed processing on AWS Batch (Spot) for large-scale QC + predictions + derivatives.
• Deploy ClearML for dataset versioning/lineage and experiment tracking (runs/metrics/artifacts; provenance).
• Optimize training data access using Lance (or equivalent) for fast loading and incremental updates.
• Build a PostgreSQL-backed enrichment service for metadata/labels/predictions, independent of raw media (optional text + vector search via OpenSearch).
• Integrate/operate labeling workflows (Encord preferred; alternatives acceptable), including RBAC/QC/audit trail + algorithmic label ingestion.
• Establish a governed clinical validation environment meeting 21 CFR Part 11 expectations (access control, audit trail/WORM, provenance) and HIPAA/PHI handling.
Tech stack
• Python; AWS (S3, Batch); Postgres; DICOM / PACS (Orthanc); ClearML; Lance; media processing (ffmpeg).
Required domain experience
| Healthcare / MedTech | 3 years |
Required languages
| English | B2 - Upper Intermediate |