ML Evaluation Engineer (LLM / Data Quality)

About the Role

We’re looking for a Machine Learning Engineer focused on evaluation and data quality to help improve the performance of modern ML and LLM systems.

This role centers on understanding model behavior, identifying failure patterns, and building strong evaluation and feedback loops. You won’t be training models directly. Instead, you’ll improve model outcomes through better data, analysis, and experimentation.

If you enjoy analytical problem-solving, working with real-world data, and making ML systems measurably better, this role is for you.

 

What You’ll Do

  • Analyze and improve datasets used in ML and LLM workflows
  • Perform detailed error analysis on model outputs (qualitative and quantitative)
  • Design and implement evaluation frameworks, benchmarks, and quality checks
  • Identify failure modes, data gaps, and improvement opportunities
  • Support model experimentation with structured insights and feedback
  • Collaborate closely with ML engineers and researchers to improve system performance
  • Document evaluation methodologies, findings, and recommendations
  • Help maintain high standards for data quality, consistency, and bias awareness

 

What We’re Looking For

  • 5+ years of experience in Machine Learning or Applied ML roles
  • Strong understanding of ML fundamentals and evaluation methodologies
  • Hands-on experience working with real-world datasets for ML or LLM systems
  • Strong Python skills
  • Experience with data validation, evaluation pipelines, and error analysis
  • Strong analytical thinking and attention to detail
  • Ability to work independently and clearly communicate insights

 

Nice to Have

  • Experience with LLMs, NLP, or generative AI systems
  • Background in data annotation, QA, or evaluation-heavy ML environments
  • Experience with experiment tracking, prompt evaluation, or benchmark design
  • Exposure to bias analysis, robustness testing, or dataset auditing
  • Research or competition-based ML background

 

Why This Role

  • Work on cutting-edge ML and LLM systems
  • Direct impact on model quality and real-world performance
  • High ownership in shaping evaluation and data practices
  • Collaborative, fast-moving engineering environment

Required languages

English C1 - Advanced
Published 15 April