Senior AI/ML Engineer (Applied AI, Evaluation-Driven)

SolvIT AS

Who:
We’re looking for a Senior AI/ML Engineer who combines strong machine learning fundamentals with a disciplined, evaluation-first approach to improving production AI systems.

What:
You’ll improve real AI systems used by real users through metrics, experimentation, system design, and iteration.

When:
Start ASAP.

Where:
Remote (any location).

Type:
Full-time.

English:
Upper-intermediate or Advanced required.

The Role in a Nutshell

This is not a pure research position.

You’ll work on improving production AI systems through structured evaluation, diagnosing failures, refining prompts and agent workflows, and designing measurable improvements.

You won’t be working in isolation — you’ll improve systems that users depend on daily.

Responsibility Breakdown (Approximate)

AI evaluation & KPI design — ~30%
Prompt and multi-agent system design — ~30%
ML systems (recommendation, optimization, etc.) — ~30%
Engineering integration — ~10%

What You’ll Work On

AI Evaluation & System Quality (Core Responsibility)

Design evaluation strategies for LLM and agent workflows
Define metrics and KPIs for AI system performance
Build and maintain evaluation datasets
Systematically debug production AI failures
Compare system behavior against baselines
Run structured experiments and measure real impact

This is a central part of the role.

Multi-Agent AI Systems

Improve agent orchestration and workflows
Diagnose failures across agent pipelines
Refine system prompts and agent interactions
Improve reliability, latency, and output quality

Applied ML Areas (Depth > Breadth)

You’ll contribute to one or more of the following areas:

Recommendation systems (ranking & personalization)
Itinerary optimization & constraint-based planning
LLM-based reasoning systems
(Optional) Computer vision pipelines

Depth in at least one of these areas is more important than shallow experience in many.

Engineering Environment

We use:

Golang (primary production language)
Python for ML workflows
Postgres, Redis, internal services

You don’t need to be a Go expert on day one, but you must be comfortable reading and modifying production code.

Backend engineers own infrastructure-heavy services — your focus is on AI system behavior, correctness, and measurable improvement.

What We’re Looking For

1️⃣ Strong AI/ML Fundamentals (Must-Have)

You understand the theory behind what you build and can choose appropriate methods.

Examples:

Evaluation metrics (precision, recall, F1, etc.)
Ranking & recommendation concepts
Embeddings and similarity
Experimentation methodology

Not required:

Academic publications
Advanced theoretical math
Large-scale model training experience

2️⃣ Evaluation-Driven Mindset (Most Important Signal)

You:

Think in metrics and baselines
Design experiments instead of guessing
Measure improvements quantitatively
Debug failures methodically

This is the most important quality for this role.

3️⃣ Experience with LLM Systems

You’ve worked with:

Prompt design
Agent workflows
Evaluating LLM outputs
Production LLM integrations

4️⃣ Ability to Ship Production Systems

You can:

Turn ideas into working systems
Iterate based on results
Balance exploration with delivery

5️⃣ Programming Ability

You’re comfortable writing production code in at least one language (Python, Go, or similar) and learning others as needed.

Strong Signals (Nice to Have)

Experience improving AI systems post-deployment
Recommendation or ranking system experience
Optimization / constraint-based systems
Computer vision experience
Experience building evaluation frameworks
Golang experience
Startup or small-team engineering background

Ideal Candidate

5+ years of experience in AI/ML (exceptionally strong candidates with 3+ years will be considered)
Strong English communication skills
Comfortable working in a fast-moving environment
Focused on measurable impact, not just models

Required languages

English: B2 (Upper Intermediate)
Ukrainian: Native

Python, Go

Published 16 February

Experience: from 3 years
Full remote
Countries where we consider candidates: Europe or Ukraine

ML / AI

Employment: Full-time
Domain: Other
Outstaff