AI Tooling and Infrastructure Engineer

Hi! We're building a dedicated AI division that creates both internal tooling and customer-facing AI products. Startup energy, backed by an established tech company with a 20+ year track record. Hubs in New York and Krakow, a horizontal structure, and direct impact.

What You'll Do

- Build AI-powered tools for internal teams and external customers
- Work directly with LLMs: prompt engineering, evaluation, quality measurement
- Design multi-model solutions — choosing the right model for the right task
- Create APIs, frameworks, and infrastructure that make AI reliable in production
- Measure and improve AI output quality using evaluation frameworks and human feedback

What We're Looking For

Must have:

- You understand how LLMs behave — not just how to call their APIs
(You can explain why a model hallucinates and what to do about it)
- Hands-on experience with multiple model families
(Claude, GPT, Gemini, DeepSeek, open-source — at least 2-3)
- You've measured AI output quality in production
(Evaluation frameworks, LLM-as-judge, human validation — any approach counts)
- Prompt engineering as a craft, not a buzzword
(You iterate, test, and know why your prompts work)
- Backend development experience (Python preferred, any language accepted)
- 2+ years of professional software development

Signs you're a great fit:

- You know the difference between GPT-4o, Claude Sonnet, and Gemini Flash, and when to use which
- You've built something with AI that solves a real problem (not just a tutorial)
- You've improved AI output quality and can tell us how you measured it

Nice to have (we'll teach the rest):

- Experience building developer tools or frameworks
- Multi-agent systems (LangGraph, CrewAI, AutoGen)
- Knowledge of evaluation tooling (LangSmith, Ragas, DeepEval, custom solutions)
- Background in Java or high-performance systems

Required skills experience

LLM	1 year
AI/ML	1 year
Prompt Engineering	1 year
AI Agents	1 year
Python	1 year

Required languages

English

C1 - Advanced

Antrophic

Published 22 January · Updated 4 March

Experience: 2+ years
Work format: Full Remote
Candidate locations: Countries of Europe or Ukraine

ML / AI

Employment: Full-time
Domain: Fintech
Product
