Prompt Spec Engineer

Product

We are looking for a Prompt Spec Engineer who will define how our AI systems communicate, reason, and deliver high-quality outputs across products powered by LLMs.
 

You will be responsible for designing prompt/spec architectures, building evaluation systems, and ensuring that AI responses are consistent, reliable, and production-ready across RAG systems, AI agents, and automation workflows.


What you will do:
 

  • Design and evolve prompt/spec architecture for AI products, LLM services, RAG systems, and AI agents, defining prompt structures, model behavior rules, response formats, and quality criteria.
  • Develop, optimize, and adapt prompts for different LLMs and use cases, considering output quality, context limitations, cost, and system stability.
  • Build and maintain AI evaluation processes including test scenarios, golden datasets, regression testing, and A/B experiments for LLM performance measurement.
  • Analyze AI outputs, identify root causes of issues (hallucinations, missing context, retrieval problems, incorrect or unsafe responses), and define improvements.
  • Collaborate with AI Engineers and Product Managers to implement prompt solutions into production systems, RAG pipelines, AI agents, and automation workflows, translating business needs into technical specifications.
  • Work with LLM evaluation and monitoring tools (such as Promptfoo and Langfuse) to track metrics, prompt versions, response quality, and system performance.
  • Create and maintain structured documentation: prompt specs, evaluation frameworks, usage guidelines, limitations, and best practices for safe and reliable AI behavior.


What we expect from you:
 

  • 2+ years of experience in LLM-related roles such as Prompt Engineering, AI Product Operations, AI Evaluation, or similar.
  • Hands-on experience designing and optimizing prompts for different LLMs with a strong understanding of model behavior, strengths, and limitations.
  • Experience building prompt/spec systems including prompt structure design, behavior rules, response formatting, testing, and versioning.
  • Strong experience in AI evaluation: test case design, output analysis, issue detection, and iterative improvement of model performance.
  • Understanding of RAG systems, AI agents, structured outputs, tool calling, and principles of reliable LLM system design.
  • Ability to collaborate with engineering and product teams, translate business requirements into AI specifications, and document solutions clearly and consistently.
  • Upper-Intermediate (B2+) English.
     

What we offer:

 

  • Competitive compensation depending on experience.
  • Work on production-level AI systems (LLMs, RAG, AI agents, automation workflows).
  • Office-based work in Lviv with the possibility of a hybrid schedule.
  • Direct impact on AI product quality and system behavior.
  • Access to modern LLM tools and evaluation frameworks.
  • Professional growth in a fast-moving AI-focused environment.
  • Opportunity to shape prompt engineering standards inside the company.
  • Relocation support for candidates from other cities, including assistance with moving and adaptation in Lviv.

Required languages

English B2 - Upper Intermediate
Ukrainian Native
Published 11 June