Labelwise

Prompt Engineer - Health AI Classification (95%+ Accuracy)

Remote | Hourly Contract | Potential full-time

We're launching a supplement analysis platform covering 25,000+ products, but the data accuracy isn't good enough yet. The infrastructure is done. The prompts need fixing.

What you'll do:

  • Optimize 3 critical LLM classification tasks (data filtering, categorization, and extraction)
  • Achieve >95% accuracy with cost-effective models (e.g. Gemini, Grok)
  • Build regression test coverage in an external testing environment
  • Analyze failure patterns and improve prompts systematically
  • Document the methodology for the team

 

Requirements:

  • Proven track record of achieving >95% accuracy on complex classification tasks
  • Experience with high-stakes AI (healthcare/finance/legal)
  • Systematic testing approach (not trial-and-error)
  • Multi-model expertise (GPT-4, Claude, Gemini, etc.)

 

Nice to have:

  • Healthcare/supplement domain knowledge
  • Familiarity with MySQL or a similar database
  • Structured output optimization (JSON mode, function calling)
  • Fine-tuning experience

 

We provide:

  • Complete infrastructure (external testing environment, admin panels)
  • Multi-LLM access (Claude, GPT, Gemini, Grok, OpenRouter)
  • Data analyst for verification support
  • Clear success metrics (>95% accuracy)
  • Competitive hourly rate
     

To Apply:
  • Send us a message by clicking "Apply for the job"
  • Fill out the questionnaire here: https://tally.so/r/mBO904

Required languages

English B2 - Upper Intermediate
Published 28 October