Prompt Engineer - Health AI Classification (95%+ Accuracy)

Remote | Hourly Contract | Potential full-time

We're launching a supplement analysis platform covering 25,000+ products, but the accuracy of our data isn't good enough yet. The infrastructure is done; the prompts need fixing.

What you'll do:

Optimize 3 critical LLM classification tasks (data filtering, categorization, and extraction)
Achieve >95% accuracy with cost-effective models (e.g. Gemini, Grok)
Build regression test coverage in an external testing environment
Analyze failure patterns and improve prompts systematically
Document the methodology for the team

Requirements:

Proven track record of achieving >95% accuracy on complex classification tasks
Experience with high-stakes AI (healthcare/finance/legal)
Systematic testing approach (not trial-and-error)
Multi-model expertise (GPT-4, Claude, Gemini, etc.)

Nice to have:

Healthcare/supplement domain knowledge
Familiarity with MySQL or a similar database
Structured output optimization (JSON mode, function calling)
Fine-tuning experience

We provide:

Complete infrastructure (external testing environment, admin panels)
Multi-LLM access (Claude, GPT, Gemini, Grok, OpenRouter)
Data analyst for verification support
Clear success metrics (>95% accuracy)
Competitive hourly rate

To Apply:
• Send us a message by clicking "Apply for the job"
• Fill out the questionnaire here: https://tally.so/r/mBO904

Required languages

English

B2 - Upper Intermediate

Published 28 October

At least 1 year of experience
Full Remote
Worldwide

ML / AI

Employment: Part-time
Domain: Healthcare / MedTech
Startup
