LLM QA Engineer / Analyst, up to $2500
We're looking for an LLM QA Engineer/Analyst to analyze and improve the quality of LLM output. The role involves collaborating with LLM engineers to enhance overall performance.
The LLM QA Engineer has basic knowledge of testing, training, and fine-tuning language models, can quickly assess results and make the necessary adjustments, and helps identify the model best suited to our needs in terms of quality and performance.
Many pre-trained models that already perform well are available, so the work starts from an existing model rather than from scratch.
The work strategy for LLM QA involves:
- conducting quick tests,
- making comparisons,
- selecting the best pre-trained model for fine-tuning,
- using advanced LLMs, such as Anthropic's Claude models or OpenAI o1, to create a training dataset,
- fine-tuning the selected pre-trained model, such as Llama, to achieve the best results.
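To give a sense of the "quick tests" and "comparisons" steps above, here is a minimal sketch of a comparison harness. It is an illustration only, not our actual tooling: the names `score_model`, `compare_models`, and `pick_best`, the stub models, and the substring-match scoring rule are all assumptions made for the example. In practice each stub would be replaced by a call to a real candidate model.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: a "model" is any callable that maps a prompt to a response.
Model = Callable[[str], str]
Case = Tuple[str, str]  # (prompt, text expected to appear in the response)


def score_model(model: Model, cases: List[Case]) -> float:
    """Return the fraction of cases whose response contains the expected text."""
    if not cases:
        return 0.0
    hits = sum(
        1 for prompt, expected in cases
        if expected.lower() in model(prompt).lower()
    )
    return hits / len(cases)


def compare_models(models: Dict[str, Model], cases: List[Case]) -> Dict[str, float]:
    """Score every candidate model on the same set of cases."""
    return {name: score_model(model, cases) for name, model in models.items()}


def pick_best(scores: Dict[str, float]) -> str:
    """Return the highest-scoring model; ties go to the alphabetically first name."""
    return max(sorted(scores), key=lambda name: scores[name])


# Hypothetical stand-ins for real candidate models.
def stub_a(prompt: str) -> str:
    answers = {"2+2?": "The answer is 4.", "Capital of France?": "Paris"}
    return answers.get(prompt, "I don't know")


def stub_b(prompt: str) -> str:
    return "4" if prompt == "2+2?" else "unsure"


CASES = [("2+2?", "4"), ("Capital of France?", "paris")]

scores = compare_models({"model_a": stub_a, "model_b": stub_b}, CASES)
best = pick_best(scores)
print(scores, best)
```

Substring matching is deliberately crude; it stands in for whatever quality criterion a real evaluation would use.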
Key Responsibilities:
- Conduct automated and manual checks to verify model responses.
- Automate the preparation of data sets from raw data according to specified requirements.
- Create automation scripts for statistical analysis to handle typical requests, such as:
  - Identifying the most and least popular items from a data set.
  - Sorting or filtering data sets based on predefined criteria.
- Develop automated tests for large data sets, where pass or fail is decided by simple criteria such as predefined numeric or textual values, or ranges of values.
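The scripting tasks listed above could look roughly like the sketch below. It uses only the Python standard library so that it runs as-is; the function names, the `score` field, and the sample data are assumptions made for illustration, and the same logic could equally be written with Pandas for larger data sets.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Tuple


def popularity_extremes(items: Iterable[str]) -> Tuple[Tuple[str, int], Tuple[str, int]]:
    """Return ((most_popular, count), (least_popular, count)).

    Items are ranked by count (descending), then by name (ascending),
    so the result is deterministic when counts are tied.
    """
    counts = Counter(items)
    if not counts:
        raise ValueError("items must not be empty")
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[0], ranked[-1]


def filter_and_sort(rows: List[Dict[str, Any]], min_score: float) -> List[Dict[str, Any]]:
    """Keep rows with score >= min_score, sorted by score, highest first."""
    kept = [row for row in rows if row["score"] >= min_score]
    return sorted(kept, key=lambda row: row["score"], reverse=True)


def out_of_range(rows: List[Dict[str, Any]], low: float, high: float) -> List[Dict[str, Any]]:
    """Return the rows whose score falls outside the inclusive range [low, high]."""
    return [row for row in rows if not (low <= row["score"] <= high)]


# Hypothetical sample data.
items = ["apple", "pear", "apple", "kiwi", "pear", "apple"]
rows = [
    {"id": 1, "score": 0.9},
    {"id": 2, "score": 0.4},
    {"id": 3, "score": 0.7},
]

most, least = popularity_extremes(items)
passing = filter_and_sort(rows, min_score=0.5)
failures = out_of_range(rows, low=0.5, high=1.0)
print(most, least, [r["id"] for r in passing], [r["id"] for r in failures])
```

A check such as `out_of_range` maps directly onto a PyTest test: the test passes when the returned list of failures is empty.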
Technology Stack:
Python: Primary language for scripting, automation, and data analysis.
Machine Learning Frameworks
- PyTorch or TensorFlow: For model fine-tuning and training.
- Hugging Face Transformers: Access to various pre-trained language models.
Language Models
- OpenAI o1: Advanced LLM for creating training datasets.
- Anthropic models (Claude): For high-performance language tasks.
- Llama: Selected pre-trained model for fine-tuning.
Data Processing and Analysis
- Pandas and NumPy: For data manipulation and analysis.
- SciPy and StatsModels: For statistical analysis.
Testing and Automation
- PyTest: Automated testing framework.
- Selenium: For any necessary web-based testing automation.
- Jenkins or GitHub Actions: Continuous Integration/Continuous Deployment (CI/CD) pipelines.
Candidate Selection Criteria:
- A portfolio demonstrating experience with similar tasks.
- Code examples of automation that can be reviewed.
- Platform: Linux. Preferred languages are Python, Bash, Java, and JavaScript; Ruby, Rust, Go, and other emerging technologies are of lesser interest.
Please respond to this job posting with a list of your skills and experience that match the requirements for these tasks.