Senior LLM engineer - custom LLMs creation and fine-tuning

Custom LLM Models Creation and Fine-Tuning

(in Japanese and English social content)

 

The project aims to develop a custom machine learning model to

1) accurately detect country names, and

2) determine whether texts pertain to Japanese national elections,

improving upon common issues found in standard language models.

 

To respond to this offer, list the competencies needed for this task and confirm they align with your skills and experience. 

 

Introduction

Custom model tuning aims to enhance accuracy in entity detection and the handling of special cases. Standard models like Llama, Llama 3, Mistral, QwQ, and Gemma often face several issues:

1. Inaccurate or vague responses unsuitable for extracting entity names or feature values.
2. Inconsistent response formats that complicate parsing.
3. Incorrect outputs in feature detection and attribute tasks.
4. Semantic errors in entity detection and evaluation tasks.

 

The custom model tuning attempt addresses problem #3. This is a relatively simple task for an LLM, typically solved quickly (under 2 seconds per request) when filtering arrays of 10,000-100,000 elements.

 

Task Examples: 

1. Identify countries referenced in the text, either directly or indirectly. 

2. Assess whether the text pertains to national elections in Japan. 

Both tasks analyze Japanese or mixed Japanese-English texts up to 500 characters from Twitter, encompassing official news, personal opinions, and dialogues. 

 

IMPORTANT NOTE: Identifying a country's name that appears explicitly in the text is straightforward and can be handled effectively with regular expressions; it is an ordinary text-processing task rather than one requiring an LLM. The challenge lies in detecting indirect references to a country, as specified in the task definition. For instance, mentions like the "White House" or a US political party point to the USA, while the name of a Japanese political party indicates Japan. This complexity is the main challenge of the task. 
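A minimal sketch of the straightforward regex path for explicit mentions (the name dictionaries here are illustrative placeholders, not the project's actual lists):

```python
import re

# Map each canonical Japanese country name to a pattern of its explicit
# surface forms. These three entries are illustrative only.
COUNTRY_PATTERNS = {
    "日本": r"日本|Japan",
    "アメリカ": r"アメリカ|米国|USA|United States",
    "中国": r"中国|China",
}

def explicit_countries(text: str) -> list[str]:
    """Return Japanese country names explicitly mentioned in the text."""
    return [name for name, pattern in COUNTRY_PATTERNS.items()
            if re.search(pattern, text)]
```

Note that `explicit_countries("ホワイトハウスの発表")` returns an empty list: indirect references like "White House" are exactly what the regex approach misses and what the LLM must cover.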

 

An explicit mention of a country, language, or nationality should preferably also be excluded from the list when it is not the sentence's main subject or object, i.e., not its semantic agent. 

 

Country List Detection with an LLM Model: 

Common errors: 

  • Including geographical locations (regions, prefectures) 
  • Including continents and organization names 
  • Missing indirect references 
  • Incorrect detection based on political parties, positions, politician names 

Result: delimited list of country names in Japanese (0 to 4 elements) 
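Since inconsistent response formats are a known failure mode, the delimited list needs defensive parsing. A sketch, assuming comma or 読点 (、) delimiters (the delimiter choice is an assumption here):

```python
import re

def parse_country_list(reply: str, max_items: int = 4) -> list[str]:
    """Split a model reply into 0-4 unique Japanese country names."""
    # Split on ASCII comma or Japanese 読点, strip whitespace, drop empties.
    items = [s.strip() for s in re.split(r"[,、]", reply) if s.strip()]
    # Deduplicate while preserving order, then cap at max_items.
    seen, unique = set(), []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique[:max_items]
```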

 
Election Topic Detection with an LLM Model: 

Task: determine if Japanese national elections are the main text topic 

 

Positive cases include: voting processes, preparation, election campaigns, program discussions 

Common errors: 

  • Wrong country identification 
  • Incorrect election type detection 
  • False positives on keyword mentions 

Result: clear boolean format (YES/NO)  

 

Platforms and Tools

 

The main analysis tool is Ollama, utilized as both a CLI and REST HTTP service for experimental research and routine processing. Ideally, the custom model should be manageable via Ollama, though this is not mandatory. If Ollama is unavailable, the model must offer a REST HTTP service for local deployment on a dedicated server.
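For reference, a minimal sketch of hitting Ollama's local REST endpoint (`/api/generate` on the default port 11434); the model name and prompt are placeholders:

```python
import json
import urllib.request

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # one JSON object instead of a token stream
    }).encode("utf-8")
    return urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

# With a running Ollama server, the reply text is in the "response" field:
# body = json.loads(urllib.request.urlopen(build_request("llama3", "...")).read())
```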

 

Datasets of 10,000 items will be provided, along with small sets (10-100 items) of typical error cases for selective testing.

That is, we need to split the task into two stages or steps: 

  1. Preparation of quality datasets for training 
  2. Training, tuning, or creating a model from scratch. 
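For stage 1, each verified item could be stored as one JSONL record carrying both labels; the field names below are our assumption, not a format the client has fixed:

```python
import json

def make_record(text: str, countries: list[str], is_election: bool) -> dict:
    """One verified training example covering both tasks."""
    return {
        "input": text,                               # source tweet text
        "countries": "、".join(countries),            # delimited list, 0-4 names
        "election_topic": "YES" if is_election else "NO",
    }

def write_jsonl(records: list[dict], path: str) -> None:
    """Write one JSON object per line (JSONL), keeping Japanese readable."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")
```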

_______

 

Candidate requirements

 

The candidate should have strong experience with these specific tools rather than a broader but shallower knowledge of many frameworks. This focused approach aligns better with the project's specific goals of building a fast, accurate multilingual classification system.

 

Machine Learning & NLP Expertise:

  • Strong background in Natural Language Processing (NLP) and text classification
  • Experience with multilingual text processing (specifically Japanese and English)
  • Proficiency in developing and fine-tuning machine learning models
  • Knowledge of modern language models and their applications

 

Programming & Tools:

  • Proficiency in Python (or C++, Ruby, JavaScript, or another language, depending on the framework and the model's foundations) and relevant ML/NLP libraries
  • Experience with text processing and classification frameworks
  • Familiarity with large language models (LLMs)

 

Data Processing Skills:

  • Experience in handling multilingual datasets
  • Ability to work with various data formats and sources
  • Knowledge of data cleaning and preprocessing techniques
  • Experience with social media data processing (particularly Twitter/X data)

 

Performance Optimization:

  • Ability to optimize models for speed (requirement of 2-second response time)
  • Experience in handling large-scale data processing (10,000-100,000 items)
  • Skills in model optimization and efficiency improvement

 

Task-Specific Experience:

  • Text classification and entity recognition (specifically for country detection)
  • Context-based classification (such as election-related content detection)
  • Experience with short-text classification (500 characters or less)

 

Language Requirements:

  • Proficiency in Japanese language processing
  • Experience with mixed language content (Japanese-English)
  • Understanding of multilingual NLP challenges

 

Education and Experience:

  • Master's or Ph.D. in AI/LLMs or a related field
  • Minimum 3-5 years of experience in LLM and ML/NLP development
  • Demonstrated experience with similar text classification projects, including examples of fine-tuned or from-scratch specialized models 

 

Additional Desired Qualifications:

  • Experience with Japanese language NLP tools and frameworks
  • Knowledge of social media content analysis
  • Background in building production-ready ML systems
  • Understanding of ML model deployment and scaling

 

1.  Experience with Major LLM Platforms and Their Tools:

•  Meta's LLaMA ecosystem (especially LLaMA 2)

•  Experience with Ollama deployment and management

•  Knowledge of other major LLMs: OpenAI API, Anthropic Claude, Cohere

•  Understanding of open-source LLM deployment and fine-tuning

 

2.  Fine-tuning and Adaptation Skills:

•  Experience in adapting pre-trained models for specific tasks

•  Knowledge of efficient fine-tuning techniques (LoRA, QLoRA, PEFT)

•  Understanding of prompt engineering and few-shot learning

•  Experience with model quantization and optimization
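As one concrete example of the LoRA route, a peft-library configuration sketch; every hyperparameter below is a placeholder to be tuned on the project's data, and the target module names depend on the chosen base model:

```python
from peft import LoraConfig

# Illustrative LoRA adapter config for a causal LM; values are not recommendations.
lora_config = LoraConfig(
    r=8,                                   # rank of the low-rank update matrices
    lora_alpha=16,                         # adapter scaling factor
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
```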

 

3.  Practical Skills:

•  Ability to evaluate and choose appropriate base models

•  Experience in model deployment and serving

•  Knowledge of cost-effective approaches to model adaptation

•  Understanding of inference optimization techniques

 

4.  Task-Specific Requirements:

•  Experience with multilingual models (Japanese-English specifically)

•  Knowledge of entity recognition fine-tuning

•  Understanding of context classification 

•  Experience with short text processing optimization

 

Extracting country names from text is called Named Entity Recognition (NER). Extracting them when they are explicitly present in the text is Named Entity Extraction (NEE), and it is Named Entity Identification (NEI) when the entities are not explicitly mentioned and must be inferred from context. 

 

Manual verification of each result is very important because otherwise, even a small percentage of incorrect data can significantly damage the trained model and reduce the statistical quality of its future performance. 
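A simple way to make that manual pass tractable is reproducible spot-sampling before data enters training; the sample size and seed below are arbitrary choices:

```python
import random

def review_sample(items: list[dict], k: int = 50, seed: int = 42) -> list[dict]:
    """Draw a reproducible random subset for human review."""
    rng = random.Random(seed)  # fixed seed: same sample every run
    return rng.sample(items, min(k, len(items)))
```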

 

The first task is preparing quality samples for model training. It focuses on the end result: obtaining quality datasets with verified outputs for both tasks, the country list and the Japanese-election topic. 

 

We expect quality datasets of various sizes suitable for LLM training as output. 


We can stipulate that the first small dataset (0.5-2K) should be provided within a week, and larger ones (>5K) later... 

A sample dataset will be provided in JSON format, filtered by topic and likely to contain target entities: 

- a 55K-element archive 
- a 1.5K-element dataset with results from different models for the country-list task, which may be useful as comparison examples (we already have these) 
 
