Jobs Data Science
· 74 views · 4 applications · 6d
IT/Data management specialist to $500
Full Remote · Countries of Europe or Ukraine · Product · 2 years of experience · English - B2
About Keymakr
Keymakr specializes in end-to-end dataset preparation, including video and image annotation and labeling for AI projects.
We take a personalized approach, delivering customized solutions tailored to each client's needs. Our strong in-house R&D team develops and adapts annotation tools for every client, allowing us to handle even the most complex requirements.
About the Role:
We are looking for a Data Management Specialist to manage, process, and maintain large volumes of data and media assets used in AI and technology projects.
This role focuses on ensuring data integrity, consistency, and availability while optimizing workflows through automation and close collaboration with cross-functional teams.
Key Responsibilities:
- Manage and manipulate data using Linux file systems and native OS tools;
- Work with media files, including video formats, codecs, resolution, and frame rate (FPS);
- Verify data integrity using checksum and verification tools;
- Pack and unpack large data archives efficiently;
- Automate data-related workflows using scripting languages;
- Prepare, structure, and maintain datasets in structured formats;
- Maintain cloud-based data storage and perform data migration;
- Collaborate with Project, QA, and Technical teams to ensure smooth data operations;
- Clearly and consistently document workflows, processes, and task updates.
Technical Skills:
- Strong knowledge of Linux file systems and administration (Ubuntu/Debian, 2+ years);
- Experience with Windows OS administration;
- Proficiency in Python and Bash / Shell scripting;
- Experience with archive management tools for large datasets;
- Knowledge of checksum verification tools such as md5sum and sha256sum (see the sketch after this list);
- Familiarity with tools for checking video properties, codecs, and formats;
- Solid knowledge of JSON and CSV formats;
- Experience with cloud storage platforms: AWS S3, Google Cloud Storage, or Azure;
- Familiarity with JIRA, Slack, Confluence, and Google Docs.
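For illustration only (this is not part of the posting), here is a minimal Python sketch of the checksum verification mentioned above, equivalent to checking a file against a sha256sum digest; the file name and expected digest are hypothetical placeholders.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

if __name__ == "__main__":
    archive = Path("dataset_batch_01.tar.gz")  # hypothetical archive
    expected = "0123abcd..."                   # placeholder digest from a .sha256 manifest
    actual = sha256_of(archive)
    print("OK" if actual == expected else f"MISMATCH: {actual}")
```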
Nice to Have:
- Experience with point clouds / 3D data;
- Strong understanding of complex technical specifications;
- Experience working with large-scale datasets;
English & Communication:
- English level B2 (Upper-Intermediate) or higher;
- Ability to read and understand technical documentation and project requirements;
- Clear written communication in English (reports, updates, documentation);
- Participation in English-language meetings when required.
-
· 38 views · 5 applications · 18d
LLM Research Engineer
Full Remote · Ukraine · Product · 3 years of experience · English - B2
We are seeking an experienced Data Scientist with a passion for large language models (LLMs) and cutting-edge AI research. In this role, you will design and prototype data preparation pipelines, collaborating closely with data engineers to transform your prototypes into scalable production pipelines; design and implement a state-of-the-art evaluation and benchmarking framework to measure and guide model quality; and carry out end-to-end LLM training. You will work alongside top AI researchers and engineers, ensuring our models are not only powerful but also aligned with user needs, cultural context, and ethical standards.
What you will do
- Curate datasets for pre-training, supervised fine-tuning, and alignment;
- Research and develop best practices and novel techniques in LLM training and evaluation pipelines;
- Collaborate closely with data engineers, annotators, linguists, and domain experts to scale data processes, define evaluation tasks and collect high-quality feedback.
Qualifications and experience needed
Education & Experience:
- 3+ years of experience in Data Science or Machine Learning, preferably with a focus on NLP;
- An advanced degree (Master's or PhD) in Computer Science, Computational Linguistics, Machine Learning, or a related field is highly preferred.
GenAI & NLP Expertise:
- Practical experience with fine-tuning LLM / VLM models;
- Hands-on experience with modern NLP approaches, including embedding models, semantic search, text classification, sequence tagging (NER), transformers/LLMs, RAGs.
ML & Programming Skills:
- Strong experience with deep learning frameworks such as PyTorch or JAX for building models;
- Ability to write efficient, clean code and debug complex model issues.
A plus would be
Advanced NLP/ML Techniques:
- Applied experience using Reinforcement Learning in NLP / LLM settings;
- Prior work on LLM safety, fairness, and bias mitigation;
- Experience generating and curating synthetic datasets for Supervised Fine-Tuning (SFT), including quality control and scaling considerations.
Research & Community:
- Publications in NLP/ML conferences or contributions to open-source NLP projects;
- Active participation in the AI community or demonstrated continuous learning (e.g., Kaggle competitions, research collaborations) indicates a passion for staying at the forefront of the field.
MLOps & Infrastructure:
- Hands-on experience with containerization (Docker) and orchestration (Kubernetes) for ML, as well as ML workflow tools (MLflow, Airflow);
- Experience in working alongside MLOps engineers to streamline the deployment and monitoring of NLP models;
- Experience with cloud platforms (AWS, GCP, or Azure) and big data technologies (Spark, Hadoop, Ray, Dask) for scaling data processing or model training is a plus.
Problem-Solving:
- Innovative mindset with the ability to approach open-ended AI problems creatively;
- Comfort in a fast-paced R&D environment where you can adapt to new challenges, propose solutions, and drive them to implementation
What we offer
- Office or remote, it's up to you. You can work from anywhere, and we will arrange your workplace;
- Remote onboarding;
- Performance bonuses;
- We train employees with the opportunity to learn through the companyβs library, internal resources, and programs from partners;
- Health and life insurance;
- Wellbeing program and corporate psychologist;
- Reimbursement of expenses for Kyivstar mobile communication.
-
· 47 views · 8 applications · 21d
Machine Learning Engineer
Hybrid Remote · Argentina, Brazil, Poland, Romania, Ukraine · 5 years of experience · English - None
About the role
As an ML Engineer, you'll be responsible for building and operationalizing ML models, integrating them with existing data systems, and ensuring they perform reliably in production. You will collaborate with stakeholders to understand use cases, translate them into machine learning workflows, and design scalable pipelines that handle large volumes of data efficiently. By combining strong engineering skills with an understanding of applied ML, you'll ensure that predictive analytics, forecasting, and optimization models directly support the client's real estate, accounting, and investment reporting needs.
About the client:
Our client is a leading, $35B global commercial real estate services and investment firm. To accelerate its digital transformation, the organization is adopting Palantir Foundry as a core data and application platform. The focus is on building scalable solutions that showcase Foundry's value, drive adoption, and create a pipeline of business-funded initiatives. These efforts aim to enable efficient data transformation and client-specific reporting at enterprise scale. Together, they will strengthen reporting capabilities, improve efficiency, and establish a foundation for data-driven applications across global operations.
Who are we looking for?
Skills & Experience
- Master's or PhD in Data Science, Computer Science, or a technology-focused field.
- 5+ years of hands-on experience designing and deploying AI or ML models.
- Hands-on experience setting up and maintaining large-scale data science and machine learning projects.
- Skilled with deep learning libraries like PyTorch, Keras, TensorFlow, and HuggingFace toolkits.
- Solid knowledge of machine learning, especially Natural Language Processing (NLP) and large language models (LLMs).
- Advanced Python coding and strong experience with tools like Scikit-Learn, NumPy, SciPy, Pandas, and XGBoost.
- Comfortable using SQL for working with large datasets and uncovering insights.
- Experience creating solutions in cloud platforms such as AWS SageMaker or Microsoft Azure.
- Able to work on your own or as part of a group to hit targets.
Nice to have
- Focused on getting results for clients and working in an agile development setup.
- Curiosity, attention to detail, and drive to solve difficult data problems.
- Can design or review how a modelβs success is measured to line up with business goals.
- Experience with LLMs in Palantir Foundry
- Shows initiative by researching new solutions with some guidance.
Responsibilities
- Analyze data to find patterns and create machine learning solutions for challenging business issues
- Understand the AI/ML program journey to formulate relevant high-impact business questions that can be answered through data analysis.
- Develop AI/ML solutions that can be scaled across various business use cases, starting from PoC to MVP and launching into production.
- Build and improve AI models - like those for prediction, automation, or natural language tasks.
- Use both in-house tools and the latest technology to increase operational productivity and efficiency as well as predictive analytics.
- Break down and share your process and results in a way that's clear to non-technical folks, like business managers and executives.
- Keep up with and put into practice the latest AI and machine learning techniques.
What we offer
Work:
- Flexible working hours;
- Collaborative, friendly team environment;
- Remote/Hybrid work;
Life:
- Company social events;
- Annual corporate parties;
Health:
- Comprehensive medical insurance;
Education:
- Allowances for professional education;
- English language courses with native speakers;
- Internal knowledge-sharing sessions.
-
· 35 views · 3 applications · 15d
Senior Data Scientist
Full Remote · Ukraine · 5 years of experience · English - None
About the company
Jappware is a software development company that delivers innovative and reliable digital solutions for international clients.
We specialize in end-to-end product development, from ideation and design to architecture, development, and DevOps support.
About the project
We are looking for a Senior Data Scientist to join our growing team in Lviv or remotely.
We're building a brand-new Real Estate platform with an AI-powered Lead Generation Pipeline at its core.
This is a hands-on role combining Data Science, Analytics, and Data Engineering, perfect for someone who wants to build from scratch and influence product direction.
Responsibilities
- Build the end-to-end Lead Generation Pipeline for Real Estate
- Create and manage structured property feature sets
- Run EDA, modeling, and hypothesis testing
- Design and maintain ETL/ELT pipelines
- Work with feature stores (Parquet, etc.)
- Collaborate with engineering to integrate models into production
- Shape data architecture and validate product ideas through prototypes
Requirements
- 5+ years in Data Science
- Strong Python & SQL knowledge
- Proven ability to build analytical and data pipelines from scratch
- Hands-on, autonomous, proactive mindset
- Strong communication and analytical thinking skills
What we are offering
- Challenging and innovative environments.
- Flexible schedule and remote-friendly culture.
- 20 paid vacation days and 15 sick leave days.
- Quarterly budget for learning & development activities.
- Team events, workshops, and internal tech meetups.
- IT Club membership.
Steps to Expect in Jappware's Hiring Process:
- Intro Interview
- Technical Interview
- Offer
Our Mission:
To build innovative software in trustworthy partnerships.
We aim to become a reliable and forward-thinking technology partner, helping businesses grow through innovation and mutual trust.
Our Values
Trust - Every successful partnership is built on openness, honesty, and sincerity.
Openness - We encourage people to share ideas freely and foster transparent communication.
Partnership - We treat our clients' and teammates' goals as our own.
Proactiveness - We act ahead of possible outcomes and anticipate challenges to deliver the best results.
Social Responsibility
At Jappware, we stand with our people and our country.
We proudly support Ukraine's resilience, innovation, and global contribution to the IT community.
Through donations, volunteering, and social initiatives, we help strengthen our local communities and the nation's future.
Jappware stands with Ukraine - Glory to Ukraine!
Follow us via LinkedIn, DOU, Instagram, Facebook
-
· 23 views · 7 applications · 19d
Senior Machine Learning Engineer
Hybrid Remote · Worldwide · Product · 4 years of experience · English - B2
Position Title: Senior Machine Learning Engineer
Reports To: Project Management Team
Direct Reports: None
Location: Porto, Portugal
Job Description
We are looking for a Senior Machine Learning Engineer to join our Data Science team. This is a senior role for an experienced professional with a proven record of building, deploying, and maintaining scalable ML systems in production environments. You will lead the ML infrastructure end-to-end, from model training and deployment to automation and monitoring, ensuring reliability, efficiency, and business impact. Your work will enable data-driven decisions at scale, guiding our teams toward smarter, faster, and more measurable outcomes.
Responsibilities
- Apply your engineering skills and in-depth knowledge to run applied statistics, ML infrastructure, model deployment, and production system design, with a focus on delivering inference from structured, tabular data.
- Build scalable ML pipeline automation, establish MLOps best practices, and mentor the development team on ML system architecture.
- Be an excellent communicator, capable of presenting outcomes and caveats of technical solutions to non-technical teams.
- Mentor engineers and establish technical best practices from scratch
- Share knowledge to expand the overall ML engineering capabilities of our organization.
- Maintain clear and comprehensive documentation of the work done, and keep all the critical information organized and easy to digest for both data and project team members
- Demonstrate commitment to staying current with the latest MLOps tools, infrastructure patterns, and production ML best practices.
Required Qualifications
- Bachelor's in Computer Science, Data Science, or related field.
- Minimum of 5 years of related experience with a Bachelor's degree, or 3 years with a Master's degree.
- Experience working with large-scale, structured datasets.
- Proven experience leading technical initiatives and defining ML infrastructure standards.
- Extensive experience with ML infrastructure projects involving model serving, ML pipeline automation, monitoring, and MLOps tooling.
- Excellent understanding of software engineering principles, system design, and ML model optimization for production environments.
- High proficiency with Python programming language and software engineering best practices
- High proficiency with Python libraries used to implement applied statistics (numpy, pandas, matplotlib, statsmodels, scikit-learn)
- High proficiency with SQL and experience with cloud-based data warehouses (BigQuery preferred) and data pipeline technologies (dbt preferred)
- Strong understanding of cloud infrastructure, containerization (Docker/Kubernetes), and distributed systems
- Excellent written and verbal communication skills with the ability to educate and influence technical teams
- Fluent English (spoken and written).
Nice to Have
- Experience working in the online advertising industry
- Knowledge of the film industry and its unique marketing and audience challenges
- Experience with Ruby on Rails full-stack development
About Gruvi
Gruvi is a data-driven media and insights agency dedicated to the film industry. We combine creativity, data, and proprietary technology to deliver impactful campaigns for film distributors and exhibitors worldwide. With an international presence and a team of media and advertising experts, we combine advertising campaigns and proprietary data to push the boundaries of digital media and help our clients drive meaningful results. We are passionate about film and committed to using cutting-edge insights to ensure great films find their audience.
-
· 22 views · 2 applications · 13d
Senior/Middle Data Scientist
Full Remote · Ukraine · Product · 3 years of experience · English - B1
Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.
About the client:
Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.
About the role:
We are looking for an experienced Senior/Middle Data Scientist with a passion for Large Language Models (LLMs) and cutting-edge AI research. In this role, you will design and implement a state-of-the-art evaluation and benchmarking framework to measure and guide model quality, and personally train LLMs with a strong focus on Reinforcement Learning from Human Feedback (RLHF). You will work alongside top AI researchers and engineers, ensuring the models are not only powerful but also aligned with user needs, cultural context, and ethical standards.
Requirements:
Education & Experience:
- 3+ years of experience in Data Science or Machine Learning, preferably with a focus on NLP.
- Proven experience in machine learning model evaluation and/or NLP benchmarking.
- Advanced degree (Master's or PhD) in Computer Science, Computational Linguistics, Machine Learning, or a related field is highly preferred.
NLP Expertise:
- Good knowledge of natural language processing techniques and algorithms.
- Hands-on experience with modern NLP approaches, including embedding models, semantic search, text classification, sequence tagging (NER), transformers/LLMs, RAGs.
- Familiarity with LLM training and fine-tuning techniques.
ML & Programming Skills:
- Proficiency in Python and common data science and NLP libraries (pandas, NumPy, scikit-learn, spaCy, NLTK, langdetect, fasttext).
- Strong experience with deep learning frameworks such as PyTorch or TensorFlow for building NLP models.
- Solid understanding of RLHF concepts and related techniques (preference modeling, reward modeling, reinforcement learning).
- Ability to write efficient, clean code and debug complex model issues.
Data & Analytics:
- Solid understanding of data analytics and statistics.
- Experience creating and managing test datasets, including annotation and labeling processes.
- Experience in experimental design, A/B testing, and statistical hypothesis testing to evaluate model performance.
- Comfortable working with large datasets, writing complex SQL queries, and using data visualization to inform decisions.
Deployment & Tools:
- Experience deploying machine learning models in production (e.g., using REST APIs or batch pipelines) and integrating with real-world applications.
- Familiarity with MLOps concepts and tools (version control for models/data, CI/CD for ML).
- Experience with cloud platforms (AWS, GCP, or Azure) and big data technologies (Spark, Hadoop, Ray, Dask) for scaling data processing or model training.
Communication:
- Experience working in a collaborative, cross-functional environment.
- Strong communication skills to convey complex ML results to non-technical stakeholders and to document methodologies clearly.
Nice to have:
Advanced NLP/ML Techniques:
- Prior work on LLM safety, fairness, and bias mitigation.
- Familiarity with evaluation metrics for language models (perplexity, BLEU, ROUGE, etc.) and with techniques for model optimization (quantization, knowledge distillation) to improve efficiency.
- Knowledge of data annotation workflows and human feedback collection methods.
Research & Community:
- Publications in NLP/ML conferences or contributions to open-source NLP projects.
- Active participation in the AI community or demonstrated continuous learning (e.g., Kaggle competitions, research collaborations) indicating a passion for staying at the forefront of the field.
Domain & Language Knowledge:
- Familiarity with the Ukrainian language and context.
- Understanding of cultural and linguistic nuances that could inform model training and evaluation in a Ukrainian context.
- Knowledge of Ukrainian benchmarks, or familiarity with other evaluation datasets and leaderboards for large models, can be an advantage given the project's focus.
MLOps & Infrastructure:
- Hands-on experience with containerization (Docker) and orchestration (Kubernetes) for ML, as well as ML workflow tools (MLflow, Airflow).
- Experience in working alongside MLOps engineers to streamline the deployment and monitoring of NLP models.
Problem-Solving:
- Innovative mindset with the ability to approach open-ended AI problems creatively.
- Comfort in a fast-paced R&D environment where you can adapt to new challenges, propose solutions, and drive them to implementation.
Responsibilities:
- Analyze benchmarking datasets, define gaps, and design, implement, and maintain a comprehensive benchmarking framework for the Ukrainian language.
- Research and integrate state-of-the-art evaluation metrics for factual accuracy, reasoning, language fluency, safety, and alignment (a minimal metric example follows this list).
- Design and maintain testing frameworks to detect hallucinations, biases, and other failure modes in LLM outputs.
- Develop pipelines for synthetic data generation and adversarial example creation to challenge the model's robustness.
- Collaborate with human annotators, linguists, and domain experts to define evaluation tasks and collect high-quality feedback.
- Develop tools and processes for continuous evaluation during model pre-training, fine-tuning, and deployment.
- Research and develop best practices and novel techniques in LLM training pipelines.
- Analyze benchmarking results to identify model strengths, weaknesses, and improvement opportunities.
- Work closely with other data scientists to align training and evaluation pipelines.
- Document methodologies and share insights with internal teams.
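As an illustration of the evaluation-metric work described in this posting, here is a minimal sketch of computing perplexity for a causal language model with the Hugging Face transformers library; the checkpoint name (gpt2) and the sample sentence are stand-ins, not project specifics.

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder checkpoint; any causal LM could be evaluated this way
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

text = "Приклад українського речення для оцінки моделі."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # With labels equal to input_ids, the model returns mean cross-entropy over tokens.
    outputs = model(**inputs, labels=inputs["input_ids"])

print(f"perplexity: {math.exp(outputs.loss.item()):.2f}")
```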
The company offers:
- Competitive salary.
- Equity options in a fast-growing AI company.
- Remote-friendly work culture.
- Opportunity to shape a product at the intersection of AI and human productivity.
- Work with a passionate, senior team building cutting-edge tech for real-world business use. -
· 61 views · 3 applications · 29d
Senior/Middle Data Scientist
Full Remote · Ukraine · Product · 3 years of experience · English - B1
About us:
Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.
About the client:
Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.
About the role:
We are looking for an experienced Senior/Middle Data Scientist with a passion for Large Language Models (LLMs) and cutting-edge AI research. In this role, you will focus on designing and prototyping data preparation pipelines, collaborating closely with data engineers to transform your prototypes into scalable production pipelines, and actively developing model training pipelines with other talented data scientists. Your work will directly shape the quality and capabilities of the models by ensuring we feed them the highest-quality, most relevant data possible.
Requirements:
Education & Experience:
- 3+ years of experience in Data Science or Machine Learning, preferably with a focus on NLP.
- Proven experience in data preprocessing, cleaning, and feature engineering for large-scale datasets of unstructured data (text, code, documents, etc.).
- Advanced degree (Master's or PhD) in Computer Science, Computational Linguistics, Machine Learning, or a related field is highly preferred.
NLP Expertise:
- Good knowledge of natural language processing techniques and algorithms.
- Hands-on experience with modern NLP approaches, including embedding models, semantic search, text classification, sequence tagging (NER), transformers/LLMs, RAGs.
- Familiarity with LLM training and fine-tuning techniques.
ML & Programming Skills:
- Proficiency in Python and common data science and NLP libraries (pandas, NumPy, scikit-learn, spaCy, NLTK, langdetect, fasttext).
- Strong experience with deep learning frameworks such as PyTorch or TensorFlow for building NLP models.
- Ability to write efficient, clean code and debug complex model issues.
Data & Analytics:
- Solid understanding of data analytics and statistics.
- Experience in experimental design, A/B testing, and statistical hypothesis testing to evaluate model performance.
- Comfortable working with large datasets, writing complex SQL queries, and using data visualization to inform decisions.
Deployment & Tools:
- Experience deploying machine learning models in production (e.g., using REST APIs or batch pipelines) and integrating with real-world applications.
- Familiarity with MLOps concepts and tools (version control for models/data, CI/CD for ML).
- Experience with cloud platforms (AWS, GCP, or Azure) and big data technologies (Spark, Hadoop, Ray, Dask) for scaling data processing or model training.
Communication & Personality:
- Experience working in a collaborative, cross-functional environment.
- Strong communication skills to convey complex ML results to non-technical stakeholders and to document methodologies clearly.
- Ability to rapidly prototype and iterate on ideas
Nice to have:
Advanced NLP/ML Techniques:
- Familiarity with evaluation metrics for language models (perplexity, BLEU, ROUGE, etc.) and with techniques for model optimization (quantization, knowledge distillation) to improve efficiency.
- Understanding of FineWeb2 or similar data-processing pipeline approaches.
Research & Community:
- Publications in NLP/ML conferences or contributions to open-source NLP projects.
- Active participation in the AI community or demonstrated continuous learning (e.g., Kaggle competitions, research collaborations) indicating a passion for staying at the forefront of the field.
Domain & Language Knowledge:
- Familiarity with the Ukrainian language and context.
- Understanding of cultural and linguistic nuances that could inform model training and evaluation in a Ukrainian context.
- Knowledge of Ukrainian text sources and data sets, or experience with multilingual data processing, can be an advantage given the project's focus.
MLOps & Infrastructure:
- Hands-on experience with containerization (Docker) and orchestration (Kubernetes) for ML, as well as ML workflow tools (MLflow, Airflow).
- Experience in working alongside MLOps engineers to streamline the deployment and monitoring of NLP models.
Problem-Solving:
- Innovative mindset with the ability to approach open-ended AI problems creatively.
- Comfort in a fast-paced R&D environment where you can adapt to new challenges, propose solutions, and drive them to implementation.
Responsibilities:
- Design, prototype, and validate data preparation and transformation steps for LLM training datasets, including cleaning and normalization of text, filtering of toxic content, de-duplication, de-noising, detection and deletion of personal data, etc. (a minimal de-duplication sketch follows this list).
- Form specific SFT/RLHF datasets from existing data, including data augmentation and labeling with an LLM as teacher.
- Analyze large-scale raw text, code, and multimodal data sources for quality, coverage, and relevance.
- Develop heuristics, filtering rules, and cleaning techniques to maximize training data effectiveness.
- Collaborate with data engineers to hand over prototypes for automation and scaling.
- Research and develop best practices and novel techniques in LLM training pipelines.
- Monitor and evaluate data quality impact on model performance through experiments and benchmarks.
- Research and implement best practices in large-scale dataset creation for AI/ML models.
- Document methodologies and share insights with internal teams.
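As a minimal illustration of the de-duplication step referenced in the responsibilities above, the sketch below normalizes text and drops exact duplicates by hash; a real pipeline would layer near-duplicate detection (e.g. MinHash/LSH), language identification, and toxicity filtering on top. Sample strings are hypothetical.

```python
import hashlib
import re
import unicodedata

def normalize(text: str) -> str:
    """Light normalization: Unicode NFC, lowercasing, collapsed whitespace."""
    text = unicodedata.normalize("NFC", text)
    return re.sub(r"\s+", " ", text).strip().lower()

def deduplicate(docs: list[str]) -> list[str]:
    """Exact de-duplication on normalized text."""
    seen: set[str] = set()
    unique = []
    for doc in docs:
        key = hashlib.sha1(normalize(doc).encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique

if __name__ == "__main__":
    corpus = ["Привіт,  світ!", "привіт, світ!", "Інший документ."]
    print(deduplicate(corpus))  # the second item is dropped as a duplicate of the first
```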
The company offers:
- Competitive salary.
- Equity options in a fast-growing AI company.
- Remote-friendly work culture.
- Opportunity to shape a product at the intersection of AI and human productivity.
- Work with a passionate, senior team building cutting-edge tech for real-world business use. -
· 19 views · 1 application · 29d
Senior Data Scientist/NLP Lead
Office Work · Ukraine (Kyiv) · Product · 5 years of experience · English - B2
About us:
Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.
About the client:
Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.
About the role:
We are looking for an experienced Senior Data Scientist / NLP Lead to spearhead the development of cutting-edge natural language processing solutions for the Ukrainian LLM project. You will lead the NLP team in designing, implementing, and deploying large-scale language models and NLP algorithms that power the products. This role is critical to the mission of advancing AI in the Ukrainian language context, and offers the opportunity to drive technical decisions, mentor a team of data scientists, and shape the future of AI capabilities in Ukraine.
Requirements:
Education & Experience:
- 5+ years of experience in data science or machine learning, with a strong focus on NLP.
- Proven track record of developing and deploying NLP or ML models at scale in production environments.
- An advanced degree (Master's or PhD) in Computer Science, Computational Linguistics, Machine Learning, or a related field is highly preferred.
NLP Expertise:
- Deep understanding of natural language processing techniques and algorithms.
- Hands-on experience with modern NLP approaches, including embedding models, text classification, sequence tagging (NER), and transformers/LLMs.
- Deep understanding of transformer architectures and knowledge of LLM training and fine-tuning techniques, hands-on experience developing solutions on LLM, and knowledge of linguistic nuances in Ukrainian or other languages.
Advanced NLP/ML Techniques:
- Experience with evaluation metrics for language models (perplexity, BLEU, ROUGE, etc.) and with techniques for model optimization (quantization, knowledge distillation) to improve efficiency.
- Background in information retrieval or RAG (Retrieval-Augmented Generation) is a plus for building systems that augment LLMs with external knowledge.
ML & Programming Skills:
- Proficiency in Python and common data science libraries (pandas, NumPy, scikit-learn).
- Strong experience with deep learning frameworks such as PyTorch or TensorFlow for building NLP models.
- Ability to write efficient, clean code and debug complex model issues.
Data & Analytics:
- Solid understanding of data analytics and statistics.
- Experience in experimental design, A/B testing, and statistical hypothesis testing to evaluate model performance.
- Experience building a representative benchmarking framework from business requirements for an LLM.
- Comfortable working with large datasets, writing complex SQL queries, and using data visualization to inform decisions.
Deployment & Tools:
- Experience deploying machine learning models in production (e.g., using REST APIs or batch pipelines) and integrating with real-world applications.
- Familiarity with MLOps concepts and tools (version control for models/data, CI/CD for ML).
- Experience with cloud platforms (AWS, GCP or Azure) and big data technologies (Spark, Hadoop) for scaling data processing or model training is a plus.
- Hands-on experience with containerization (Docker) and orchestration (Kubernetes) for ML, as well as ML workflow tools (MLflow, Airflow).
Leadership & Communication:
- Demonstrated ability to lead technical projects and mentor junior team members.
- Strong communication skills to convey complex ML results to non-technical stakeholders and to document methodologies clearly.
Responsibilities:
- Lead end-to-end development of NLP and LLM models - from data exploration and model prototyping to validation and production deployment. This includes designing novel model architectures or fine-tuning state-of-the-art transformer models (e.g., BERT, GPT) to solve project-specific language tasks (a minimal fine-tuning sketch follows this list).
- Analyze large text datasets (Ukrainian and multilingual corpora) to extract insights and build robust training datasets.
- Guide data collection and annotation efforts to ensure high-quality data for model training.
- Develop and implement NLP algorithms for a range of tasks such as text classification, named entity recognition, semantic search, and conversational AI.
- Stay up-to-date with the latest research to apply transformer-based models, embeddings, and other modern NLP techniques in the solutions.
- Establish evaluation metrics and validation frameworks for model performance, including accuracy, factuality, and bias.
- Design A/B tests and statistical experiments to compare model variants and validate improvements.
- Deploy and integrate NLP models into production systems in collaboration with engineers - ensuring models are scalable, efficient, and well-monitored in a real-world setting.
- Optimize model inference and troubleshoot issues such as model drift or data pipeline bottlenecks.
- Provide technical leadership and mentorship to the NLP/ML team.
- Review code and research, uphold best practices in ML (version control, reproducibility, documentation), and foster a culture of continuous learning and innovation.
- Collaborate cross-functionally with product managers, software engineers, and MLOps engineers to align NLP solutions with product goals and infrastructure capabilities.
- Communicate complex data science concepts to stakeholders and incorporate their feedback into model development.
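For illustration of the fine-tuning work mentioned in the responsibilities above, here is a minimal sketch using the Hugging Face Trainer API on a text-classification task; the checkpoint (xlm-roberta-base) and dataset (imdb) are placeholders rather than the project's actual model or data.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "xlm-roberta-base"  # placeholder multilingual checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

dataset = load_dataset("imdb")  # placeholder corpus; a Ukrainian dataset would be used in practice

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="./ft-checkpoints",
    num_train_epochs=1,
    per_device_train_batch_size=16,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
    eval_dataset=tokenized["test"].select(range(500)),
)
trainer.train()
```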
The company offers:
- Competitive salary.
- Equity options in a fast-growing AI company.
- Remote-friendly work culture.
- Opportunity to shape a product at the intersection of AI and human productivity.
- Work with a passionate, senior team building cutting-edge tech for real-world business use. -
· 13 views · 1 application · 29d
Data scientist with Java expertise
Full Remote · Ukraine · 5 years of experience · English - B2
Project Description:
The primary goal of the project is the modernization, maintenance and development of an eCommerce platform for a big US-based retail company, serving millions of omnichannel customers each week.
Solutions are delivered by several Product Teams focused on different domains - Customer, Loyalty, Search and Browse, Data Integration, Cart.
Current overriding priorities are onboarding of new brands, re-architecture, database migrations, and migration of microservices to a unified cloud-native solution without any disruption to business.
Responsibilities:
We are looking for an experienced Data Engineer with Machine Learning expertise and good understanding of search engines, to work on the following:
- Design, develop, and optimize semantic and vector-based search solutions leveraging Lucene/Solr and modern embeddings (see the sketch after this list).
- Apply machine learning, deep learning, and natural language processing techniques to improve search relevance and ranking.
- Develop scalable data pipelines and APIs for indexing, retrieval, and model inference.
- Integrate ML models and search capabilities into production systems.
- Evaluate, fine-tune, and monitor search performance metrics.
- Collaborate with software engineers, data engineers, and product teams to translate business needs into technical implementations.
- Stay current with advancements in search technologies, LLMs, and semantic retrieval frameworks.
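As a minimal illustration of the embedding-based semantic search referenced above (in production this would sit behind Solr/Elasticsearch dense-vector fields rather than in-memory NumPy), here is a Python sketch; the model name and documents are hypothetical.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model

docs = [
    "Red running shoes for women, size 38",
    "Stainless steel kitchen knife set",
    "Waterproof hiking boots for men",
]
doc_vecs = model.encode(docs, normalize_embeddings=True)  # unit-length vectors

query_vec = model.encode(["trail footwear"], normalize_embeddings=True)[0]

# With normalized embeddings, cosine similarity reduces to a dot product.
scores = doc_vecs @ query_vec
for idx in np.argsort(-scores):
    print(f"{scores[idx]:.3f}  {docs[idx]}")
```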
Mandatory Skills Description:
- 5+ years of experience in Data Science or Machine Learning Engineering, with a focus on Information Retrieval or Semantic Search.
- Strong programming experience in both Java and Python (production-level code, not just prototyping).
- Deep knowledge of Lucene, Apache Solr, or Elasticsearch (indexing, query tuning, analyzers, scoring models).
- Experience with Vector Databases, Embeddings, and Semantic Search techniques.
- Strong understanding of NLP techniques (tokenization, embeddings, transformers, etc.).
- Experience deploying and maintaining ML/search systems in production.
- Solid understanding of software engineering best practices (CI/CD, testing, version control, code review).
Nice-to-Have Skills Description:
- Experience working in distributed teams with US customers
- Experience with LLMs, RAG pipelines, and vector retrieval frameworks.
- Knowledge of Spring Boot, FastAPI, or similar backend frameworks.
- Familiarity with Kubernetes, Docker, and cloud platforms (AWS/Azure/GCP).
- Experience with MLOps and model monitoring tools.
- Contributions to open-source search or ML projects.
Languages:
English: B2 Upper Intermediate
-
· 22 views · 0 applications · 29d
Data Scientist / Machine Learning Engineer - AI at Massive Scale
Hybrid Remote · Ukraine · Product · 3 years of experience · English - B1
Help us push AI further - and faster
LoopMe's Data Science team builds production AI that powers real-time decisions for campaigns seen by hundreds of millions of people every day. We process billions of data points daily, and we don't just re-apply old tricks. We design and deploy genuinely novel machine learning systems, from idea to prototype to production.
You'll join a high-trust team with a 5-star Glassdoor rating, led by Leonard Newnham, where your work moves fast, ships to production, and makes measurable impact.
What youβll do:
- Design, build, and run large-scale ML pipelines that process terabytes of data
- Apply a mix of supervised learning, custom algorithms, and statistical modelling to real-world problems
- Ship production-grade Python code thatβs clear, documented, and tested
- Work in small, agile squads (3β4 people) with DS, ML, and engineering peers
- Partner with product and engineering to take models from idea to production to impact
- Work with Google Cloud, Docker, Kafka, Spark, Airflow, ElasticSearch, ClickHouse and more
What you bring:
- Bachelor's degree in Computer Science, Maths, Engineering, Physics or similar (MSc/PhD a plus)
- 3+ years' commercial Python experience
- Track record building ML pipelines that handle large-scale data
- Excellent communication skills, comfortable working across time zones
- A curious, scientific mindset: you ask "why?" and prove the answer
Bonus if you have:
- Experience with adtech or real-time bidding
- Agile / Scrum experience
- Knowledge of high-availability infrastructure (ElasticSearch, Kafka, ClickHouse)
- Airflow expertise
About the Data Science Team:
We're 17 ML engineers, data scientists, and data engineers, distributed across London, Poland, and Ukraine, acting as one team, not a satellite office.
What sets us apart:
- Led by an experienced Chief Data Scientist who codes, leads, and listens
- Inclusive, supportive culture where ideas are heard and people stay
- Strong values: open communication, continual innovation, fair treatment, and high standards
- Track record of publishing award-winning research in automated bidding
Don't just take our word for it: check our Glassdoor reviews (search "Data Scientist") for a real view of the culture.
About LoopMe:
LoopMe was founded to close the loop on brand advertising. Our platform combines AI, mobile data, and attribution to deliver measurable brand outcomes, from purchase intent to foot traffic. Founded in 2012, we now have offices in New York, London, Chicago, LA, Dnipro, Singapore, Beijing, Dubai and more.
What we offer:
- Competitive salary + bonus
- Billions of real-world data points to work with daily
- Flexible remote/hybrid options
- Learning budget and career growth support
- Friendly, transparent culture with strong leadership
Hiring process:
- Intro with Talent Partner
- 30-min technical interview with Chief Data Scientist
- Panel with 2 team members (technical, culture & collaboration)
- Offer, usually within 48 hours of the final round
Are you ready to design and deploy AI systems that run at truly massive scale?
-
· 43 views · 9 applications · 28d
Data Scientist
Full Remote · Countries of Europe or Ukraine · 3 years of experience · English - B2
We are looking for an experienced Data Scientist to join our team full-time.
Requirements:
- 2+ years of experience as a data scientist
- BSc in Mathematics, Statistics, Computer Science, Economics, or another related field.
- Experience in using Python & SQL
- Experience with Airflow and GCP
- Experience with Git & CI/CD
- Upper-Intermediate English, written and spoken
- Ability to design creative solutions for complex requirements
- Ability to learn and lead projects independently, and to work with minimal supervision with customers (tech & business)
Responsibilities:
- Conduct independent research, including defining research problems, creating research plans, designing experiments, developing algorithms, implementing code, and performing comprehensive comparisons against existing benchmarks;
- Clearly communicate your research findings to both technical and non-technical audiences
- Work on various data sources and apply sophisticated feature engineering capabilities
- Bring and use business knowledge
- Build and manage technical relationships with customers and partners.
We offer:
- Full-time remote job, B2B contract
- 12 sick days and 18 paid vacation business days per year
- Comfortable work conditions (including MacBook Pro and Dell monitor on each workplace)
- Smart environment
- Interesting projects from renowned clients
- Flexible work schedule
- Competitive salary according to the qualifications
- Guaranteed full workload during the term of the contract
-
· 28 views · 5 applications · 28d
Middle (Middle+) Data Scientist / ML Engineer
Hybrid Remote · Worldwide · 3 years of experience · English - B2
We are looking for a full-time Middle Data Scientist / ML Engineer to join our team and take ownership of data-driven components in new products we are building from scratch.
One of the first phases involves creating a data model of an industrial facility (digital twin) based on historical sensor and operational data, and developing predictive algorithms on top of it. The solution will include a data pipeline, a lightweight system model, forecasting modules, and integration with a simple dashboard.
This is a role for a specialist who communicates clearly, understands data deeply, and can work independently.
Responsibilities
- Analyse real industrial datasets and define structure, schemas and preprocessing logic.
- Build data pipelines and prepare clean datasets for modelling.
- Develop interpretable ML/time-series forecasting models.
- Participate in designing a simplified system model (data-driven 'digital twin').
- Prepare models for inference and assist with integration into backend services.
- Collaborate closely with PM and other team members, clarify requirements, and communicate modelling decisions.
- Produce clear, concise documentation for data and models.
Requirements
- 3+ years of experience as a Data Scientist or ML Engineer.
- Strong Python skills (pandas, scikit-learn; PyTorch/TensorFlow).
- Experience with time-series modelling or forecasting.
- Solid understanding of data cleaning, feature engineering and evaluation.
- Ability to work independently and take ownership of tasks without supervision.
- Good communication skills and ability to work with evolving or partially defined requirements.
- Experience with Docker and backend integration is a plus.
Nice to Have
- Experience with industrial or IoT data
- Experience with FastAPI or simple dashboards
- Understanding of model deployment workflows
-
· 75 views · 5 applications · 27d
Middle Data Scientist (Operations Digital Twin)
Full Remote · Worldwide · Product · 2 years of experience · English - B2 · Ukrainian Product 🇺🇦
About us
Fozzy Group is one of the largest trade-industrial groups in Ukraine and one of the leading Ukrainian retailers, with over 700 outlets all around the country. It is also engaged in e-commerce, food processing & production, agricultural business, parcel delivery, logistics and banking.
Since its inception in 1997, Fozzy Group has focused on making innovative business improvements, creating new opportunities for the market and further developing the industry as a whole.
Job Description:
The Foodtech team is looking for a Data Scientist to develop the Operational Analytics function for a fast-growing food delivery business. In this role, you will focus on time series forecasting, regression modeling, simulation modeling, and end-to-end machine learning pipelines to support resource planning and operational decision-making.
You will be responsible for developing simulation-based models that serve as a foundation for a digital twin of operational processes, enabling scenario analysis, stress testing, and what-if simulations for capacity planning and operational optimization.
You will work closely with product, engineering, and operations teams to transform data into measurable business impact through production-ready ML and simulation solutions.
Job Responsibilities
- Develop and implement time series forecasting models for resource planning (demand, capacity, couriers, delivery slots, operational load);
- Build regression and machine learning models to explain key drivers and support operational decisions;
- Apply a wide range of time series approaches, from classical models (SARIMA, ETS, Prophet) and ML models (gradient boosting) to modern deep learning models (LSTM, temporal CNNs, Transformers for time series) - see the forecasting sketch after this list;
- Design, build, and maintain end-to-end automated ML pipelines, and deploy and operate models in production using AWS SageMaker;
- Orchestrate training and inference workflows with Apache Airflow;
- Analyze large-scale operational datasets and convert results into insights, forecasts, and actionable recommendations;
- Collaborate with product managers, engineers, and operations teams to define business problems and validate analytical solutions;
- Monitor model performance, forecast stability, and business impact over time.
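As a minimal illustration of the forecasting work above, the sketch below fits a seasonal ARIMA model to synthetic hourly order counts with statsmodels; the data, 24-hour seasonal period, and model orders are assumptions for demonstration only, not the team's actual setup.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Synthetic hourly order counts with a daily cycle, purely for illustration.
idx = pd.date_range("2024-01-01", periods=24 * 28, freq="h")
rng = np.random.default_rng(0)
orders = 100 + 30 * np.sin(2 * np.pi * idx.hour / 24) + rng.normal(0, 5, len(idx))
series = pd.Series(orders, index=idx)

# Seasonal ARIMA with a 24-hour season; orders would be chosen via validation in practice.
model = SARIMAX(series, order=(1, 0, 1), seasonal_order=(1, 1, 1, 24))
fitted = model.fit(disp=False)

forecast = fitted.forecast(steps=24)  # next day's expected load
print(forecast.head())
```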
Requirements
- Bachelor's Degree in Mathematics / Engineering / Computer Sciences / Quantitative Economics / Econometrics;
- Strong mathematical background in Linear Algebra, Probability, Statistics & Optimization Techniques;
- At least 2 years of working experience in Data Science;
- Experience with the full cycle of model implementation (data collection, model training and evaluation, model deployment and monitoring);
- Ability to work independently, proactively, and to decompose complex problems into actionable tasks.
Skills
Must Have
- Strong proficiency in Python with solid application of object-oriented programming (OOP) principles (modular design, reusable components, maintainable code);
- Solid experience in time series forecasting and regression modeling;
- Practical knowledge of:
  - Classical and ML forecasting techniques;
  - Statistical methods (hypothesis testing, confidence intervals, A/B testing);
- Advanced SQL skills (window functions, complex queries);
- Experience building automated ML pipelines;
- Understanding of MLOps principles (model versioning, monitoring, CI/CD for ML).
Preferred
- Hands-on experience with AWS SageMaker (training jobs, endpoints, model registry);
- Experience with Apache Airflow for data and ML workflow orchestration;
- Knowledge of Reporting and Business Intelligence software (Power BI, Tableau);
- Experience working with large-scale production data systems.
What We Offer
- Competitive salary;
- Professional & personal development opportunities;
- Being part of a dynamic team of young & ambitious professionals;
- Corporate discounts for sports clubs and language courses;
- Medical insurance package.
-
· 29 views · 4 applications · 26d
Data Scientist
Full Remote · Ukraine · 5 years of experience · English - B2
We are looking for you!
We are seeking a Senior Data Scientist to drive the next generation of data-driven solutions. This role calls for deep expertise in data architecture, advanced analytics, and pipeline design. If you are a seasoned professional ready to lead initiatives, innovate with cutting-edge techniques, and deliver impactful data solutions, weβd be excited to have you join our journey.
Contract type: Gig contract
Skills and experience you can bring to this role
Qualifications & experience:
- At least 3 years of commercial experience with Python, the data stack (NumPy, Pandas, scikit-learn), and the web stack (FastAPI / Flask / Django);
- Familiarity with one or more machine learning frameworks (XGBoost, TensorFlow, PyTorch);
- Strong mathematical and statistical skills;
- Good understanding of SQL/RDBMS and familiarity with data warehouses (BigQuery, Snowflake, Redshift, etc.);
- Experience building ETL data pipelines (Airflow, Prefect, Dagster, etc);
- Knowledge of Amazon Web Services (AWS) ecosystem (S3, Glue, Athena);
- Experience with at least one MMM or marketing analytics framework (e.g., Robyn, PyMC-Marketing, Meridian, or similar) - see the sketch after this list;
- Strong communication skills to explain technical insights to non-technical stakeholders.
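For illustration of the MMM work mentioned above, here is a minimal Python sketch of a regression-based media mix model with a geometric adstock transform; the channels, decay rates, and data are synthetic assumptions, not a production approach.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)

def adstock(spend: np.ndarray, decay: float) -> np.ndarray:
    """Geometric adstock: carry over a fraction of the previous period's effect."""
    out = np.zeros_like(spend, dtype=float)
    carry = 0.0
    for t, s in enumerate(spend):
        carry = s + decay * carry
        out[t] = carry
    return out

# Synthetic weekly spend for two hypothetical channels plus a baseline.
weeks = 104
tv = rng.gamma(2.0, 50.0, weeks)
search = rng.gamma(2.0, 30.0, weeks)
sales = 500 + 2.0 * adstock(tv, 0.6) + 3.5 * adstock(search, 0.3) + rng.normal(0, 40, weeks)

X = np.column_stack([adstock(tv, 0.6), adstock(search, 0.3)])
mmm = LinearRegression().fit(X, sales)
print("contribution per unit of adstocked spend:", mmm.coef_)
```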
Nice to have:
- Knowledge of digital advertising platforms (Google Ads, DV360, Meta, Amazon, etc.) and campaign performance metrics;
- Exposure to clean rooms (Google Ads Data Hub, Amazon Marketing Cloud);
- Familiarity with industry and syndicated data sources (Nielsen, Kantar etc);
- Experience with optimisation techniques (budget allocation, constrained optimisation);
- Familiarity with gen AI (ChatGPT APIs/agents, prompt engineering, RAG, vector databases).
Educational requirements:
- Bachelor's degree in Computer Science, Information Systems, or a related discipline is preferred. A Master's degree or higher is a distinct advantage.
What impact youβll make
- Build and validate marketing measurement models (e.g., MMM, attribution) to understand the impact of media spend on business outcomes;
- Develop and maintain data pipelines and transformations to prepare campaign, performance, and contextual data for modelling;
- Run exploratory analyses to uncover trends, correlations, and drivers of campaign performance;
- Support the design of budget optimisation and scenario planning tools;
- Collaborate with engineers, analysts, and planners to operationalise models into workflows and dashboards;
- Translate model outputs into clear, actionable recommendations for client and internal teams.
What youβll get
Regardless of your position or role, we have a wide array of benefits in place, including flexible working (hybrid/remote models) and generous time off policies (unlimited vacations, sick and parental leaves) to make it easier for all people to thrive and succeed at Star. On top of that, we offer an extensive reward and compensation package, intellectually and creatively stimulating space, health insurance and unique travel opportunities.
Your holistic well-being is central at Star. You'll join a warm and vibrant multinational environment filled with impactful projects, career development opportunities, mentorship and training programs, fun sports activities, workshops, networking and outdoor meet-ups.
-
· 27 views · 8 applications · 25d
Middle Machine Learning Engineer
Full Remote · Ukraine · Product · 3 years of experience · English - B2
Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.
About the Role:
We're looking for a mid-level AI Engineer to help test, deploy, and integrate cutting-edge generative AI models into production experiences centered around human avatars and 3D content. You'll work directly with the CEO to turn R&D prototypes into stable, scalable products.
Responsibilities:
- Experiment with and evaluate generative models for:
- Human avatar creation and animation;
- 3D reconstruction and modeling;
- Gaussian splatting-based pipelines;
- Generalized NeRF (Neural Radiance Fields) techniques.
- Turn research code and models into production-ready services (APIs, microservices, or batch pipelines).
- Build and maintain Python-based tooling for data preprocessing, training, evaluation, and inference.
- Design and optimize cloud-based deployment workflows (e.g., containers, GPUs, inference endpoints, job queues).
- Integrate models into user-facing applications in collaboration with product, design, and frontend teams.
- Monitor model performance, reliability, and cost; propose and implement improvements.
- Stay up-to-date on relevant research and help prioritize which techniques to test and adopt.
Required Qualifications:
- 3-5+ years of experience as an ML/AI Engineer or in a similar role.
- Strong Python skills and experience with one or more deep learning frameworks (PyTorch preferred).
- Hands-on experience with deploying ML models to cloud environments (AWS, GCP, Azure, or similar) including containers (Docker) and basic CI/CD workflows.
- Familiarity with 3D data formats and pipelines (meshes, point clouds, volumetric representations, etc.).
- Practical exposure to one or more of the following (professional or serious personal projects):
- NeRFs or NeRF-like methods;
- Gaussian splatting / 3D Gaussian fields;
- Avatar generation / face-body reconstruction / pose estimation;
- Comfort working in an iterative, fast-paced environment directly with leadership (reporting to CEO).
Nice-to-Haves:
- Experience with real-time rendering pipelines (e.g., Unity, Unreal, WebGL) or GPU programming (CUDA).
- Experience optimizing inference performance and cost (model distillation, quantization, batching)
- Background in computer vision, graphics, or related fields (academic or industry).