Jobs Data Science

95
  • 41 views · 5 applications · 15d

    Data Scientist – Autonomous Systems

    Hybrid Remote · Worldwide · Product · 3 years of experience · English - None · MilTech 🪖
    We are seeking a Data Scientist with a strong foundation in physics, control theory, and mathematical modeling to join our team working on cutting-edge autonomous systems. The ideal candidate combines analytical rigor with practical experience in modeling, simulation, and algorithm development for autonomous platforms.

    Levels: Middle and Senior (responsibilities and scope will be adjusted accordingly).

    Key Responsibilities

    • Develop and validate mathematical models for autonomous systems and dynamic environments.
    • Apply data-driven approaches for system identification, optimization, and predictive control.
    • Analyze large datasets from sensors and simulations to extract insights and improve system performance.
    • Design and implement algorithms for control, navigation, and decision-making.
    • Collaborate with cross-functional teams to integrate models into real-world autonomous platforms.

    Required Qualifications

    • 3+ years in R&D or applied data science/software development.
    • Strong background in mathematical modeling, system identification, and control theory.
    • Proficiency in Matlab/Simulink for modeling and simulation.
    • Experience in signal processing and data analysis.
    • Programming skills in Python and C++.
    • Ability to quickly research and apply recent trends in control theory, autonomous systems, and data-driven modeling.
    • Relevant work experience or education in a STEM field.

    Nice to Have

    • Knowledge of aerodynamics fundamentals.
    • Experience with Machine Learning (e.g., reinforcement learning, predictive modeling).
    • Familiarity with simulation tools such as Gazebo, AirSim.
    • Hands-on experience with SITL/HITL testing.
    • Exposure to flight control stacks like PX4, Betaflight, ArduPilot.
  • 55 views · 2 applications · 15d

    Junior Data Scientist – Autonomous Systems (Computer Vision)

    Office Work · Ukraine (Kyiv) · Product · 2 years of experience · English - None · MilTech 🪖
    We are looking for a Junior Data Scientist eager to grow in the field of autonomous systems, with a focus on computer vision, control theory, and data-driven modeling. This role is ideal for someone with strong analytical skills and a passion for applying data science to real-world autonomy challenges.

    Levels: Junior and Strong Junior (responsibilities and scope will be adjusted accordingly).

    Key Responsibilities

    • Assist in developing vision-based algorithms for perception and navigation.
    • Support data analysis and sensor fusion for multi-sensor systems.
    • Contribute to modeling and simulation tasks under guidance from senior engineers.
    • Work with datasets from cameras, IMUs, and other sensors to extract insights.
    • Stay up-to-date with recent research in computer vision, autonomy, and data science.

    Required Qualifications

    • 2+ years of experience in computer vision or data analysis.
    • Understanding of geometric computer vision principles.
    • Basic knowledge of control theory, PID controllers, signal processing, and data-driven modeling.
    • Programming skills in Python and C++.
    • Familiarity with Linux and single-board computers.
    • Strong willingness to learn and adapt quickly.
    • Relevant work experience or education in a STEM field.

    Nice to Have

    • Exposure to SLAM or Visual-Inertial Odometry (VIO).
    • Familiarity with OpenCV, NumPy, and basic ML frameworks (PyTorch, TensorFlow).
    • Knowledge of ROS2, Gazebo, or AirSim.
    • Experience with PX4, Betaflight, or ArduPilot.
    • Basic understanding of neural networks and CV frameworks.
    • Interest in reinforcement learning or predictive modeling.
  • 24 views · 2 applications · 17d

    Machine Learning Engineer

    Hybrid Remote · Ukraine · Product · 3 years of experience · English - None
    As a Machine Learning Engineer, you'll work as part of the Wix CTO Office team, researching problems that can give Wix's products a competitive edge across various challenges. A key focus of the team is on Agents over LLMs, exploring new techniques for building agents and developing innovative products that leverage them.

     

    In your day-to-day, you will:  

    • Build POCs for research projects led by the team  
    • Evaluate results and provide actionable insights  
    • Collaborate with different teams at Wix to advance their agent implementations  
    • Build shared infrastructure for agents  

       

    Requirements

    • Creativity and willingness to tackle ambitious, high-risk problems  
    • 3+ years of experience in working on production code with active users  
    • BSc in Computer Science or related field, MSc preferred  
    • Proficient in Python; TypeScript is a significant advantage  
    • Experience in training and evaluating Machine Learning models  
    • Hands-on experience with building GenAI systems using LLMs and agents  
    • Proven ability to work in a collaborative, cross-functional environment  
    • Excellent written and verbal communication skills in English 

     

    About the Team  

    We are Wix's Data Science CTO Office team, a small group of researchers and engineers. We collaborate with various groups at Wix and the CEO on innovative research projects. Some projects aim to enhance Wix products with new features, while others focus on strategic research areas that can provide Wix with a competitive advantage.

     

  • 40 views · 5 applications · 17d

    Machine Learning Researcher

    Full Remote · Ukraine · 5 years of experience · English - B2
    About the Project

    Our client is a leader in AI-powered performance marketing, operating across 25+ verticals with unmatched precision, speed, and scale. Their proprietary technology stack integrates seamlessly with major media platforms, enabling real-time event-level data exchange, optimization, and attribution.
    At the core of their operation is a deep commitment to AI-driven decision-making. From real-time bidding engines and predictive lead scoring to campaign automation and anomaly detection, their in-house AI models are central to how they scale campaigns, reduce inefficiencies, and outperform market benchmarks.
    They've built and continue to evolve a robust internal platform to empower media buyers, analysts, and operators with real-time alerts, smart recommendations, and semi-autonomous optimization tools.

     

    Role Summary:

    We're looking for an experienced ML researcher to own the full lifecycle of machine learning projects, from problem formulation and research through production deployment and monitoring. You will design, build, and deploy ML models, mainly on tabular data, with full ownership over their production performance and business impact.

     

    Required qualifications:
    ● Strong hands-on experience with ML models for tabular data and a deep understanding of the underlying methodologies
    ● Hands-on experience with end-to-end project ownership from research to production
    ● Proven ability to extract predictive signal from complex, messy real-world data at scale
    ● Experience training models on Big Data and optimizing for inference latency
    ● Experience with ML cloud-based platforms and MLOps tools and practices (experiment tracking, model versioning, deployment pipelines)
    ● Strong proven Python skills and familiarity with ML packages for tabular data processing (scikit-learn, PyTorch, pandas, polars etc.)
    ● Solid understanding of experimental design, causality and model validation
    ● Experience working closely with data engineering pipelines

     

    Preferred Qualifications:
    ● BA in statistics, ML, computer science or related fields
    ● Experience with causal inference methods, uplift modeling, A/B testing
    ● Familiarity with modern LLM APIs (OpenAI, Anthropic, Google)
    ● Experience packaging models, building inference endpoints, and optimizing latency
    ● Exposure to drift detection, data quality checks, and performance monitoring
    ● Experience with containerization (Docker) and serving frameworks (FastAPI, Flask, TorchServe, BentoML, etc.)

  • 66 views · 2 applications · 13d

    Data Scientist

    Office Work · Ukraine (Kyiv) · Product · 1.5 years of experience · English - None
    We are looking for the first Data Science engineers to join our team and play a foundational role in building a private Large Language Model from the ground up. This is a rare opportunity to shape the technical direction, influence architectural decisions, and establish best practices for intelligent systems within the organization.

    What you will do: 

    • Design, build, and maintain classical machine learning models to support core product needs.
    • Develop agent-based systems (LLM agents, multi-step agents, agent-to-agent workflows) as part of the broader LLM ecosystem.
    • Conduct rigorous experiments and iterate quickly to validate approaches and improve system performance.
    • Build and optimize inference pipelines, including performance tuning, monitoring, and alerting.
    • Collaborate closely with Data Engineering to deploy models securely and reliably into production environments.
    • Contribute to defining standards, workflows, and tooling for the new Data Science function.

       

    What we expect:

    • 3+ years of hands-on experience with Python.
    • Strong SQL skills and understanding of data workflows.
    • Solid grasp of machine learning methods and their practical applications.
    • Experience building agents or agent-based systems is a strong plus.
    • Initiative, ownership, and readiness to work in a greenfield environment where many solutions must be defined from scratch.

     

  • 64 views · 4 applications · 2d

    IT/Data management specialist to $500

    Full Remote · Countries of Europe or Ukraine · Product · 2 years of experience · English - B2
    About Keymakr

    • Keymakr specializes in end-to-end dataset preparation, including video and image annotation and labeling for AI projects.
      We take a personalized approach, delivering customized solutions tailored to each client's needs.

       

    • Our strong in-house R&D team develops and adapts annotation tools for every client, allowing us to successfully handle even the most complex requirements.

     

    About the Role:

    • We are looking for a Data Management Specialist to manage, process, and maintain large volumes of data and media assets used in AI and technology projects.

       

    • This role focuses on ensuring data integrity, consistency, and availability while optimizing workflows through automation and close collaboration with cross-functional teams.

       

    Key Responsibilities:

    1. Manage and manipulate data using Linux file systems and native OS tools;
    2. Work with media files, including video formats, codecs, resolution, and frame rate (FPS);
    3. Verify data integrity using checksum and verification tools;
    4. Pack and unpack large data archives efficiently;
    5. Automate data-related workflows using scripting languages;
    6. Prepare, structure, and maintain datasets in structured formats;
    7. Maintain cloud-based data storage and perform data migration;
    8. Collaborate with Project, QA, and Technical teams to ensure smooth data operations;
    9. Clearly and consistently document workflows, processes, and task updates.

     

    Technical Skills:

    1. Strong knowledge of Linux file systems and administration (Ubuntu/Debian, 2+ years);
    2. Experience with Windows OS administration;
    3. Proficiency in Python and Bash / Shell scripting;
    4. Experience with archive management tools for large datasets;
    5. Knowledge of checksum verification tools (md5sum, sha256sum);
    6. Familiarity with tools for checking video properties, codecs, and formats;
    7. Solid knowledge of JSON and CSV formats;
    8. Experience with cloud storage platforms: AWS S3, Google Cloud Storage, or Azure;
    9. Familiarity with JIRA, Slack, Confluence, and Google Docs.

     

    Nice to Have:

    1. Experience with point clouds / 3D data;
    2. Strong understanding of complex technical specifications;
    3. Experience working with large-scale datasets.

    English & Communication:

    1. English level B2 (Upper-Intermediate) or higher;
    2. Ability to read and understand technical documentation and project requirements;
    3. Clear written communication in English (reports, updates, documentation);
    4. Participation in English-language meetings when required.

       

  • 32 views · 4 applications · 6d

    Middle Data Scientist (NLP)

    Full Remote · Ukraine · Product · 3 years of experience · English - B2
    We are seeking an experienced Data Scientist with a passion for large language models (LLMs) and cutting-edge AI research. In this role, you will design and prototype data preparation pipelines, collaborating closely with data engineers to turn your prototypes into scalable production pipelines; design and implement a state-of-the-art evaluation and benchmarking framework to measure and guide model quality; and carry out end-to-end LLM training. You will work alongside top AI researchers and engineers, ensuring our models are not only powerful but also aligned with user needs, cultural context, and ethical standards.

    What you will do

    • Design, prototype, and validate data preparation and transformation steps for LLM training datasets, including cleaning and normalization of text, filtering of toxic content, de-duplication, de-noising, detection and deletion of personal data, etc.
    • Form specific SFT/RLHF datasets from existing data, including data augmentation/labeling with LLM as teacher.
    • Develop heuristics, filtering rules, adversarial examples, and synthetic data generation methods to maximize model robustness and data effectiveness.
    • Research and develop best practices and novel techniques in LLM training pipelines.
    • Develop metrics, tools and processes for continuous evaluation during model pre-training, fine-tuning, and deployment.
    • Analyze benchmarking datasets, identify gaps, and design, implement, and maintain a comprehensive benchmarking framework for the Ukrainian language.
    • Monitor and analyze the impact of data quality and benchmark results on model performance, identifying strengths, weaknesses, and improvement opportunities.
    • Collaborate closely with data engineers, annotators, linguists, and domain experts to scale data processes, define evaluation tasks and collect high-quality feedback.
    • Document methodologies, experimental findings, and best practices, and share insights across internal teams to ensure alignment of training and evaluation workflows.

    Qualifications and experience needed

    Education & Experience:

    • 3+ years of experience in Data Science or Machine Learning, preferably with a focus on NLP.
    • Proven experience in end-to-end ML, NLP model development, including data preparation and evaluation.
    • An advanced degree (Master's or PhD) in Computer Science, Computational Linguistics, Machine Learning, or a related field is highly preferred.

    NLP Expertise:

    • Good knowledge of natural language processing techniques and algorithms.
    • Hands-on experience with modern NLP approaches, including embedding models, semantic search, text classification, sequence tagging (NER), transformers/LLMs, RAGs.
    • Familiarity with LLM training and fine-tuning techniques.

    ML & Programming Skills:

    • Proficiency in Python and common data science and NLP libraries (pandas, NumPy, scikit-learn, spaCy, NLTK, langdetect, fasttext).
    • Strong experience with deep learning frameworks such as PyTorch or TensorFlow for building NLP models.
    • Ability to write efficient, clean code and debug complex model issues.

    Data & Analytics:

    • Solid understanding of data analytics and statistics.
    • Experience in experimental design, A/B testing, and statistical hypothesis testing to evaluate model performance.
    • Comfortable working with large datasets, writing complex SQL queries, and using data visualization to inform decisions.

    Deployment & Tools:

    • Experience deploying machine learning models in production (e.g., using REST APIs or batch pipelines) and integrating with real-world applications.
    • Familiarity with MLOps concepts and tools (version control for models/data, CI/CD for ML).
    • Experience with cloud platforms (AWS, GCP, or Azure) and big data technologies (Spark, Hadoop, Ray, Dask) for scaling data processing or model training is a plus.

    Communication:

    • Experience working in a collaborative, cross-functional environment.
    • Strong communication skills to convey complex ML results to non-technical stakeholders and to document methodologies clearly.
    • Ability to rapidly prototype and iterate on ideas

    A plus would be

    Advanced NLP/ML Techniques:

    • Solid understanding of RLHF concepts and related techniques (preference modeling, reward modeling, reinforcement learning).
    • Prior work on LLM safety, fairness, and bias mitigation.
    • Familiarity with evaluation metrics for language models (perplexity, BLEU, ROUGE, etc.) and with techniques for model optimization (quantization, knowledge distillation) to improve efficiency.
    • Knowledge of data annotation workflows and human feedback collection methods.

    Research & Community:

    • Publications in NLP/ML conferences or contributions to open-source NLP projects.
    • Active participation in the AI community or demonstrated continuous learning (e.g., Kaggle competitions, research collaborations) indicates a passion for staying at the forefront of the field.

    Domain & Language Knowledge:

    • Familiarity with the Ukrainian language and context.
    • Understanding of cultural and linguistic nuances that could inform model training and evaluation in a Ukrainian context.
    • Knowledge of Ukrainian text sources and data sets, or experience with multilingual data processing, can be an advantage given our project's focus.
    • Knowledge of Ukrainian benchmarks, or familiarity with other evaluation datasets and leaderboards for large models, can be an advantage given our project's focus.

    MLOps & Infrastructure:

    • Hands-on experience with containerization (Docker) and orchestration (Kubernetes) for ML, as well as ML workflow tools (MLflow, Airflow).
    • Experience in working alongside MLOps engineers to streamline the deployment and monitoring of NLP models.

    Problem-Solving:

    • Innovative mindset with the ability to approach open-ended AI problems creatively.
    • Comfort in a fast-paced R&D environment where you can adapt to new challenges, propose solutions, and drive them to implementation.

    What we offer

    • Office or remote: it's up to you. You can work from anywhere, and we will arrange your workplace
    • Remote onboarding
    • Performance bonuses
    • We train employees with the opportunity to learn through the company's library, internal resources, and programs from partners
    • Health and life insurance
    • Wellbeing program and corporate psychologist
    • Reimbursement of expenses for Kyivstar mobile communication
  • 45 views · 8 applications · 9d

    Machine Learning Engineer

    Hybrid Remote · Argentina, Brazil, Poland, Romania, Ukraine · 5 years of experience · English - None
    About the role

    As an ML Engineer, you'll be responsible for building and operationalizing ML models, integrating them with existing data systems, and ensuring they perform reliably in production. You will collaborate with stakeholders to understand use cases, translate them into machine learning workflows, and design scalable pipelines that handle large volumes of data efficiently. By combining strong engineering skills with an understanding of applied ML, you'll ensure that predictive analytics, forecasting, and optimization models directly support the client's real estate, accounting, and investment reporting needs.
     

    About the client:

    Our client is a leading, $35B global commercial real estate services and investment firm. To accelerate its digital transformation, the organization is adopting Palantir Foundry as a core data and application platform. The focus is on building scalable solutions that showcase Foundry's value, drive adoption, and create a pipeline of business-funded initiatives. These efforts aim to enable efficient data transformation and client-specific reporting at enterprise scale. Together, they will strengthen reporting capabilities, improve efficiency, and establish a foundation for data-driven applications across global operations.

    ‍

    Who are we looking for?

    Skills & Experience

    • Master's or PhD in Data Science, Computer Science, or a technology-focused field.
    • 5+ years of hands-on experience designing and deploying AI or ML models.
    • Hands-on experience setting up and maintaining large-scale data science and machine learning projects.
    • Skilled with deep learning libraries like PyTorch, Keras, TensorFlow, and HuggingFace toolkits.
    • Solid knowledge of machine learning, especially Natural Language Processing (NLP) and large language models (LLMs).
    • Advanced Python coding and strong experience with tools like scikit-learn, NumPy, SciPy, pandas, and XGBoost.
    • Comfortable using SQL for working with large datasets and uncovering insights.
    • Experience creating solutions in cloud platforms such as AWS SageMaker or Microsoft Azure.
    • Able to work on your own or as part of a group to hit targets.

    Nice to have

    • Focused on getting results for clients and working in an agile development setup.
    • Curiosity, attention to detail, and drive to solve difficult data problems.
    • Can design or review how a model's success is measured to line up with business goals.
    • Experience with LLMs in Palantir Foundry.
    • Shows initiative by researching new solutions with some guidance.

    Responsibilities

    • Analyze data to find patterns and create machine learning solutions for challenging business issues 
    • Understand the AI/ML program journey to formulate relevant high-impact business questions that can be answered through data analysis.
    • Develop AI/ML solutions that can be scaled across various business use cases, starting from PoC to MVP and launching into production.
    • Build and improve AI models - like those for prediction, automation, or natural language tasks.
    • Use both in-house tools and the latest technology to increase operational productivity and efficiency as well as predictive analytics.
    • Break down and share your process and results in a way that's clear to non-technical folks, like business managers and executives.
    • Keep up with and put into practice the latest AI and machine learning techniques.

    ‍

    What we offer

    Work:

    • Flexible working hours;
    • Collaborative, friendly team environment;
    • Remote/Hybrid work;

    Life:

    • Company social events;
    • Annual corporate parties;

    Health:

    • Comprehensive medical insurance;

    Education:

    • Allowances for professional education;
    • English language courses with native speakers;
    • Internal knowledge-sharing sessions.
  • 22 views · 2 applications · 3d

    Senior Data Scientist

    Full Remote · Ukraine · 5 years of experience · English - None
    About the company

    Jappware is a software development company that delivers innovative and reliable digital solutions for international clients.
    We specialize in end-to-end product development, from ideation and design to architecture, development, and DevOps support.

    About the project
    We are looking for a Senior Data Scientist to join our growing team in Lviv or remotely.
    We're building a brand-new Real Estate platform with an AI-powered Lead Generation Pipeline at its core.
    This is a hands-on role combining Data Science, Analytics, and Data Engineering, perfect for someone who wants to build from scratch and influence product direction.

    Responsibilities
    ● Build the end-to-end Lead Generation Pipeline for Real Estate
    ● Create and manage structured property feature sets
    ● Run EDA, modeling, and hypothesis testing
    ● Design and maintain ETL/ELT pipelines
    ● Work with feature stores (Parquet, etc.)
    ● Collaborate with engineering to integrate models into production
    ● Shape data architecture and validate product ideas through prototypes

    Requirements
    ● 5+ years in Data Science
    ● Strong Python & SQL knowledge
    ● Proven ability to build analytical and data pipelines from scratch
    ● Hands-on, autonomous, proactive mindset
    ● Strong communication and analytical thinking skills

    What we are offering
    ● Challenging and innovative environments.
    ● Flexible schedule and remote-friendly culture.
    ● 20 paid vacation days and 15 sick leave days.
    ● Quarterly budget for learning & development activities.
    ● Team events, workshops, and internal tech meetups.
    ● IT Club membership.

    Steps to Expect in Jappware's Hiring Process:
    ● Intro Interview
    ● Technical Interview
    ● Offer

    Our Mission:

    To build innovative software in trustworthy partnerships.
    We aim to become a reliable and forward-thinking technology partner, helping businesses grow
    through innovation and mutual trust.

    Our Values

    Trust: Every successful partnership is built on openness, honesty, and sincerity.
    Openness: We encourage people to share ideas freely and foster transparent communication.
    Partnership: We treat our clients' and teammates' goals as our own.
    Proactiveness: We act ahead of possible outcomes and anticipate challenges to deliver the best results.

    Social Responsibility
    At Jappware, we stand with our people and our country.
    We proudly support Ukraine's resilience, innovation, and global contribution to the IT community.
    Through donations, volunteering, and social initiatives, we help strengthen our local communities and the nation's future.

    Jappware stands with Ukraine. Glory to Ukraine!

    Follow us via LinkedIn, DOU, Instagram, Facebook

     

  • 16 views · 5 applications · 7d

    Senior Machine Learning Engineer

    Hybrid Remote · Worldwide · Product · 4 years of experience · English - B2

    Senior Machine Learning Engineer

    Position Title: Senior Machine Learning Engineer 

    Reports To: Project Management Team

    Direct Reports: None

    Location: Porto, Portugal

     

    Job Description

     

    We are looking for a Senior Machine Learning Engineer to join our Data Science team. This is a senior role for an experienced professional with a proven record of building, deploying, and maintaining scalable ML systems in production environments. You will lead the ML infrastructure end-to-end, from model training and deployment to automation and monitoring, ensuring reliability, efficiency, and business impact. Your work will enable data-driven decisions at scale, guiding our teams toward smarter, faster, and more measurable outcomes.

     

    Responsibilities

     

    • Apply your engineering skills and in-depth knowledge to run applied statistics, ML infrastructure, model deployment, and production system design, with a focus on delivering inference from structured, tabular data.
    • Build scalable ML pipeline automation, establish MLOps best practices, and mentor the development team on ML system architecture.
    • Be an excellent communicator, capable of presenting outcomes and caveats of technical solutions to non-technical teams.
    • Mentor engineers and establish technical best practices from scratch.
    • Share knowledge to expand the overall ML engineering capabilities of our organization.
    • Maintain clear and comprehensive documentation of the work done, and keep all the critical information organized and easy to digest for both data and project team members.
    • Demonstrate commitment to staying current with the latest MLOps tools, infrastructure patterns, and production ML best practices.

       

    Required Qualifications

     

    • Bachelor's in Computer Science, Data Science, or related field.
    • Minimum of 5 years of related experience with a Bachelor's degree, or 3 years with a Master's degree.
    • Experience working with large-scale, structured datasets.
    • Proven experience leading technical initiatives and defining ML infrastructure standards.
    • Extensive experience with ML infrastructure projects involving model serving, ML pipeline automation, monitoring, and MLOps tooling.
    • Excellent understanding of software engineering principles, system design, and ML model optimization for production environments.
    • High proficiency with Python programming language and software engineering best practices
    • High proficiency with Python libraries used to implement applied statistics (numpy, pandas, matplotlib, statsmodels, scikit-learn)
    • High proficiency with SQL and experience with cloud-based data warehouses (BigQuery preferred) and data pipeline technologies (dbt preferred)
    • Strong understanding of cloud infrastructure, containerization (Docker/Kubernetes), and distributed systems
    • Excellent written and verbal communication skills with the ability to educate and influence technical teams
    • Fluent English (spoken and written).

       

    Nice to Have

    • Experience working in the online advertising industry
    • Knowledge of the film industry and its unique marketing and audience challenges
    • Experience with Ruby on Rails full-stack development  

     

    About Gruvi 

    Gruvi is a data-driven media and insights agency dedicated to the film industry. We combine creativity, data, and proprietary technology to deliver impactful campaigns for film distributors and exhibitors worldwide. With an international presence and a team of media and advertising experts, we combine advertising campaigns and proprietary data to push the boundaries of digital media and help our clients drive meaningful results. We are passionate about film and committed to using cutting-edge insights to ensure great films find their audience.

     

     

  • · 15 views · 2 applications · 1d

    Senior/Middle Data Scientist (Benchmarking/Alignment)

    Hybrid Remote · Ukraine · Product · 3 years of experience · English - None

    Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.

    About the client:
    Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.

    About the role:
    We are looking for an experienced Senior/Middle Data Scientist with a passion for Large Language Models (LLMs) and cutting-edge AI research. In this role, you will design and implement a state-of-the-art evaluation and benchmarking framework to measure and guide model quality, and personally train LLMs with a strong focus on Reinforcement Learning from Human Feedback (RLHF). You will work alongside top AI researchers and engineers, ensuring the models are not only powerful but also aligned with user needs, cultural context, and ethical standards.

    Requirements:
    Education & Experience:
    - 3+ years of experience in Data Science or Machine Learning, preferably with a focus on NLP.
    - Proven experience in machine learning model evaluation and/or NLP benchmarking.
    - Advanced degree (Master's or PhD) in Computer Science, Computational Linguistics, Machine Learning, or a related field is highly preferred.
    NLP Expertise:
    - Good knowledge of natural language processing techniques and algorithms.
    - Hands-on experience with modern NLP approaches, including embedding models, semantic search, text classification, sequence tagging (NER), transformers/LLMs, RAGs.
    - Familiarity with LLM training and fine-tuning techniques.
    ML & Programming Skills:
    - Proficiency in Python and common data science and NLP libraries (pandas, NumPy, scikit-learn, spaCy, NLTK, langdetect, fasttext).
    - Strong experience with deep learning frameworks such as PyTorch or TensorFlow for building NLP models.
    - Solid understanding of RLHF concepts and related techniques (preference modeling, reward modeling, reinforcement learning).
    - Ability to write efficient, clean code and debug complex model issues.
    Data & Analytics:
    - Solid understanding of data analytics and statistics.
    - Experience creating and managing test datasets, including annotation and labeling processes.
    - Experience in experimental design, A/B testing, and statistical hypothesis testing to evaluate model performance.
    - Comfortable working with large datasets, writing complex SQL queries, and using data visualization to inform decisions.
    Deployment & Tools:
    - Experience deploying machine learning models in production (e.g., using REST APIs or batch pipelines) and integrating with real-world applications.
    - Familiarity with MLOps concepts and tools (version control for models/data, CI/CD for ML).
    - Experience with cloud platforms (AWS, GCP, or Azure) and big data technologies (Spark, Hadoop, Ray, Dask) for scaling data processing or model training.
    Communication:
    - Experience working in a collaborative, cross-functional environment.
    - Strong communication skills to convey complex ML results to non-technical stakeholders and to document methodologies clearly.

    Nice to have:
    Advanced NLP/ML Techniques:
    - Prior work on LLM safety, fairness, and bias mitigation.
    - Familiarity with evaluation metrics for language models (perplexity, BLEU, ROUGE, etc.) and with techniques for model optimization (quantization, knowledge distillation) to improve efficiency.
    - Knowledge of data annotation workflows and human feedback collection methods.
    Research & Community:
    - Publications in NLP/ML conferences or contributions to open-source NLP projects.
    - Active participation in the AI community or demonstrated continuous learning (e.g., Kaggle competitions, research collaborations) indicating a passion for staying at the forefront of the field.
    Domain & Language Knowledge:
    - Familiarity with the Ukrainian language and context.
    - Understanding of cultural and linguistic nuances that could inform model training and evaluation in a Ukrainian context.
    - Knowledge of Ukrainian benchmarks, or familiarity with other evaluation datasets and leaderboards for large models, can be an advantage given the project's focus.
    MLOps & Infrastructure:
    - Hands-on experience with containerization (Docker) and orchestration (Kubernetes) for ML, as well as ML workflow tools (MLflow, Airflow).
    - Experience in working alongside MLOps engineers to streamline the deployment and monitoring of NLP models.
    Problem-Solving:
    - Innovative mindset with the ability to approach open-ended AI problems creatively.
    - Comfort in a fast-paced R&D environment where you can adapt to new challenges, propose solutions, and drive them to implementation.

    Responsibilities:
    - Analyze benchmarking datasets, define gaps, and design, implement, and maintain a comprehensive benchmarking framework for the Ukrainian language.
    - Research and integrate state-of-the-art evaluation metrics for factual accuracy, reasoning, language fluency, safety, and alignment.
    - Design and maintain testing frameworks to detect hallucinations, biases, and other failure modes in LLM outputs.
    - Develop pipelines for synthetic data generation and adversarial example creation to challenge the model's robustness.
    - Collaborate with human annotators, linguists, and domain experts to define evaluation tasks and collect high-quality feedback.
    - Develop tools and processes for continuous evaluation during model pre-training, fine-tuning, and deployment.
    - Research and develop best practices and novel techniques in LLM training pipelines.
    - Analyze benchmarking results to identify model strengths, weaknesses, and improvement opportunities.
    - Work closely with other data scientists to align training and evaluation pipelines.
    - Document methodologies and share insights with internal teams.

    The company offers:
    - Competitive salary.
    - Equity options in a fast-growing AI company.
    - Remote-friendly work culture.
    - Opportunity to shape a product at the intersection of AI and human productivity.
    - Work with a passionate, senior team building cutting-edge tech for real-world business use.

  • · 14 views · 0 applications · 1d

    Senior/Middle Data Scientist (Data Preparation/Pre-training)

    Hybrid Remote · Ukraine · Product · 3 years of experience · English - None


    Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.

    About the client:
    Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.

    About the role:
    We are looking for an experienced Senior/Middle Data Scientist with a passion for Large Language Models (LLMs) and cutting-edge AI research. In this role, you will focus on designing and prototyping data preparation pipelines, collaborating closely with data engineers to transform your prototypes into scalable production pipelines, and actively developing model training pipelines with other talented data scientists. Your work will directly shape the quality and capabilities of the models by ensuring we feed them the highest-quality, most relevant data possible.

    Requirements:
    Education & Experience:
    - 3+ years of experience in Data Science or Machine Learning, preferably with a focus on NLP.
    - Proven experience in data preprocessing, cleaning, and feature engineering for large-scale datasets of unstructured data (text, code, documents, etc.).
    - Advanced degree (Master's or PhD) in Computer Science, Computational Linguistics, Machine Learning, or a related field is highly preferred.
    NLP Expertise:
    - Good knowledge of natural language processing techniques and algorithms.
    - Hands-on experience with modern NLP approaches, including embedding models, semantic search, text classification, sequence tagging (NER), transformers/LLMs, RAGs.
    - Familiarity with LLM training and fine-tuning techniques.
    ML & Programming Skills:
    - Proficiency in Python and common data science and NLP libraries (pandas, NumPy, scikit-learn, spaCy, NLTK, langdetect, fasttext).
    - Strong experience with deep learning frameworks such as PyTorch or TensorFlow for building NLP models.
    - Ability to write efficient, clean code and debug complex model issues.
    Data & Analytics:
    - Solid understanding of data analytics and statistics.
    - Experience in experimental design, A/B testing, and statistical hypothesis testing to evaluate model performance.
    - Comfortable working with large datasets, writing complex SQL queries, and using data visualization to inform decisions.
    Deployment & Tools:
    - Experience deploying machine learning models in production (e.g., using REST APIs or batch pipelines) and integrating with real-world applications.
    - Familiarity with MLOps concepts and tools (version control for models/data, CI/CD for ML).
    - Experience with cloud platforms (AWS, GCP, or Azure) and big data technologies (Spark, Hadoop, Ray, Dask) for scaling data processing or model training.
    Communication & Personality:
    - Experience working in a collaborative, cross-functional environment.
    - Strong communication skills to convey complex ML results to non-technical stakeholders and to document methodologies clearly.
    - Ability to rapidly prototype and iterate on ideas

    Nice to have:
    Advanced NLP/ML Techniques:
    - Familiarity with evaluation metrics for language models (perplexity, BLEU, ROUGE, etc.) and with techniques for model optimization (quantization, knowledge distillation) to improve efficiency.
    - Understanding of the FineWeb2 approach or similar data-processing pipelines.
    Research & Community:
    - Publications in NLP/ML conferences or contributions to open-source NLP projects.
    - Active participation in the AI community or demonstrated continuous learning (e.g., Kaggle competitions, research collaborations) indicating a passion for staying at the forefront of the field.
    Domain & Language Knowledge:
    - Familiarity with the Ukrainian language and context.
    - Understanding of cultural and linguistic nuances that could inform model training and evaluation in a Ukrainian context.
    - Knowledge of Ukrainian text sources and data sets, or experience with multilingual data processing, can be an advantage given the project's focus.
    MLOps & Infrastructure:
    - Hands-on experience with containerization (Docker) and orchestration (Kubernetes) for ML, as well as ML workflow tools (MLflow, Airflow).
    - Experience in working alongside MLOps engineers to streamline the deployment and monitoring of NLP models.
    Problem-Solving:
    - Innovative mindset with the ability to approach open-ended AI problems creatively.
    - Comfort in a fast-paced R&D environment where you can adapt to new challenges, propose solutions, and drive them to implementation.

    Responsibilities:
    - Design, prototype, and validate data preparation and transformation steps for LLM training datasets, including cleaning and normalization of text, filtering of toxic content, de-duplication, de-noising, detection and deletion of personal data, etc.
    - Build task-specific SFT/RLHF datasets from existing data, including data augmentation and labeling with an LLM as teacher.
    - Analyze large-scale raw text, code, and multimodal data sources for quality, coverage, and relevance.
    - Develop heuristics, filtering rules, and cleaning techniques to maximize training data effectiveness.
    - Collaborate with data engineers to hand over prototypes for automation and scaling.
    - Research and develop best practices and novel techniques in LLM training pipelines.
    - Monitor and evaluate data quality impact on model performance through experiments and benchmarks.
    - Research and implement best practices in large-scale dataset creation for AI/ML models.
    - Document methodologies and share insights with internal teams.

    The company offers:
    - Competitive salary.
    - Equity options in a fast-growing AI company.
    - Remote-friendly work culture.
    - Opportunity to shape a product at the intersection of AI and human productivity.
    - Work with a passionate, senior team building cutting-edge tech for real-world business use.

  • · 60 views · 3 applications · 17d

    Senior/Middle Data Scientist

    Full Remote · Ukraine · Product · 3 years of experience · English - B1

    About us:
    Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.

    About the client:
    Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.

    About the role:
    We are looking for an experienced Senior/Middle Data Scientist with a passion for Large Language Models (LLMs) and cutting-edge AI research. In this role, you will focus on designing and prototyping data preparation pipelines, collaborating closely with data engineers to transform your prototypes into scalable production pipelines, and actively developing model training pipelines with other talented data scientists. Your work will directly shape the quality and capabilities of the models by ensuring we feed them the highest-quality, most relevant data possible.

    Requirements:
    Education & Experience:
    - 3+ years of experience in Data Science or Machine Learning, preferably with a focus on NLP.
    - Proven experience in data preprocessing, cleaning, and feature engineering for large-scale datasets of unstructured data (text, code, documents, etc.).
    - Advanced degree (Master's or PhD) in Computer Science, Computational Linguistics, Machine Learning, or a related field is highly preferred.
    NLP Expertise:
    - Good knowledge of natural language processing techniques and algorithms.
    - Hands-on experience with modern NLP approaches, including embedding models, semantic search, text classification, sequence tagging (NER), transformers/LLMs, RAGs.
    - Familiarity with LLM training and fine-tuning techniques.
    ML & Programming Skills:
    - Proficiency in Python and common data science and NLP libraries (pandas, NumPy, scikit-learn, spaCy, NLTK, langdetect, fasttext).
    - Strong experience with deep learning frameworks such as PyTorch or TensorFlow for building NLP models.
    - Ability to write efficient, clean code and debug complex model issues.
    Data & Analytics:
    - Solid understanding of data analytics and statistics.
    - Experience in experimental design, A/B testing, and statistical hypothesis testing to evaluate model performance.
    - Comfortable working with large datasets, writing complex SQL queries, and using data visualization to inform decisions.
    Deployment & Tools:
    - Experience deploying machine learning models in production (e.g., using REST APIs or batch pipelines) and integrating with real-world applications.
    - Familiarity with MLOps concepts and tools (version control for models/data, CI/CD for ML).
    - Experience with cloud platforms (AWS, GCP, or Azure) and big data technologies (Spark, Hadoop, Ray, Dask) for scaling data processing or model training.
    Communication & Personality:
    - Experience working in a collaborative, cross-functional environment.
    - Strong communication skills to convey complex ML results to non-technical stakeholders and to document methodologies clearly.
    - Ability to rapidly prototype and iterate on ideas

    Nice to have:
    Advanced NLP/ML Techniques:
    - Familiarity with evaluation metrics for language models (perplexity, BLEU, ROUGE, etc.) and with techniques for model optimization (quantization, knowledge distillation) to improve efficiency.
    - Understanding of the FineWeb2 approach or similar data-processing pipelines.
    Research & Community:
    - Publications in NLP/ML conferences or contributions to open-source NLP projects.
    - Active participation in the AI community or demonstrated continuous learning (e.g., Kaggle competitions, research collaborations) indicating a passion for staying at the forefront of the field.
    Domain & Language Knowledge:
    - Familiarity with the Ukrainian language and context.
    - Understanding of cultural and linguistic nuances that could inform model training and evaluation in a Ukrainian context.
    - Knowledge of Ukrainian text sources and data sets, or experience with multilingual data processing, can be an advantage given the project's focus.
    MLOps & Infrastructure:
    - Hands-on experience with containerization (Docker) and orchestration (Kubernetes) for ML, as well as ML workflow tools (MLflow, Airflow).
    - Experience in working alongside MLOps engineers to streamline the deployment and monitoring of NLP models.
    Problem-Solving:
    - Innovative mindset with the ability to approach open-ended AI problems creatively.
    - Comfort in a fast-paced R&D environment where you can adapt to new challenges, propose solutions, and drive them to implementation.

    Responsibilities:
    - Design, prototype, and validate data preparation and transformation steps for LLM training datasets, including cleaning and normalization of text, filtering of toxic content, de-duplication, de-noising, detection and deletion of personal data, etc.
    - Build task-specific SFT/RLHF datasets from existing data, including data augmentation and labeling with an LLM as teacher.
    - Analyze large-scale raw text, code, and multimodal data sources for quality, coverage, and relevance.
    - Develop heuristics, filtering rules, and cleaning techniques to maximize training data effectiveness.
    - Collaborate with data engineers to hand over prototypes for automation and scaling.
    - Research and develop best practices and novel techniques in LLM training pipelines.
    - Monitor and evaluate data quality impact on model performance through experiments and benchmarks.
    - Research and implement best practices in large-scale dataset creation for AI/ML models.
    - Document methodologies and share insights with internal teams.

    The company offers:
    - Competitive salary.
    - Equity options in a fast-growing AI company.
    - Remote-friendly work culture.
    - Opportunity to shape a product at the intersection of AI and human productivity.
    - Work with a passionate, senior team building cutting-edge tech for real-world business use.

  • · 19 views · 1 application · 17d

    Senior Data Scientist/NLP Lead

    Office Work · Ukraine (Kyiv) · Product · 5 years of experience · English - B2

    About us:
    Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.
     

    About the client:
    Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.
     

    About the role:
    We are looking for an experienced Senior Data Scientist / NLP Lead to spearhead the development of cutting-edge natural language processing solutions for the Ukrainian LLM project. You will lead the NLP team in designing, implementing, and deploying large-scale language models and NLP algorithms that power the products.

    This role is critical to the mission of advancing AI in the Ukrainian language context, and offers the opportunity to drive technical decisions, mentor a team of data scientists, and shape the future of AI capabilities in Ukraine.
     

    Requirements:
    Education & Experience:
    - 5+ years of experience in data science or machine learning, with a strong focus on NLP.
    - Proven track record of developing and deploying NLP or ML models at scale in production environments.
    - An advanced degree (Master's or PhD) in Computer Science, Computational Linguistics, Machine Learning, or a related field is highly preferred.

    NLP Expertise:
    - Deep understanding of natural language processing techniques and algorithms.
    - Hands-on experience with modern NLP approaches, including embedding models, text classification, sequence tagging (NER), and transformers/LLMs.
    - Deep understanding of transformer architectures, knowledge of LLM training and fine-tuning techniques, hands-on experience developing LLM-based solutions, and knowledge of linguistic nuances in Ukrainian or other languages.

    Advanced NLP/ML Techniques:
    - Experience with evaluation metrics for language models (perplexity, BLEU, ROUGE, etc.) and with techniques for model optimization (quantization, knowledge distillation) to improve efficiency.
    - Background in information retrieval or RAG (Retrieval-Augmented Generation) is a plus for building systems that augment LLMs with external knowledge.
    ML & Programming Skills:
    - Proficiency in Python and common data science libraries (pandas, NumPy, scikit-learn).
    - Strong experience with deep learning frameworks such as PyTorch or TensorFlow for building NLP models.
    - Ability to write efficient, clean code and debug complex model issues.

    Data & Analytics:
    - Solid understanding of data analytics and statistics.
    - Experience in experimental design, A/B testing, and statistical hypothesis testing to evaluate model performance.
    - Experience building a representative LLM benchmarking framework from business requirements.
    - Comfortable working with large datasets, writing complex SQL queries, and using data visualization to inform decisions.

    Deployment & Tools:
    - Experience deploying machine learning models in production (e.g., using REST APIs or batch pipelines) and integrating with real-world applications.
    - Familiarity with MLOps concepts and tools (version control for models/data, CI/CD for ML).
    - Experience with cloud platforms (AWS, GCP or Azure) and big data technologies (Spark, Hadoop) for scaling data processing or model training is a plus.
    - Hands-on experience with containerization (Docker) and orchestration (Kubernetes) for ML, as well as ML workflow tools (MLflow, Airflow).
    Leadership & Communication:
    - Demonstrated ability to lead technical projects and mentor junior team members.
    - Strong communication skills to convey complex ML results to non-technical stakeholders and to document methodologies clearly.

     

    Responsibilities:
    - Lead end-to-end development of NLP and LLM models - from data exploration and model prototyping to validation and production deployment. This includes designing novel model architectures or fine-tuning state-of-the-art transformer models (e.g., BERT, GPT) to solve project-specific language tasks.
    - Analyze large text datasets (Ukrainian and multilingual corpora) to extract insights and build robust training datasets.
    - Guide data collection and annotation efforts to ensure high-quality data for model training.
    - Develop and implement NLP algorithms for a range of tasks such as text classification, named entity recognition, semantic search, and conversational AI.
    - Stay up-to-date with the latest research to apply transformer-based models, embeddings, and other modern NLP techniques in the solutions.
    - Establish evaluation metrics and validation frameworks for model performance, including accuracy, factuality, and bias.
    - Design A/B tests and statistical experiments to compare model variants and validate improvements.
    - Deploy and integrate NLP models into production systems in collaboration with engineers - ensuring models are scalable, efficient, and well-monitored in a real-world setting.
    - Optimize model inference and troubleshoot issues such as model drift or data pipeline bottlenecks.
    - Provide technical leadership and mentorship to the NLP/ML team.
    - Review code and research, uphold best practices in ML (version control, reproducibility, documentation), and foster a culture of continuous learning and innovation.
    - Collaborate cross-functionally with product managers, software engineers, and MLOps engineers to align NLP solutions with product goals and infrastructure capabilities.
    - Communicate complex data science concepts to stakeholders and incorporate their feedback into model development.

     

    The company offers:
    - Competitive salary.
    - Equity options in a fast-growing AI company.
    - Remote-friendly work culture.
    - Opportunity to shape a product at the intersection of AI and human productivity.
    - Work with a passionate, senior team building cutting-edge tech for real-world business use.

  • · 93 views · 27 applications · 25d

    AI / ML Data Scientist

    Full Remote · Worldwide · Product · 3 years of experience · English - C1

    Role Overview

    Nucleus AI is seeking an AI / ML Data Scientist to help build and scale an AI-backed Learning Management System (LMS) that delivers personalized, adaptive learning experiences.

    This is a hands-on, product-focused role at the intersection of machine learning, data science, and learning technology. You will design and deploy intelligent systems that directly impact learner engagement, personalization, assessment, and outcomes.

    If you enjoy turning real-world data into models that are used by real users, and iterating on them in production, this role is for you.

     

    Key Responsibilities

    Machine Learning & Modeling

    • Design, develop, and deploy machine learning models that power:

       
      • Personalized learning paths

         
      • Content and course recommendations

         
      • Learner analytics and insights

         
      • Adaptive assessments and feedback

         
    • Build models for:

       
      • Learner skill inference and knowledge tracing

         
      • Engagement, completion, and drop-off prediction

         
      • Automated assessment scoring and feedback

         

    Data & Analytics

    • Analyze large-scale learner behavior data to extract actionable insights

       
    • Develop and maintain data pipelines, feature engineering workflows, and model evaluation frameworks

       
    • Apply statistical analysis and experimentation (including A/B testing) to validate model performance and impact

       

    Collaboration & Product Integration

    • Work closely with product managers, engineers, and instructional designers to translate learning objectives into AI-driven solutions

       
    • Integrate models into production systems via APIs, batch pipelines, or real-time inference

       

    LLMs & Advanced Techniques

    • Experiment with and integrate LLMs and NLP techniques for:

       
      • Content generation

         
      • Learner feedback

         
      • Intelligent learner support

         
    • Monitor models in production for performance, bias, and drift, and continuously improve them

       

    Documentation & Governance

    • Document models, assumptions, experiments, and results to ensure transparency, reproducibility, and maintainability

       

    Required Qualifications

    • Bachelor's or Master's degree in Data Science, Computer Science, AI/ML, Statistics, or a related field

       
    • Strong proficiency in Python and common ML libraries (e.g., scikit-learn, PyTorch)

       
    • Solid understanding of:

       
      • Supervised and unsupervised learning

         
      • Feature engineering and model evaluation

         
      • Statistical analysis and experimentation

         
    • Experience working with both structured and unstructured data

       
    • Proficiency in SQL and working with large datasets

       
    • Ability to clearly communicate complex technical concepts to non-technical stakeholders

       

    Nice-to-Have Qualifications

    • Experience building ML systems in ed-tech, LMS platforms, or learning analytics

       
    • Familiarity with:

       
      • Large Language Models (LLMs)

         
      • NLP

         
      • Recommendation systems

         
      • Knowledge graphs

         
    • Experience deploying models to production environments

       
    • Exposure to cloud platforms such as AWS, GCP, or Azure

       
    • Understanding of learning science, instructional design, or assessment theory

       
    • Experience with MLOps tools (model versioning, monitoring, CI/CD for ML)

       

    What We Offer

    • Opportunity to work on mission-driven AI that improves how people learn

       
    • Ownership of ML systems used by real learners at scale

       
    • A collaborative, cross-functional team culture

       
    • Competitive compensation and benefits

       
    • Flexible work location and schedule

       
    • Continuous learning and professional growth opportunities

     
