Data Science UA
Data Science UA is a service company with strong data science and AI expertise. We have been developing AI solutions for businesses of all kinds and sizes. Our expertise includes AI software development, computer vision (image recognition), NLP, machine learning, big data, data analytics, data mining, and data visualization.
Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe, boasting a network of over 30,000 top AI engineers. Our journey began in 2016, when we united top AI talents and organized the first Data Science tech conference in Kyiv.
We offer diverse cooperation models, including outsourcing and outstaffing, where we assemble top-notch tech teams of industry experts to craft optimal solutions tailored to your business requirements.
At Data Science UA, our core focus is AI consulting. Whether you have a clearly defined request or just a nascent idea, we are eager to collaborate and explore the possibilities together.
Additionally, our comprehensive recruiting service extends beyond AI and data science specialists: we find the best candidates for any level and specialization, bolstering teams worldwide. Together, we can achieve extraordinary milestones for your enterprise. Reach out to us today, and let's take on this transformative journey hand in hand!
-
Data Engineer
Full Remote · Ukraine · Product · 3 years of experience · English - None
About us:
Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016, when we united top AI talents and organized the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.
About the client:
Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.
About the role:
We are looking for a Data Engineer (NLP-Focused) to build and optimize the data pipelines that fuel the Ukrainian LLM and NLP initiatives. In this role, you will design robust ETL/ELT processes to collect, process, and manage large-scale text and metadata, enabling the Data Scientists and ML Engineers to develop cutting-edge language models.
You will work at the intersection of data engineering and machine learning, ensuring that the datasets and infrastructure are reliable, scalable, and tailored to the needs of training and evaluating NLP models in a Ukrainian language context.
Requirements:
- Education & Experience: 3+ years of experience as a Data Engineer or in a similar role, building data-intensive pipelines or platforms. A Bachelor's or Master's degree in Computer Science, Engineering, or a related field is preferred. Experience supporting machine learning or analytics teams with data pipelines is a strong advantage.
- NLP Domain Experience: Prior experience handling linguistic data or supporting NLP projects (e.g., text normalization, handling different encodings, tokenization strategies). Knowledge of Ukrainian text sources and data sets, or experience with multilingual data processing, can be an advantage given the project's focus. Understanding of FineWeb2 or a similar processing pipeline approach.
- Data Pipeline Expertise: Hands-on experience designing ETL/ELT processes, including extracting data from various sources, using transformation tools, and loading into storage systems. Proficiency with orchestration frameworks like Apache Airflow for scheduling workflows (a minimal DAG sketch follows this list). Familiarity with building pipelines for unstructured data (text, logs) as well as structured data.
- Programming & Scripting: Strong programming skills in Python for data manipulation and pipeline development. Experience with NLP packages (spaCy, NLTK, langdetect, fasttext, etc.). Experience with SQL for querying and transforming data in relational databases. Knowledge of Bash or other scripting for automation tasks. Writing clean, maintainable code and using version control (Git) for collaborative development.
- Databases & Storage: Experience working with relational databases (e.g., PostgreSQL, MySQL), including schema design and query optimization. Familiarity with NoSQL or document stores (e.g., MongoDB) and big data technologies (HDFS, Hive, Spark) for large-scale data is a plus. Understanding of or experience with vector databases (e.g., Pinecone, FAISS) is beneficial, as the NLP applications may require embedding storage and fast similarity search.
- Cloud Infrastructure: Practical experience with cloud platforms (AWS, GCP, or Azure) for data storage and processing. Ability to set up services such as S3/Cloud Storage, data warehouses (e.g., BigQuery, Redshift), and use cloud-based ETL tools or serverless functions. Understanding of infrastructure-as-code (Terraform, CloudFormation) to manage resources is a plus.
- Data Quality & Monitoring: Knowledge of data quality assurance practices. Experience implementing monitoring for data pipelines (logs, alerts) and using CI/CD tools to automate pipeline deployment and testing. An analytical mindset to troubleshoot data discrepancies and optimize performance bottlenecks.
- Collaboration & Domain Knowledge: Ability to work closely with data scientists and understand the requirements of machine learning projects. Basic understanding of NLP concepts and the data needs for training language models, so you can anticipate and accommodate the specific forms of text data and preprocessing they require. Good communication skills to document data workflows and to coordinate with team members across different functions.
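For illustration only, here is a minimal sketch of the kind of Airflow scheduling referenced above, assuming Airflow 2.4+; the DAG id, task names, and callables are hypothetical placeholders, not an actual pipeline:

```python
# A minimal, hypothetical Airflow DAG illustrating scheduled ETL orchestration.
# All ids and callables are placeholders for illustration only.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_corpus(**context):
    # Placeholder: pull a raw text batch from a source (API, bucket, crawl).
    print("extracting raw text batch")


def clean_and_load(**context):
    # Placeholder: normalize, de-duplicate, and load into the warehouse.
    print("cleaning and loading batch")


with DAG(
    dag_id="ukrainian_text_etl",  # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; use schedule_interval on older versions
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    extract = PythonOperator(task_id="extract_corpus", python_callable=extract_corpus)
    load = PythonOperator(task_id="clean_and_load", python_callable=clean_and_load)

    extract >> load  # run extraction before cleaning/loading
```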
Nice to have:
- Advanced Tools & Frameworks: Experience with distributed data processing frameworks (such as Apache Spark or Databricks) for large-scale data transformation, and with message streaming systems (Kafka, Pub/Sub) for real-time data pipelines. Familiarity with data serialization formats (JSON, Parquet) and handling of large text corpora.
- Web Scraping Expertise: Deep experience in web scraping, using tools like Scrapy, Selenium, or Beautiful Soup, and handling anti-scraping challenges (rotating proxies, rate limiting). Ability to parse and clean raw text data from HTML, PDFs, or scanned documents.
- CI/CD & DevOps: Knowledge of setting up CI/CD pipelines for data engineering (using GitHub Actions, Jenkins, or GitLab CI) to test and deploy changes to data workflows. Experience with containerization (Docker) to package data jobs and with Kubernetes for scaling them is a plus.
- Big Data & Analytics: Experience with analytics platforms and BI tools (e.g., Tableau, Looker) used to examine the data prepared by the pipelines. Understanding of how to create and manage data warehouses or data marts for analytical consumption.
- Problem-Solving: Demonstrated ability to work independently in solving complex data engineering problems, optimizing existing pipelines, and implementing new ones under time constraints. A proactive attitude to explore new data tools or techniques that could improve the workflows.
Responsibilities:
- Design, develop, and maintain ETL/ELT pipelines for gathering, transforming, and storing large volumes of text data and related information.
- Ensure pipelines are efficient and can handle data from diverse sources (e.g., web crawls, public datasets, internal databases) while maintaining data integrity.
- Implement web scraping and data collection services to automate the ingestion of text and linguistic data from the web and other external sources. This includes writing crawlers or using APIs to continuously collect data relevant to the language modeling efforts.
- Implement NLP/LLM-specific data processing: cleaning and normalization of text, filtering of toxic content, de-duplication, de-noising, and detection and removal of personal data (a simplified cleaning sketch follows this list).
- Form task-specific SFT/RLHF datasets from existing data, including data augmentation and labeling with an LLM as teacher.
- Set up and manage cloud-based data infrastructure for the project. Configure and maintain data storage solutions (data lakes, warehouses) and processing frameworks (e.g., distributed compute on AWS/GCP/Azure) that can scale with growing data needs.
- Automate data processing workflows and ensure their scalability and reliability.
- Use workflow orchestration tools like Apache Airflow to schedule and monitor data pipelines, enabling continuous and repeatable model training and evaluation cycles.
- Maintain and optimize analytical databases and data access layers for both ad-hoc analysis and model training needs.
- Work with relational databases (e.g., PostgreSQL) and other storage systems to ensure fast query performance and well-structured data schemas.
- Collaborate with Data Scientists and NLP Engineers to build data features and datasets for machine learning models.
- Provide data subsets, aggregations, or preprocessing as needed for tasks such as language model training, embedding generation, and evaluation.
- Implement data quality checks, monitoring, and alerting. Develop scripts or use tools to validate data completeness and correctness (e.g., ensuring no critical data gaps or anomalies in the text corpora), and promptly address any pipeline failures or data issues. Implement data version control.
- Manage data security, access, and compliance.
- Control permissions to datasets and ensure adherence to data privacy policies and security standards, especially when dealing with user data or proprietary text sources.
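As referenced in the processing item above, here is a simplified sketch of text cleaning, exact de-duplication, and PII masking in plain Python. The regexes and hash-based dedup are illustrative stand-ins; a production pipeline would use fuzzy de-duplication (e.g., MinHash) and NER-based PII detection:

```python
# Illustrative text-cleaning pass: Unicode normalization, regex-based masking
# of obvious personal data, and exact de-duplication via content hashing.
# A sketch, not the project's actual pipeline.
import hashlib
import re
import unicodedata

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s()-]{7,}\d")


def clean(text: str) -> str:
    text = unicodedata.normalize("NFC", text)  # canonical Unicode form
    text = EMAIL_RE.sub("<EMAIL>", text)       # mask e-mail addresses
    text = PHONE_RE.sub("<PHONE>", text)       # mask phone-like numbers
    return re.sub(r"\s+", " ", text).strip()   # collapse whitespace


def deduplicate(docs):
    seen, unique = set(), []
    for doc in docs:
        cleaned = clean(doc)
        digest = hashlib.sha256(cleaned.encode("utf-8")).hexdigest()
        if digest not in seen:  # keep the first occurrence only
            seen.add(digest)
            unique.append(cleaned)
    return unique


print(deduplicate(["Call +38 044 123 45 67", "Call  +38 044 123 45 67"]))
```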
The company offers:
- Competitive salary.
- Equity options in a fast-growing AI company.
- Remote-friendly work culture.
- Opportunity to shape a product at the intersection of AI and human productivity.
- Work with a passionate, senior team building cutting-edge tech for real-world business use.
-
Data Engineer (NLP-Focused)
Full Remote · Ukraine · Product · 3 years of experience · English - B1
About us:
Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016, when we united top AI talents and organized the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.
About the client:
Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.
About the role:
We are looking for a Data Engineer (NLP-Focused) to build and optimize the data pipelines that fuel the Ukrainian LLM and NLP initiatives. In this role, you will design robust ETL/ELT processes to collect, process, and manage large-scale text and metadata, enabling the Data Scientists and ML Engineers to develop cutting-edge language models.
You will work at the intersection of data engineering and machine learning, ensuring that the datasets and infrastructure are reliable, scalable, and tailored to the needs of training and evaluating NLP models in a Ukrainian language context.
Requirements:
- Education & Experience: 3+ years of experience as a Data Engineer or in a similar role, building data-intensive pipelines or platforms. A Bachelor's or Master's degree in Computer Science, Engineering, or a related field is preferred. Experience supporting machine learning or analytics teams with data pipelines is a strong advantage.
- NLP Domain Experience: Prior experience handling linguistic data or supporting NLP projects (e.g., text normalization, handling different encodings, tokenization strategies). Knowledge of Ukrainian text sources and data sets, or experience with multilingual data processing, can be an advantage given the project's focus. Understanding of FineWeb2 or a similar processing pipeline approach.
- Data Pipeline Expertise: Hands-on experience designing ETL/ELT processes, including extracting data from various sources, using transformation tools, and loading into storage systems. Proficiency with orchestration frameworks like Apache Airflow for scheduling workflows. Familiarity with building pipelines for unstructured data (text, logs) as well as structured data.
- Programming & Scripting: Strong programming skills in Python for data manipulation and pipeline development. Experience with NLP packages (spaCy, NLTK, langdetect, fasttext, etc.). Experience with SQL for querying and transforming data in relational databases. Knowledge of Bash or other scripting for automation tasks. Writing clean, maintainable code and using version control (Git) for collaborative development.
- Databases & Storage: Experience working with relational databases (e.g., PostgreSQL, MySQL), including schema design and query optimization. Familiarity with NoSQL or document stores (e.g., MongoDB) and big data technologies (HDFS, Hive, Spark) for large-scale data is a plus. Understanding of or experience with vector databases (e.g., Pinecone, FAISS) is beneficial, as the NLP applications may require embedding storage and fast similarity search.
- Cloud Infrastructure: Practical experience with cloud platforms (AWS, GCP, or Azure) for data storage and processing. Ability to set up services such as S3/Cloud Storage, data warehouses (e.g., BigQuery, Redshift), and use cloud-based ETL tools or serverless functions. Understanding of infrastructure-as-code (Terraform, CloudFormation) to manage resources is a plus.
- Data Quality & Monitoring: Knowledge of data quality assurance practices. Experience implementing monitoring for data pipelines (logs, alerts) and using CI/CD tools to automate pipeline deployment and testing. An analytical mindset to troubleshoot data discrepancies and optimize performance bottlenecks.
- Collaboration & Domain Knowledge: Ability to work closely with data scientists and understand the requirements of machine learning projects. Basic understanding of NLP concepts and the data needs for training language models, so you can anticipate and accommodate the specific forms of text data and preprocessing they require. Good communication skills to document data workflows and to coordinate with team members across different functions.
Responsibilities:
- Design, develop, and maintain ETL/ELT pipelines for gathering, transforming, and storing large volumes of text data and related information.
- Ensure pipelines are efficient and can handle data from diverse sources (e.g., web crawls, public datasets, internal databases) while maintaining data integrity.
- Implement web scraping and data collection services to automate the ingestion of text and linguistic data from the web and other external sources. This includes writing crawlers or using APIs to continuously collect data relevant to the language modeling efforts (a minimal scraping sketch follows this list).
- Implement NLP/LLM-specific data processing: cleaning and normalization of text, filtering of toxic content, de-duplication, de-noising, and detection and removal of personal data.
- Form task-specific SFT/RLHF datasets from existing data, including data augmentation and labeling with an LLM as teacher.
- Set up and manage cloud-based data infrastructure for the project. Configure and maintain data storage solutions (data lakes, warehouses) and processing frameworks (e.g., distributed compute on AWS/GCP/Azure) that can scale with growing data needs.
- Automate data processing workflows and ensure their scalability and reliability.
- Use workflow orchestration tools like Apache Airflow to schedule and monitor data pipelines, enabling continuous and repeatable model training and evaluation cycles.
- Maintain and optimize analytical databases and data access layers for both ad-hoc analysis and model training needs.
- Work with relational databases (e.g., PostgreSQL) and other storage systems to ensure fast query performance and well-structured data schemas.
- Collaborate with Data Scientists and NLP Engineers to build data features and datasets for machine learning models.
- Provide data subsets, aggregations, or preprocessing as needed for tasks such as language model training, embedding generation, and evaluation.
- Implement data quality checks, monitoring, and alerting. Develop scripts or use tools to validate data completeness and correctness (e.g., ensuring no critical data gaps or anomalies in the text corpora), and promptly address any pipeline failures or data issues. Implement data version control.
- Manage data security, access, and compliance.
- Control permissions to datasets and ensure adherence to data privacy policies and security standards, especially when dealing with user data or proprietary text sources.
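As referenced above, here is a minimal polite-scraping sketch with requests and BeautifulSoup. The seed URL is a placeholder, and a real crawler would also honor robots.txt, rotate proxies, and back off on errors:

```python
# Minimal scraping sketch using requests + BeautifulSoup.
# The URL list is a hypothetical placeholder.
import time

import requests
from bs4 import BeautifulSoup

SEED_URLS = ["https://example.com/articles"]  # placeholder source


def fetch_paragraphs(url: str) -> list[str]:
    resp = requests.get(url, timeout=10, headers={"User-Agent": "corpus-bot/0.1"})
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    # Extract visible paragraph text, dropping empty nodes.
    return [p.get_text(strip=True) for p in soup.find_all("p") if p.get_text(strip=True)]


for url in SEED_URLS:
    for para in fetch_paragraphs(url):
        print(para)
    time.sleep(1.0)  # naive rate limiting between requests
```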
The company offers:
- Competitive salary.
- Equity options in a fast-growing AI company.
- Remote-friendly work culture.
- Opportunity to shape a product at the intersection of AI and human productivity.
- Work with a passionate, senior team building cutting-edge tech for real-world business use.
-
Senior Analytics Engineer
Full Remote · Countries of Europe or Ukraine · 4 years of experience · English - None
Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016, when we united top AI talents and organized the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.
About the client:
It is a new-generation data service provider specializing in data consulting and data-driven digital marketing, dedicated to transforming data into business impact across the entire value chain of organizations. Its broad range of data-driven solutions in data consulting and digital marketing is designed to meet clients' specific needs, always conceived with a business-centric approach and delivered with tangible results.
About the role:
We are seeking a highly skilled Senior Analytics Engineer to drive the end-to-end lifecycle of data products. You will act as the technical bridge between raw data sources and advanced data science, using Microsoft Azure technologies to power data discovery, innovation, and the deployment of scalable analytical solutions. You will play a pivotal role in implementing Data Fabric architectures to "stitch" together disparate data sources, creating a unified, accessible view of the consumer without the bottlenecks of traditional rigid silos.
Key Responsibilities:
- Azure Cloud Architecture & Engineering:
Design and maintain scalable data pipelines and analytical environments using the Azure stack (e.g., Azure Synapse, Data Factory, Databricks, Azure SQL). Ensure optimal performance and cost-efficiency of cloud resources.
- Data Fabric Implementation:
Champion the adoption of Data Fabric principles to connect data across on-premise and multi-cloud environments. Implement logical data layers and virtualization techniques to provide seamless data access without unnecessary data movement.
- Data Integration & Modeling:
Design robust, modular data models. Integrate fragmented data sources across multiple markets to create a "single source of truth" while utilizing active metadata management to automate data delivery.
- Data Strategy & Innovation:
Drive data discovery initiatives to identify new value drivers. Architect innovative data products that solve complex business challenges using modern data mesh or fabric methodologies.
- Data Science Collaboration:
Prepare high-quality, feature-rich datasets. Partner closely with Data Scientists to explain data nuances and lineage, ensuring the data foundation supports advanced modeling and machine learning.
- Domain Expertise:
Apply deep knowledge of consumer data domainsβspecifically Loyalty, Promotions, Subscriptions, and E-commerceβto ensure analytical solutions are business-relevant and actionable.
Required Qualifications:
- Technical Proficiency:
Expert-level proficiency in SQL and Python is required.
- Azure Expertise:
Proven experience in cloud data warehousing and engineering within the Microsoft Azure ecosystem (Synapse Analytics, Azure Data Lake Gen2, Azure Data Factory).
- Data Fabric & Virtualization:
Strong understanding of Data Fabric concepts, including data virtualization, active metadata management, and knowledge graphs (experience with tools like Microsoft Purview or logical data warehousing is a plus).
- Data Modeling:
Advanced experience in dimensional modeling and building performant data marts (an illustrative star-schema sketch follows this list).
- Consumer Data Experience:
Strong experience working with consumer-centric data (Loyalty, CRM, E-commerce).
- Communication:
Ability to translate complex technical data concepts for non-technical stakeholders and Data Scientists.
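As a rough illustration of the dimensional modeling mentioned above, here is a star-schema split of a flat extract using pandas. Column names are hypothetical, and on Azure this logic would more likely live in Synapse SQL or Databricks:

```python
# Illustrative star-schema split of a flat orders extract into a customer
# dimension and an orders fact table. All column names are hypothetical.
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 2, 3],
    "customer_email": ["a@x.com", "b@y.com", "a@x.com"],
    "loyalty_tier": ["gold", "silver", "gold"],
    "amount": [120.0, 35.5, 60.0],
})

# Dimension: one row per customer, with a surrogate key.
dim_customer = (orders[["customer_email", "loyalty_tier"]]
                .drop_duplicates()
                .reset_index(drop=True))
dim_customer["customer_key"] = dim_customer.index + 1

# Fact: measures plus a foreign key into the dimension.
fact_orders = orders.merge(dim_customer, on=["customer_email", "loyalty_tier"])
fact_orders = fact_orders[["order_id", "customer_key", "amount"]]

print(dim_customer)
print(fact_orders)
```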
We offer:
- Free English classes with a native speaker and external courses compensation.
- PE support by professional accountants.
- 40 days of PTO.
- Medical insurance.
- Team-building events, conferences, meetups, and other activities.
- There are many other benefits you'll find out at the interview.
-
Data Engineer
Full Remote · Ukraine · Product · 3 years of experience · English - None
About us:
Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016, when we united top AI talents and organized the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.
About the client:
Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.
About the role:
We are looking for a Data Engineer (NLP-Focused) to build and optimize the data pipelines that fuel the Ukrainian LLM and NLP initiatives. In this role, you will design robust ETL/ELT processes to collect, process, and manage large-scale text and metadata, enabling the Data Scientists and ML Engineers to develop cutting-edge language models.
You will work at the intersection of data engineering and machine learning, ensuring that the datasets and infrastructure are reliable, scalable, and tailored to the needs of training and evaluating NLP models in a Ukrainian language context.
Requirements:
- Education & Experience: 3+ years of experience as a Data Engineer or in a similar role, building data-intensive pipelines or platforms. A Bachelor's or Master's degree in Computer Science, Engineering, or a related field is preferred. Experience supporting machine learning or analytics teams with data pipelines is a strong advantage.
- NLP Domain Experience: Prior experience handling linguistic data or supporting NLP projects (e.g., text normalization, handling different encodings, tokenization strategies). Knowledge of Ukrainian text sources and data sets, or experience with multilingual data processing, can be an advantage given the project's focus. Understanding of FineWeb2 or a similar processing pipeline approach.
- Data Pipeline Expertise: Hands-on experience designing ETL/ELT processes, including extracting data from various sources, using transformation tools, and loading into storage systems. Proficiency with orchestration frameworks like Apache Airflow for scheduling workflows. Familiarity with building pipelines for unstructured data (text, logs) as well as structured data.
- Programming & Scripting: Strong programming skills in Python for data manipulation and pipeline development. Experience with NLP packages (spaCy, NLTK, langdetect, fasttext, etc.). Experience with SQL for querying and transforming data in relational databases. Knowledge of Bash or other scripting for automation tasks. Writing clean, maintainable code and using version control (Git) for collaborative development.
- Databases & Storage: Experience working with relational databases (e.g., PostgreSQL, MySQL), including schema design and query optimization. Familiarity with NoSQL or document stores (e.g., MongoDB) and big data technologies (HDFS, Hive, Spark) for large-scale data is a plus. Understanding of or experience with vector databases (e.g., Pinecone, FAISS) is beneficial, as the NLP applications may require embedding storage and fast similarity search.
- Cloud Infrastructure: Practical experience with cloud platforms (AWS, GCP, or Azure) for data storage and processing. Ability to set up services such as S3/Cloud Storage, data warehouses (e.g., BigQuery, Redshift), and use cloud-based ETL tools or serverless functions. Understanding of infrastructure-as-code (Terraform, CloudFormation) to manage resources is a plus.
- Data Quality & Monitoring: Knowledge of data quality assurance practices. Experience implementing monitoring for data pipelines (logs, alerts) and using CI/CD tools to automate pipeline deployment and testing. An analytical mindset to troubleshoot data discrepancies and optimize performance bottlenecks.
- Collaboration & Domain Knowledge: Ability to work closely with data scientists and understand the requirements of machine learning projects. Basic understanding of NLP concepts and the data needs for training language models, so you can anticipate and accommodate the specific forms of text data and preprocessing they require. Good communication skills to document data workflows and to coordinate with team members across different functions.
Nice to have:
- Advanced Tools & Frameworks: Experience with distributed data processing frameworks (such as Apache Spark or Databricks) for large-scale data transformation, and with message streaming systems (Kafka, Pub/Sub) for real-time data pipelines. Familiarity with data serialization formats (JSON, Parquet) and handling of large text corpora.
- Web Scraping Expertise: Deep experience in web scraping, using tools like Scrapy, Selenium, or Beautiful Soup, and handling anti-scraping challenges (rotating proxies, rate limiting). Ability to parse and clean raw text data from HTML, PDFs, or scanned documents.
- CI/CD & DevOps: Knowledge of setting up CI/CD pipelines for data engineering (using GitHub Actions, Jenkins, or GitLab CI) to test and deploy changes to data workflows. Experience with containerization (Docker) to package data jobs and with Kubernetes for scaling them is a plus.
- Big Data & Analytics: Experience with analytics platforms and BI tools (e.g., Tableau, Looker) used to examine the data prepared by the pipelines. Understanding of how to create and manage data warehouses or data marts for analytical consumption.
- Problem-Solving: Demonstrated ability to work independently in solving complex data engineering problems, optimizing existing pipelines, and implementing new ones under time constraints. A proactive attitude to explore new data tools or techniques that could improve the workflows.
Responsibilities:
- Design, develop, and maintain ETL/ELT pipelines for gathering, transforming, and storing large volumes of text data and related information.
- Ensure pipelines are efficient and can handle data from diverse sources (e.g., web crawls, public datasets, internal databases) while maintaining data integrity.
- Implement web scraping and data collection services to automate the ingestion of text and linguistic data from the web and other external sources. This includes writing crawlers or using APIs to continuously collect data relevant to the language modeling efforts.
- Implement NLP/LLM-specific data processing: cleaning and normalization of text, filtering of toxic content, de-duplication, de-noising, and detection and removal of personal data.
- Form task-specific SFT/RLHF datasets from existing data, including data augmentation and labeling with an LLM as teacher.
- Set up and manage cloud-based data infrastructure for the project. Configure and maintain data storage solutions (data lakes, warehouses) and processing frameworks (e.g., distributed compute on AWS/GCP/Azure) that can scale with growing data needs.
- Automate data processing workflows and ensure their scalability and reliability.
- Use workflow orchestration tools like Apache Airflow to schedule and monitor data pipelines, enabling continuous and repeatable model training and evaluation cycles.
- Maintain and optimize analytical databases and data access layers for both ad-hoc analysis and model training needs.
- Work with relational databases (e.g., PostgreSQL) and other storage systems to ensure fast query performance and well-structured data schemas.
- Collaborate with Data Scientists and NLP Engineers to build data features and datasets for machine learning models.
- Provide data subsets, aggregations, or preprocessing as needed for tasks such as language model training, embedding generation, and evaluation.
- Implement data quality checks, monitoring, and alerting. Develop scripts or use tools to validate data completeness and correctness (e.g., ensuring no critical data gaps or anomalies in the text corpora), and promptly address any pipeline failures or data issues. Implement data version control.
- Manage data security, access, and compliance.
- Control permissions to datasets and ensure adherence to data privacy policies and security standards, especially when dealing with user data or proprietary text sources.
The company offers:
- Competitive salary.
- Equity options in a fast-growing AI company.
- Remote-friendly work culture.
- Opportunity to shape a product at the intersection of AI and human productivity.
- Work with a passionate, senior team building cutting-edge tech for real-world business use.
-
Senior Data Engineer
Full Remote · Countries of Europe or Ukraine · Product · 5 years of experience · English - None
About us:
Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the organization of the first Data Science UA conference, setting the foundation for our growth. Over the past 9 years, we have diligently fostered the largest Data Science Community in Eastern Europe, boasting a network of over 30,000 top AI engineers.
About the client:
We are working with a new-generation data service provider specializing in data consulting and data-driven digital marketing, dedicated to transforming data into business impact across the entire value chain of organizations. The company's data-driven services are built upon the deep AI expertise the company has acquired with a 1,000+ client base around the globe. The company has 1,000 employees across 20 offices who are focused on accelerating digital transformation.
About the role:
We are seeking a Senior Data Engineer (Azure) to design and maintain data pipelines and systems for analytics and AI-driven applications. You will work on building reliable ETL/ELT workflows and ensuring data integrity across the organization.
Required skills:
- 6+ years of experience as a Data Engineer, preferably in Azure environments.
- Proficiency in Python, SQL, NoSQL, and Cypher for data manipulation and querying (a brief Cypher-from-Python sketch follows this list).
- Hands-on experience with Airflow and Azure Data Services for pipeline orchestration.
- Strong understanding of data modeling, ETL/ELT workflows, and data warehousing concepts.
- Experience in implementing DataOps practices for pipeline automation and monitoring.
- Knowledge of data governance, data security, and metadata management principles.
- Ability to work collaboratively with data science and analytics teams.
- Excellent problem-solving and communication skills.
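As a small illustration of the Cypher requirement above, here is a sketch that runs a Cypher query from Python with the official neo4j driver. The URI, credentials, and graph model are placeholders:

```python
# Minimal sketch: run a Cypher query from Python via the official neo4j driver.
# Connection details and the Customer/Order graph model are hypothetical.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

query = """
MATCH (c:Customer)-[:PLACED]->(o:Order)
RETURN c.name AS customer, count(o) AS orders
ORDER BY orders DESC LIMIT 5
"""

with driver.session() as session:
    for record in session.run(query):  # stream result records
        print(record["customer"], record["orders"])

driver.close()
```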
Responsibilities:
- Transform data into formats suitable for analysis by developing and maintaining processes for data transformation, structuring, metadata management, and workload management.
- Design, implement, and maintain scalable data pipelines on Azure.
- Develop and optimize ETL/ELT processes for various data sources.
- Collaborate with data scientists and analysts to ensure data readiness.
- Monitor and improve data quality, performance, and governance.
-
Senior/Middle Data Scientist (Benchmarking/Alignment)
Hybrid Remote · Ukraine · Product · 3 years of experience · English - None
Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016, when we united top AI talents and organized the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.
About the client:
Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.
About the role:
We are looking for an experienced Senior/Middle Data Scientist with a passion for Large Language Models (LLMs) and cutting-edge AI research. In this role, you will design and implement a state-of-the-art evaluation and benchmarking framework to measure and guide model quality, and personally train LLMs with a strong focus on Reinforcement Learning from Human Feedback (RLHF). You will work alongside top AI researchers and engineers, ensuring the models are not only powerful but also aligned with user needs, cultural context, and ethical standards.
Requirements:
Education & Experience:
- 3+ years of experience in Data Science or Machine Learning, preferably with a focus on NLP.
- Proven experience in machine learning model evaluation and/or NLP benchmarking.
- Advanced degree (Master's or PhD) in Computer Science, Computational Linguistics, Machine Learning, or a related field is highly preferred.
NLP Expertise:
- Good knowledge of natural language processing techniques and algorithms.
- Hands-on experience with modern NLP approaches, including embedding models, semantic search, text classification, sequence tagging (NER), transformers/LLMs, RAGs.
- Familiarity with LLM training and fine-tuning techniques.
ML & Programming Skills:
- Proficiency in Python and common data science and NLP libraries (pandas, NumPy, scikit-learn, spaCy, NLTK, langdetect, fasttext).
- Strong experience with deep learning frameworks such as PyTorch or TensorFlow for building NLP models.
- Solid understanding of RLHF concepts and related techniques (preference modeling, reward modeling, reinforcement learning).
- Ability to write efficient, clean code and debug complex model issues.
Data & Analytics:
- Solid understanding of data analytics and statistics.
- Experience creating and managing test datasets, including annotation and labeling processes.
- Experience in experimental design, A/B testing, and statistical hypothesis testing to evaluate model performance.
- Comfortable working with large datasets, writing complex SQL queries, and using data visualization to inform decisions.
Deployment & Tools:
- Experience deploying machine learning models in production (e.g., using REST APIs or batch pipelines) and integrating with real-world applications.
- Familiarity with MLOps concepts and tools (version control for models/data, CI/CD for ML).
- Experience with cloud platforms (AWS, GCP, or Azure) and big data technologies (Spark, Hadoop, Ray, Dask) for scaling data processing or model training.
Communication:
- Experience working in a collaborative, cross-functional environment.
- Strong communication skills to convey complex ML results to non-technical stakeholders and to document methodologies clearly.
Nice to have:
Advanced NLP/ML Techniques:
- Prior work on LLM safety, fairness, and bias mitigation.
- Familiarity with evaluation metrics for language models (perplexity, BLEU, ROUGE, etc.) and with techniques for model optimization (quantization, knowledge distillation) to improve efficiency.
- Knowledge of data annotation workflows and human feedback collection methods.
Research & Community:
- Publications in NLP/ML conferences or contributions to open-source NLP projects.
- Active participation in the AI community or demonstrated continuous learning (e.g., Kaggle competitions, research collaborations) indicating a passion for staying at the forefront of the field.
Domain & Language Knowledge:
- Familiarity with the Ukrainian language and context.
- Understanding of cultural and linguistic nuances that could inform model training and evaluation in a Ukrainian context.
- Knowledge of Ukrainian benchmarks, or familiarity with other evaluation datasets and leaderboards for large models, can be an advantage given the project's focus.
MLOps & Infrastructure:
- Hands-on experience with containerization (Docker) and orchestration (Kubernetes) for ML, as well as ML workflow tools (MLflow, Airflow).
- Experience in working alongside MLOps engineers to streamline the deployment and monitoring of NLP models.
Problem-Solving:
- Innovative mindset with the ability to approach open-ended AI problems creatively.
- Comfort in a fast-paced R&D environment where you can adapt to new challenges, propose solutions, and drive them to implementation.
Responsibilities:
- Analyze benchmarking datasets, define gaps, and design, implement, and maintain a comprehensive benchmarking framework for the Ukrainian language.
- Research and integrate state-of-the-art evaluation metrics for factual accuracy, reasoning, language fluency, safety, and alignment (a minimal perplexity sketch follows this list).
- Design and maintain testing frameworks to detect hallucinations, biases, and other failure modes in LLM outputs.
- Develop pipelines for synthetic data generation and adversarial example creation to challenge the modelβs robustness.
- Collaborate with human annotators, linguists, and domain experts to define evaluation tasks and collect high-quality feedback.
- Develop tools and processes for continuous evaluation during model pre-training, fine-tuning, and deployment.
- Research and develop best practices and novel techniques in LLM training pipelines.
- Analyze benchmarking results to identify model strengths, weaknesses, and improvement opportunities.
- Work closely with other data scientists to align training and evaluation pipelines.
- Document methodologies and share insights with internal teams.
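As a small illustration of the evaluation work above, here is a sketch that computes corpus-level perplexity with a Hugging Face causal LM. The checkpoint is a placeholder, and a real benchmark would cover many more metrics and datasets:

```python
# Sketch of a corpus-level perplexity check, one common LM evaluation metric.
# The checkpoint and texts are placeholders for illustration only.
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # placeholder checkpoint; any causal LM would do
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

texts = ["Example sentence for evaluation.", "Another benchmark sample."]

total_nll, total_tokens = 0.0, 0
with torch.no_grad():
    for text in texts:
        ids = tokenizer(text, return_tensors="pt").input_ids
        # With labels equal to inputs, the model returns mean NLL over the
        # shifted targets as its loss.
        loss = model(ids, labels=ids).loss
        n = ids.size(1) - 1  # number of predicted (shifted) tokens
        total_nll += loss.item() * n
        total_tokens += n

print("corpus perplexity:", math.exp(total_nll / total_tokens))
```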
The company offers:
- Competitive salary.
- Equity options in a fast-growing AI company.
- Remote-friendly work culture.
- Opportunity to shape a product at the intersection of AI and human productivity.
- Work with a passionate, senior team building cutting-edge tech for real-world business use.
-
Senior/Middle Data Scientist (Data Preparation/Pre-training)
Hybrid Remote · Ukraine · Product · 3 years of experience · English - None
Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016, when we united top AI talents and organized the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.
About the client:
Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.
About the role:
We are looking for an experienced Senior/Middle Data Scientist with a passion for Large Language Models (LLMs) and cutting-edge AI research. In this role, you will focus on designing and prototyping data preparation pipelines, collaborating closely with data engineers to transform your prototypes into scalable production pipelines, and actively developing model training pipelines with other talented data scientists. Your work will directly shape the quality and capabilities of the models by ensuring we feed them the highest-quality, most relevant data possible.
Requirements:
Education & Experience:
- 3+ years of experience in Data Science or Machine Learning, preferably with a focus on NLP.
- Proven experience in data preprocessing, cleaning, and feature engineering for large-scale datasets of unstructured data (text, code, documents, etc.).
- Advanced degree (Master's or PhD) in Computer Science, Computational Linguistics, Machine Learning, or a related field is highly preferred.
NLP Expertise:
- Good knowledge of natural language processing techniques and algorithms.
- Hands-on experience with modern NLP approaches, including embedding models, semantic search, text classification, sequence tagging (NER), transformers/LLMs, RAGs.
- Familiarity with LLM training and fine-tuning techniques.
ML & Programming Skills:
- Proficiency in Python and common data science and NLP libraries (pandas, NumPy, scikit-learn, spaCy, NLTK, langdetect, fasttext).
- Strong experience with deep learning frameworks such as PyTorch or TensorFlow for building NLP models.
- Ability to write efficient, clean code and debug complex model issues.
Data & Analytics:
- Solid understanding of data analytics and statistics.
- Experience in experimental design, A/B testing, and statistical hypothesis testing to evaluate model performance.
- Comfortable working with large datasets, writing complex SQL queries, and using data visualization to inform decisions.
Deployment & Tools:
- Experience deploying machine learning models in production (e.g., using REST APIs or batch pipelines) and integrating with real-world applications.
- Familiarity with MLOps concepts and tools (version control for models/data, CI/CD for ML).
- Experience with cloud platforms (AWS, GCP, or Azure) and big data technologies (Spark, Hadoop, Ray, Dask) for scaling data processing or model training.
Communication & Personality:
- Experience working in a collaborative, cross-functional environment.
- Strong communication skills to convey complex ML results to non-technical stakeholders and to document methodologies clearly.
- Ability to rapidly prototype and iterate on ideas.
Nice to have:
Advanced NLP/ML Techniques:
- Familiarity with evaluation metrics for language models (perplexity, BLEU, ROUGE, etc.) and with techniques for model optimization (quantization, knowledge distillation) to improve efficiency.
- Understanding of FineWeb2 or a similar processing pipeline approach.
Research & Community:
- Publications in NLP/ML conferences or contributions to open-source NLP projects.
- Active participation in the AI community or demonstrated continuous learning (e.g., Kaggle competitions, research collaborations) indicating a passion for staying at the forefront of the field.
Domain & Language Knowledge:
- Familiarity with the Ukrainian language and context.
- Understanding of cultural and linguistic nuances that could inform model training and evaluation in a Ukrainian context.
- Knowledge of Ukrainian text sources and data sets, or experience with multilingual data processing, can be an advantage given the project's focus.
MLOps & Infrastructure:
- Hands-on experience with containerization (Docker) and orchestration (Kubernetes) for ML, as well as ML workflow tools (MLflow, Airflow).
- Experience in working alongside MLOps engineers to streamline the deployment and monitoring of NLP models.
Problem-Solving:
- Innovative mindset with the ability to approach open-ended AI problems creatively.
- Comfort in a fast-paced R&D environment where you can adapt to new challenges, propose solutions, and drive them to implementation.
Responsibilities:
- Design, prototype, and validate data preparation and transformation steps for LLM training datasets, including cleaning and normalization of text, filtering of toxic content, de-duplication, de-noising, detection and deletion of personal data, etc.
- Form task-specific SFT/RLHF datasets from existing data, including data augmentation and labeling with an LLM as teacher (a minimal dataset-formation sketch follows this list).
- Analyze large-scale raw text, code, and multimodal data sources for quality, coverage, and relevance.
- Develop heuristics, filtering rules, and cleaning techniques to maximize training data effectiveness.
- Collaborate with data engineers to hand over prototypes for automation and scaling.
- Research and develop best practices and novel techniques in LLM training pipelines.
- Monitor and evaluate data quality impact on model performance through experiments and benchmarks.
- Research and implement best practices in large-scale dataset creation for AI/ML models.
- Document methodologies and share insights with internal teams.
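As referenced above, here is a minimal sketch of assembling an SFT dataset as JSONL instruction/response pairs; teacher_answer is a hypothetical stub standing in for a real LLM-as-teacher call:

```python
# Sketch of forming an SFT dataset as JSONL instruction/response pairs.
# `teacher_answer` is a hypothetical placeholder, not a real API.
import json

raw_questions = [
    "Яка столиця України?",
    "Поясни, що таке машинне навчання.",
]


def teacher_answer(question: str) -> str:
    # Placeholder: in practice this would call a strong teacher LLM.
    return f"<teacher response for: {question}>"


with open("sft_dataset.jsonl", "w", encoding="utf-8") as f:
    for q in raw_questions:
        record = {"instruction": q, "response": teacher_answer(q)}
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```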
The company offers:
- Competitive salary.
- Equity options in a fast-growing AI company.
- Remote-friendly work culture.
- Opportunity to shape a product at the intersection of AI and human productivity.
- Work with a passionate, senior team building cutting-edge tech for real-world business use.
-
Senior/Middle Data Scientist
Full Remote · Ukraine · Product · 3 years of experience · English - B1
About us:
Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016, when we united top AI talents and organized the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.
About the client:
Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.
About the role:
We are looking for an experienced Senior/Middle Data Scientist with a passion for Large Language Models (LLMs) and cutting-edge AI research. In this role, you will focus on designing and prototyping data preparation pipelines, collaborating closely with data engineers to transform your prototypes into scalable production pipelines, and actively developing model training pipelines with other talented data scientists. Your work will directly shape the quality and capabilities of the models by ensuring we feed them the highest-quality, most relevant data possible.
Requirements:
Education & Experience:
- 3+ years of experience in Data Science or Machine Learning, preferably with a focus on NLP.
- Proven experience in data preprocessing, cleaning, and feature engineering for large-scale datasets of unstructured data (text, code, documents, etc.).
- Advanced degree (Master's or PhD) in Computer Science, Computational Linguistics, Machine Learning, or a related field is highly preferred.
NLP Expertise:
- Good knowledge of natural language processing techniques and algorithms.
- Hands-on experience with modern NLP approaches, including embedding models, semantic search, text classification, sequence tagging (NER), transformers/LLMs, RAGs.
- Familiarity with LLM training and fine-tuning techniques.
ML & Programming Skills:
- Proficiency in Python and common data science and NLP libraries (pandas, NumPy, scikit-learn, spaCy, NLTK, langdetect, fasttext).
- Strong experience with deep learning frameworks such as PyTorch or TensorFlow for building NLP models.
- Ability to write efficient, clean code and debug complex model issues.
Data & Analytics:
- Solid understanding of data analytics and statistics.
- Experience in experimental design, A/B testing, and statistical hypothesis testing to evaluate model performance.
- Comfortable working with large datasets, writing complex SQL queries, and using data visualization to inform decisions.
Deployment & Tools:
- Experience deploying machine learning models in production (e.g., using REST APIs or batch pipelines) and integrating with real-world applications.
- Familiarity with MLOps concepts and tools (version control for models/data, CI/CD for ML).
- Experience with cloud platforms (AWS, GCP, or Azure) and big data technologies (Spark, Hadoop, Ray, Dask) for scaling data processing or model training.
Communication & Personality:
- Experience working in a collaborative, cross-functional environment.
- Strong communication skills to convey complex ML results to non-technical stakeholders and to document methodologies clearly.
- Ability to rapidly prototype and iterate on ideas.
Nice to have:
Advanced NLP/ML Techniques:
- Familiarity with evaluation metrics for language models (perplexity, BLEU, ROUGE, etc.) and with techniques for model optimization (quantization, knowledge distillation) to improve efficiency.
- Understanding of FineWeb2 or a similar processing pipeline approach.
Research & Community:
- Publications in NLP/ML conferences or contributions to open-source NLP projects.
- Active participation in the AI community or demonstrated continuous learning (e.g., Kaggle competitions, research collaborations) indicating a passion for staying at the forefront of the field.
Domain & Language Knowledge:
- Familiarity with the Ukrainian language and context.
- Understanding of cultural and linguistic nuances that could inform model training and evaluation in a Ukrainian context.
- Knowledge of Ukrainian text sources and data sets, or experience with multilingual data processing, can be an advantage given the project's focus.
MLOps & Infrastructure:
- Hands-on experience with containerization (Docker) and orchestration (Kubernetes) for ML, as well as ML workflow tools (MLflow, Airflow).
- Experience in working alongside MLOps engineers to streamline the deployment and monitoring of NLP models.
Problem-Solving:
- Innovative mindset with the ability to approach open-ended AI problems creatively.
- Comfort in a fast-paced R&D environment where you can adapt to new challenges, propose solutions, and drive them to implementation.
Responsibilities:
- Design, prototype, and validate data preparation and transformation steps for LLM training datasets, including cleaning and normalization of text, filtering of toxic content, de-duplication, de-noising, detection and deletion of personal data, etc.
- Form task-specific SFT/RLHF datasets from existing data, including data augmentation and labeling with an LLM as teacher.
- Analyze large-scale raw text, code, and multimodal data sources for quality, coverage, and relevance.
- Develop heuristics, filtering rules, and cleaning techniques to maximize training data effectiveness.
- Collaborate with data engineers to hand over prototypes for automation and scaling.
- Research and develop best practices and novel techniques in LLM training pipelines.
- Monitor and evaluate data quality impact on model performance through experiments and benchmarks.
- Research and implement best practices in large-scale dataset creation for AI/ML models.
- Document methodologies and share insights with internal teams.
The company offers:
- Competitive salary.
- Equity options in a fast-growing AI company.
- Remote-friendly work culture.
- Opportunity to shape a product at the intersection of AI and human productivity.
- Work with a passionate, senior team building cutting-edge tech for real-world business use.
-
Senior Data Scientist/NLP Lead
Office Work · Ukraine (Kyiv) · Product · 5 years of experience · English - B2
About us:
Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016, when we united top AI talents and organized the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.
About the client:
Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.
About the role:
We are looking for an experienced Senior Data Scientist / NLP Lead to spearhead the development of cutting-edge natural language processing solutions for the Ukrainian LLM project. You will lead the NLP team in designing, implementing, and deploying large-scale language models and NLP algorithms that power the products. This role is critical to the mission of advancing AI in the Ukrainian language context, and offers the opportunity to drive technical decisions, mentor a team of data scientists, and shape the future of AI capabilities in Ukraine.
Requirements:
Education & Experience:
- 5+ years of experience in data science or machine learning, with a strong focus on NLP.
- Proven track record of developing and deploying NLP or ML models at scale in production environments.
- An advanced degree (Master's or PhD) in Computer Science, Computational Linguistics, Machine Learning, or a related field is highly preferred.
NLP Expertise:
- Deep understanding of natural language processing techniques and algorithms.
- Hands-on experience with modern NLP approaches, including embedding models, text classification, sequence tagging (NER), and transformers/LLMs.
- Deep understanding of transformer architectures, knowledge of LLM training and fine-tuning techniques, hands-on experience developing LLM-based solutions, and awareness of linguistic nuances in Ukrainian or other languages.
Advanced NLP/ML Techniques:
- Experience with evaluation metrics for language models (perplexity, BLEU, ROUGE, etc.) and with techniques for model optimization (quantization, knowledge distillation) to improve efficiency.
- Background in information retrieval or RAG (Retrieval-Augmented Generation) is a plus for building systems that augment LLMs with external knowledge.
ML & Programming Skills:
- Proficiency in Python and common data science libraries (pandas, NumPy, scikit-learn).
- Strong experience with deep learning frameworks such as PyTorch or TensorFlow for building NLP models.
- Ability to write efficient, clean code and debug complex model issues.
Data & Analytics:
- Solid understanding of data analytics and statistics.
- Experience in experimental design, A/B testing, and statistical hypothesis testing to evaluate model performance.
- Experience building a representative benchmarking framework for LLMs based on business requirements.
- Comfortable working with large datasets, writing complex SQL queries, and using data visualization to inform decisions.
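As an illustration of the A/B-testing expectation, here is a minimal sketch comparing task-success rates of two model variants with a two-proportion z-test; the counts are hypothetical:

    from statsmodels.stats.proportion import proportions_ztest

    # Hypothetical outcomes: successful completions per 1,000 sessions.
    successes = [412, 468]  # variant A, variant B
    trials = [1000, 1000]

    stat, pvalue = proportions_ztest(count=successes, nobs=trials)
    print(f"z = {stat:.2f}, p = {pvalue:.4f}")
    # A small p-value (e.g., < 0.05) suggests the gap is unlikely to be chance.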
Deployment & Tools:
- Experience deploying machine learning models in production (e.g., using REST APIs or batch pipelines) and integrating with real-world applications.
- Familiarity with MLOps concepts and tools (version control for models/data, CI/CD for ML).
- Experience with cloud platforms (AWS, GCP or Azure) and big data technologies (Spark, Hadoop) for scaling data processing or model training is a plus.
- Hands-on experience with containerization (Docker) and orchestration (Kubernetes) for ML, as well as ML workflow tools (MLflow, Airflow).
Leadership & Communication:
- Demonstrated ability to lead technical projects and mentor junior team members.
- Strong communication skills to convey complex ML results to non-technical stakeholders and to document methodologies clearly.
Responsibilities:
- Lead end-to-end development of NLP and LLM models - from data exploration and model prototyping to validation and production deployment. This includes designing novel model architectures or fine-tuning state-of-the-art transformer models (e.g., BERT, GPT) to solve project-specific language tasks.
- Analyze large text datasets (Ukrainian and multilingual corpora) to extract insights and build robust training datasets.
- Guide data collection and annotation efforts to ensure high-quality data for model training.
- Develop and implement NLP algorithms for a range of tasks such as text classification, named entity recognition, semantic search, and conversational AI.
- Stay up-to-date with the latest research to apply transformer-based models, embeddings, and other modern NLP techniques in the solutions.
- Establish evaluation metrics and validation frameworks for model performance, including accuracy, factuality, and bias.
- Design A/B tests and statistical experiments to compare model variants and validate improvements.
- Deploy and integrate NLP models into production systems in collaboration with engineers - ensuring models are scalable, efficient, and well-monitored in a real-world setting.
- Optimize model inference and troubleshoot issues such as model drift or data pipeline bottlenecks.
- Provide technical leadership and mentorship to the NLP/ML team.
- Review code and research, uphold best practices in ML (version control, reproducibility, documentation), and foster a culture of continuous learning and innovation.
- Collaborate cross-functionally with product managers, software engineers, and MLOps engineers to align NLP solutions with product goals and infrastructure capabilities.
- Communicate complex data science concepts to stakeholders and incorporate their feedback into model development.
The company offers:
- Competitive salary.
- Equity options in a fast-growing AI company.
- Remote-friendly work culture.
- Opportunity to shape a product at the intersection of AI and human productivity.
- Work with a passionate, senior team building cutting-edge tech for real-world business use.
Β· 33 views Β· 3 applications Β· 29d
Senior Data Engineer
Full Remote Β· Countries of Europe or Ukraine Β· Product Β· 5 years of experience Β· English - B2About us: Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the organization of the first Data Science UA conference, setting the foundation for our growth. Over the past 9 years, we have...About us:
Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the organization of the first Data Science UA conference, setting the foundation for our growth. Over the past 9 years, we have diligently fostered the largest Data Science Community in Eastern Europe, boasting a network of over 30,000 top AI engineers.
About the client:
We are working with a new-generation data service provider specializing in data consulting and data-driven digital marketing, dedicated to transforming data into business impact across the entire value chain of organizations. The company's data-driven services are built on the deep AI expertise it has acquired serving a 1000+ client base around the globe. The company has 1000 employees across 20 offices, all focused on accelerating digital transformation.
About the role:
We are seeking a Senior Data Engineer (Azure) to design and maintain data pipelines and systems for analytics and AI-driven applications. You will work on building reliable ETL/ELT workflows and ensuring data integrity across the organization.
Required skills:
- 6+ years of experience as a Data Engineer, preferably in Azure environments.
- Proficiency in Python, SQL, NoSQL, and Cypher for data manipulation and querying.
- Hands-on experience with Airflow and Azure Data Services for pipeline orchestration.
- Strong understanding of data modeling, ETL/ELT workflows, and data warehousing concepts.
- Experience in implementing DataOps practices for pipeline automation and monitoring.
- Knowledge of data governance, data security, and metadata management principles.
- Ability to work collaboratively with data science and analytics teams.
- Excellent problem-solving and communication skills.
Responsibilities:
- Transform data into formats suitable for analysis by developing and maintaining processes for data transformation, structuring, metadata management, and workload management.
- Design, implement, and maintain scalable data pipelines on Azure (a minimal Airflow sketch follows this list).
- Develop and optimize ETL/ELT processes for various data sources.
- Collaborate with data scientists and analysts to ensure data readiness.
- Monitor and improve data quality, performance, and governance.
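To ground the orchestration requirement, here is a minimal sketch of an Airflow DAG wiring an extract-transform-load sequence; the DAG name and daily schedule are illustrative assumptions, and the callables are stubs:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        pass  # stub: pull raw data from a source system

    def transform():
        pass  # stub: clean and reshape the extracted data

    def load():
        pass  # stub: write results to the warehouse

    with DAG(
        dag_id="example_elt",  # hypothetical pipeline name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",  # Airflow 2.4+; use schedule_interval on older versions
        catchup=False,
    ) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_transform = PythonOperator(task_id="transform", python_callable=transform)
        t_load = PythonOperator(task_id="load", python_callable=load)

        t_extract >> t_transform >> t_load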
Β· 62 views Β· 14 applications Β· 22d
Senior DevOps Engineer
Full Remote Β· Countries of Europe or Ukraine Β· Product Β· 7 years of experience Β· English - B2Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the organization of the first Data Science UA conference, setting the foundation for our growth. Over the past 9 years, we have diligently...Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the organization of the first Data Science UA conference, setting the foundation for our growth. Over the past 9 years, we have diligently fostered the largest Data Science Community in Eastern Europe, boasting a network of over 30,000 top AI engineers.
About the client:
Our client specializes in inventory optimization for industrial and manufacturing companies, helping them streamline inventory and optimize working capital. Their browser-based platform, hosted on Azure, is designed to manage and refine inventory processes, enabling companies to buy only what they need and proactively manage stock.
About the role:
We're looking for a seasoned DevOps expert to take full ownership of infrastructure and reliability - designing, building, and maintaining a scalable and secure environment for a mission-critical healthcare platform.
Requirements:
- 8+ years in production operations / DevOps roles;
- Strong experience with AWS, Kubernetes, Docker, Terraform;
- Advanced Linux system administration and network troubleshooting;
- Experience with CI/CD (preferably GitLab CI/CD);
- Monitoring stack experience (Prometheus, Grafana, ELK, or similar);
- Nice to have: PHP ecosystem familiarity + advanced MySQL administration experience.
Key responsibilities:
- Own and manage DevOps and infrastructure topics end-to-end;
- Monitor, maintain, and improve uptime, reliability, performance, and user experience;
- Design and track SLOs/SLIs suitable for healthcare-grade reliability (a small sketch follows this list);
- Manage CI/CD pipelines (GitLab), deployments, and rollbacks;
- Implement and enforce IaC best practices (Terraform, Ansible or similar);
- Build monitoring, alerting, and observability solutions with proactive incident detection;
- Troubleshoot production issues across AWS, Linux, MySQL, networking, and web servers (Apache/Nginx);
- Ensure platform security and GDPR compliance;
- Plan capacity and scalability as the platform grows;
- Lead post-incident reviews and drive continuous reliability improvements;
- Collaborate closely with the CTO and engineering team on architecture decisions;
- Participate in on-call rotations during business hours (no nights/weekends).
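As a small illustration of the SLO/SLI work, here is a sketch of an availability SLI and error-budget calculation; the target and the request counts are hypothetical:

    # Hypothetical monthly figures for an availability SLI.
    TOTAL_REQUESTS = 1_200_000
    FAILED_REQUESTS = 900
    SLO_TARGET = 0.999  # "three nines" availability

    sli = 1 - FAILED_REQUESTS / TOTAL_REQUESTS  # observed availability
    error_budget = 1 - SLO_TARGET               # allowed failure fraction
    budget_consumed = (1 - sli) / error_budget  # share of the budget spent

    print(f"SLI = {sli:.5f}")
    print(f"Error budget consumed: {budget_consumed:.0%}")  # 75% here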
The company offers:
- Opportunity to work on a cutting-edge localization platform with AI-driven innovation;
- A collaborative, dynamic team environment with a culture of learning and growth;
- Competitive salary and flexible work arrangements.
Β· 123 views Β· 32 applications Β· 6d
Machine Learning Engineer
Full Remote Β· Countries of Europe or Ukraine Β· Product Β· 3 years of experience Β· English - B1Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one...Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.
About the role and client:
We're looking for a Machine Learning Engineer to develop applied AI and agentic tools.
Our client is building an advanced AI assistant that helps teams work smarter. By combining semantic search, agentic AI, and state-of-the-art language models, the system enhances internal operations through intelligent, context-aware support and personalized interactions.
Requirements:
- Strong proficiency in Python;
- Experience with agentic AI frameworks such as LangChain or LangGraph;
- Solid understanding of machine learning and NLP fundamentals;
- Hands-on experience with ML frameworks (e.g., PyTorch, TensorFlow);
- Familiarity with prototyping tools such as Streamlit or Gradio;
- Knowledge of engineering best practices: Git, Docker, cloud basics, and task estimation;
- Practical knowledge of SQL;
- Upper-Intermediate level of English (both written and spoken).
Would be a plus:
- Experience with other agentic or LLM orchestration tools;
- Experience with MLOps or model deployment;
- Comfortable working in Linux terminal environments.
Responsibilities:
- Develop and integrate ML and NLP models to power intelligent assistant features;
- Build agentic workflows using LangChain, LangGraph, or similar frameworks;
- Prototype user interfaces and internal tools using Streamlit or Gradio (a minimal Gradio sketch follows this list);
- Collaborate with the engineering and product teams to plan and deliver ML-driven features;
- Work with Docker to manage development and runtime environments;
- Use Git for version control and write clean, maintainable code;
- Query structured data using SQL;
- Contribute to model deployment and operations in a cloud environment (primarily Azure).
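To illustrate the prototyping expectation, here is a minimal Gradio sketch wrapping a stub assistant function; the function body is a placeholder, not the client's actual pipeline:

    import gradio as gr

    def answer(question: str) -> str:
        # Stub: a real implementation would call the agentic pipeline here.
        return f"Echo: {question}"

    demo = gr.Interface(
        fn=answer,
        inputs="text",
        outputs="text",
        title="Assistant prototype",  # hypothetical title
    )

    if __name__ == "__main__":
        demo.launch()  # serves a local web UI for quick iteration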
Β· 22 views Β· 1 application Β· 22d
Senior Data Engineer
Full Remote Β· Countries of Europe or Ukraine Β· Product Β· 5 years of experience Β· English - B2About us: Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the organization of the first Data Science UA conference, setting the foundation for our growth. Over the past 9 years, we have...About us:
More
Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the organization of the first Data Science UA conference, setting the foundation for our growth. Over the past 9 years, we have diligently fostered the largest Data Science Community in Eastern Europe, boasting a network of over 30,000 AI top engineers.
About the client:
We are working with a new generation of data service provider, specializing in data consulting and data-driven digital marketing, dedicated to transforming data into business impact across the entire value chain of organizations. The companyβs data-driven services are built upon the deep AI expertise the companyβs acquired with a 1000+ client base around the globe. The company has 1000 employees across 20 offices who are focused on accelerating digital transformation.
About the role:
We are seeking a Senior Data Engineer (Azure) to design and maintain data pipelines and systems for analytics and AI-driven applications. You will work on building reliable ETL/ELT workflows and ensuring data integrity across the organization.
Required skills:
- 6+ years of experience as a Data Engineer, preferably in Azure environments.
- Proficiency in Python, SQL, NoSQL, and Cypher for data manipulation and querying.
- Hands-on experience with Airflow and Azure Data Services for pipeline orchestration.
- Strong understanding of data modeling, ETL/ELT workflows, and data warehousing concepts.
- Experience in implementing DataOps practices for pipeline automation and monitoring.
- Knowledge of data governance, data security, and metadata management principles.
- Ability to work collaboratively with data science and analytics teams.
- Excellent problem-solving and communication skills.
Responsibilities:
- Transform data into formats suitable for analysis by developing and maintaining processes for data transformation;
- Structuring, metadata management, and workload management.
- Design, implement, and maintain scalable data pipelines on Azure.
- Develop and optimize ETL/ELT processes for various data sources.
- Collaborate with data scientists and analysts to ensure data readiness.
- Monitor and improve data quality, performance, and governance. -
Β· 145 views Β· 33 applications Β· 19d
AI Project Manager
Full Remote Β· Countries of Europe or Ukraine Β· 1 year of experience Β· English - B2Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one...Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.
About the role:
We're looking for a Project Manager to lead AI projects (Computer Vision / ML) from kick-off to delivery - planning milestones, managing risks, and keeping clients aligned along the way. You'll work closely with engineers and stakeholders, run clear communication and reporting, and help improve our delivery processes.
Responsibilities:
- Manage AI/ML/CV projects end-to-end across different industries (planning, milestones, scope, change control, acceptance).
- Own delivery governance: project charter, status reporting, RAID (risks/assumptions/issues/dependencies), stakeholder communication, escalation.
- Planning & scoping: requirements clarification, estimation support with tech leads, WBS, delivery plan, release/milestone schedule.
- People management: team coordination, workload planning, 1:1s, feedback loops, removal of blockers.
- Risk & quality management: risk register, mitigation plans, delivery health checks, lessons learned.
- Support presales (optional share): discovery calls, high-level planning, SoW inputs, timelines, assumptions, and delivery approach.
Must-have requirements:
- 1+ year experience in project management / delivery coordination (tech/IT/consulting preferred).
- Understanding of CRISP-DM, SDLC and delivery methodologies: Agile, Predictive (Waterfall).
- End-to-end project ownership experience (from kickoff to delivery/closure).
- Experience with planning, scoping, estimation coordination, and risk management.
- People management basics: team coordination and conducting 1:1s.
- Strong communication and documentation skills (status updates, action items, meeting facilitation).
- English B2-C1: you can confidently run calls, write updates, and explain decisions.
- Certification: Google Project Management (or equivalent) required.
Nice-to-have (optional):
- CAPM, ICP-APM (or any PMI/ICAgile certification track).
- Presales experience (discovery, scoping, SoW contribution, client presentations).
- Budget & margin management (fixed-price economics, forecasting, tracking costs vs earned value).
- Experience in AI/ML/CV domains (even if non-technical, you understand lifecycle + typical risks).
We offer:
- Free English classes with a native speaker and external courses compensation.
- PE support by professional accountants.
- 40 days of PTO.
- Medical insurance.
- Team-building events, conferences, meetups, and other activities.
- There are many other benefits you'll find out about at the interview.
Β· 27 views Β· 8 applications Β· 18d
Middle Machine Learning Engineer
Full Remote Β· Ukraine Β· Product Β· 3 years of experience Β· English - B2Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one...Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.
About the Role:
We're looking for a mid-level AI Engineer to help test, deploy, and integrate cutting-edge generative AI models into production experiences centered around human avatars and 3D content. You'll work directly with the CEO to turn R&D prototypes into stable, scalable products.
Responsibilities:
- Experiment with and evaluate generative models for:
- Human avatar creation and animation;
- 3D reconstruction and modeling;
- Gaussian splatting-based pipelines;
- Generalized NeRF (Neural Radiance Fields) techniques.
- Turn research code and models into production-ready services (APIs, microservices, or batch pipelines).
- Build and maintain Python-based tooling for data preprocessing, training, evaluation, and inference.
- Design and optimize cloud-based deployment workflows (e.g., containers, GPUs, inference endpoints, job queues).
- Integrate models into user-facing applications in collaboration with product, design, and frontend teams.
- Monitor model performance, reliability, and cost; propose and implement improvements.
- Stay up-to-date on relevant research and help prioritize which techniques to test and adopt.
Required Qualifications:
- 3-5+ years of experience as an ML/AI Engineer or in a similar role.
- Strong Python skills and experience with one or more deep learning frameworks (PyTorch preferred).
- Hands-on experience with deploying ML models to cloud environments (AWS, GCP, Azure, or similar) including containers (Docker) and basic CI/CD workflows.
- Familiarity with 3D data formats and pipelines (meshes, point clouds, volumetric representations, etc.).
- Practical exposure to one or more of the following (professional or serious personal projects):
- NeRFs or NeRF-like methods;
- Gaussian splatting / 3D Gaussian fields;
- Avatar generation / face-body reconstruction / pose estimation;
- Comfort working in an iterative, fast-paced environment directly with leadership (reporting to CEO).
Nice-to-Haves:
- Experience with real-time rendering pipelines (e.g., Unity, Unreal, WebGL) or GPU programming (CUDA).
- Experience optimizing inference performance and cost (model distillation, quantization, batching); a minimal quantization sketch follows this list.
- Background in computer vision, graphics, or related fields (academic or industry).
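As a pointer to the kind of inference optimization meant above, here is a minimal PyTorch sketch of post-training dynamic quantization; the model is a stand-in, not the production network:

    import torch
    import torch.nn as nn

    # Stand-in model; in practice this would be the trained vision/avatar network.
    model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))
    model.eval()

    # Quantize Linear-layer weights to int8 for cheaper CPU inference.
    qmodel = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

    x = torch.randn(1, 512)
    with torch.no_grad():
        print(qmodel(x).shape)  # torch.Size([1, 10])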