Data Science UA

Joined in 2019

Data Science UA is a service company with strong data science and AI expertise. We have been developing AI solutions for all kinds and sizes of businesses. Our expertise includes AI software development, computer vision (image recognition), NLP, machine learning, big data, data analytics, data mining, and data visualization. 
Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe, boasting a network of over 30,000 top AI engineers. Our journey began in 2016 with uniting top AI talent and organizing the first Data Science tech conference in Kyiv. 
We offer diverse cooperation models, including outsourcing and outstaffing, where we assemble top-notch tech teams of industry experts to craft optimal solutions tailored to your business requirements. 
At Data Science UA, our core focus revolves around AI consulting. Whether you possess a clearly defined request or just a nascent idea, we are eager to collaborate and explore possibilities together. 
Additionally, our comprehensive recruiting service extends beyond AI and data science specialists, allowing us to find the best candidate at your request for every level and specialization, bolstering teams worldwide. Together, we can achieve extraordinary milestones for your enterprise. Reach out to us today, and let's take on this transformative journey hand in hand!

  • 34 views · 2 applications · 3d

    Computer Vision/Machine Learning Engineer

    Full Remote · Countries of Europe or Ukraine · 1 year of experience · English - B2

    Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the organization of the first Data Science UA conference, setting the foundation for our growth. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.

    About the role:
    We are looking for a Computer Vision / Machine Learning Engineer to develop offline CV models for industrial visual inspection.


    Your main task will be to design, train, and evaluate models on inspection data in order to:

     

    • Improve discrimination between good vs. defect samples
    • Provide insights into key defect categories (e.g., terminal electrode irregularities, surface chipping)
    • Significantly reduce false-positive rates, optimizing for either precision or recall
    • Prepare the solution for future deployment, scaling, and maintenance

    Key Responsibilities:
    Data Analysis & Preparation
    - Conduct dataset audits, including class balance checks and sample quality reviews
    - Identify low-frequency defect classes and outliers
    - Design and implement augmentation strategies for rare defects and edge cases
    Model Development & Evaluation
    - Train deep-learning models on inspection images for defect detection
    - Use modern computer vision / deep learning frameworks (e.g., PyTorch, TensorFlow)
    - Evaluate models using confusion matrices, ROC curves, precision–recall curves, F1 scores and other relevant metrics
    - Analyze false positives/false negatives and propose thresholds or model improvements
    Reporting & Communication
    - Prepare clear offline performance reports and model evaluation summaries
    - Explain classifier decisions, limitations, and reliability in simple, non-technical language when needed
    - Provide recommendations for scalable deployment in later phases (e.g., edge / on-prem inference, integration patterns)
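
    To illustrate the kind of threshold analysis these responsibilities describe, here is a minimal Python sketch. It assumes a binary good-vs-defect classifier that outputs a defect score per image; the helper names and the toy labels and scores are hypothetical and invented for illustration only, not taken from any real inspection data.

```python
from typing import List, Tuple


def confusion_counts(
    labels: List[int], scores: List[float], threshold: float
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), treating label 1 as 'defect'."""
    tp = fp = fn = tn = 0
    for y, s in zip(labels, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall and F1, returning 0.0 where undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical toy data: 1 = defect, 0 = good; scores are model outputs.
labels = [0, 0, 0, 0, 1, 1, 0, 1]
scores = [0.1, 0.2, 0.4, 0.6, 0.55, 0.7, 0.3, 0.9]

# Sweep a few candidate thresholds and compare the trade-off.
for threshold in (0.3, 0.5, 0.65):
    tp, fp, fn, tn = confusion_counts(labels, scores, threshold)
    p, r, f1 = precision_recall_f1(tp, fp, fn)
    print(f"thr={threshold:.2f} tp={tp} fp={fp} fn={fn} tn={tn} "
          f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

    In this toy data, raising the threshold removes the false positive but starts missing a real defect, which is the precision-versus-recall trade-off the role asks candidates to analyze and explain.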

    Candidate Requirements:
    Must-have:
    - 1-2 years of hands-on experience with computer vision and deep learning (classification, detection, or segmentation)
    - Strong proficiency in Python and at least one major DL framework (PyTorch or TensorFlow/Keras)
    - Solid understanding of:

    • Image preprocessing and augmentation techniques
    • Classification metrics: accuracy, precision, recall, F1, confusion matrix, ROC, PR curves
    • Handling imbalanced datasets and low-frequency classes

    - Experience training and evaluating offline models on real production or near-production datasets
    - Ability to structure and document experiments, compare baselines, and justify design decisions
    - Strong analytical and problem-solving skills; attention to detail in data quality and labelling
    - Good communication skills in English (written and spoken) to interact with internal and client stakeholders

    Nice-to-have:
    - Experience with industrial / manufacturing computer vision (AOI, quality inspection, defect detection, etc.)
    - Familiarity with ML Ops/deployment concepts (ONNX, TensorRT, Docker, REST APIs, edge devices)
    - Experience working with time-critical or high-throughput inspection systems
    - Background in electronics, semiconductors, or similar domains is an advantage
    - Experience preparing client-facing reports and presenting technical results to non-ML audiences

    We offer:
    - Free English classes with a native speaker and external courses compensation;
    - PE support by professional accountants;
    - 40 days of PTO;
    - Medical insurance;
    - Team-building events, conferences, meetups, and other activities;
    - Many other benefits that you’ll learn about at the interview.

  • 35 views · 12 applications · 4d

    Talent Sourcer

    Part-time · Full Remote · Countries of Europe or Ukraine · Product · 1 year of experience · English - B2

    Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the organization of the first Data Science UA conference, setting the foundation for our growth. Over the past 9 years, we have diligently fostered the largest Data Science Community in Eastern Europe.

    About the role:
    We’re looking for a Talent Researcher who treats sourcing as both an art form and a strategic challenge, someone ready to join the mission of finding and engaging top talent.

    Responsibilities:
    - Hunting top talent using every sourcing method you know, and some you’ll invent
    - Partnering with recruiters and hiring managers to design bold, data-driven search strategies
    - Building and maintaining pipelines of high-potential candidates across global markets
    - Tracking sourcing metrics and transforming data into actionable insights
    - Experimenting with tools and automations to push sourcing efficiency to the next level
    - Crafting outreach that candidates can’t ignore, turning cold messages into real conversations

    Requirements:
    - 1 year of sourcing or recruiting experience
    - Strong research skills and the ability to uncover hidden talent
    - Analytical mindset and love for metrics, dashboards, and reports
    - Copywriting skills that make candidates want to answer you
    - Curiosity, creativity, and a results-driven approach

    The company offers:
    - Opportunity to work on a cutting-edge localization platform with AI-driven innovation.
    - A collaborative, dynamic team environment with a culture of learning and growth.
    - Competitive salary and flexible work arrangements.

  • 43 views · 13 applications · 4d

    IT Recruiter

    Part-time · Full Remote · Countries of Europe or Ukraine · Product · 2 years of experience · English - B2

    Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the organization of the first Data Science UA conference, setting the foundation for our growth. Over the past 9 years, we have diligently fostered the largest Data Science Community in Eastern Europe.

    About the role:
    We are looking for an IT Recruiter to join our team. You will be part of the core recruitment team that drives the staffing process.

    Requirements:
    - 2+ years of work experience in IT recruitment;
    - Familiarity with modern recruitment tools, techniques, and best practices;
    - Good spoken and written English;
    - Experience in closing data science-related positions;
    - Strong teamwork skills (this is essential for us!);
    - Analytical mind, initiative, and independence;
    - Effective communication and negotiation skills;
    - Strong desire to work in an IT environment and grow as a Recruitment Specialist (the role has no HR functions, and none are planned).

    Responsibilities:
    - Drive the full-cycle recruitment process and successfully close our vacancies;
    - Use creative and innovative sourcing approaches to identify, source and engage candidates;
    - Pre-screen and interview candidates to define skills, knowledge, and experience according to position requirements;
    - Build relationships with customers as you'll be directly involved in communication with them;
    - Compose, monitor and maintain job postings on various job boards;
    - Maintain and update the candidate database;
    - Provide statistical reports on a weekly and monthly basis.

    We offer:
    - Free English classes with a native speaker and external courses compensation;
    - PE support by professional accountants;
    - Medical insurance;
    - Team-building events, conferences, meetups, and other activities;
    - Many other benefits that you’ll learn about at the interview.

  • 33 views · 7 applications · 7d

    Senior Data Engineer

    Full Remote · Worldwide · Product · 5 years of experience · English - B2

    About us:
    Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the organization of the first Data Science UA conference, setting the foundation for our growth. Over the past 9 years, we have diligently fostered the largest Data Science Community in Eastern Europe, boasting a network of over 30,000 top AI engineers.

    About the client:
    We are working with a new-generation data service provider specializing in data consulting and data-driven digital marketing, dedicated to transforming data into business impact across the entire value chain of organizations. The company’s data-driven services are built upon the deep AI expertise it has acquired while serving a client base of 1000+ around the globe. The company has 1000 employees across 20 offices who are focused on accelerating digital transformation.

    About the role:
    We are seeking a Senior Data Engineer (Azure) to design and maintain data pipelines and systems for analytics and AI-driven applications. You will work on building reliable ETL/ELT workflows and ensuring data integrity across the organization.

    Required skills:
    - 6+ years of experience as a Data Engineer, preferably in Azure environments.
    - Proficiency in Python, SQL, NoSQL, and Cypher for data manipulation and querying.
    - Hands-on experience with Airflow and Azure Data Services for pipeline orchestration.
    - Strong understanding of data modeling, ETL/ELT workflows, and data warehousing concepts.
    - Experience in implementing DataOps practices for pipeline automation and monitoring.
    - Knowledge of data governance, data security, and metadata management principles.
    - Ability to work collaboratively with data science and analytics teams.
    - Excellent problem-solving and communication skills.

    Responsibilities:
    - Transform data into formats suitable for analysis by developing and maintaining processes for data transformation, structuring, metadata management, and workload management.
    - Design, implement, and maintain scalable data pipelines on Azure.
    - Develop and optimize ETL/ELT processes for various data sources.
    - Collaborate with data scientists and analysts to ensure data readiness.
    - Monitor and improve data quality, performance, and governance.

  • 48 views · 9 applications · 12d

    Senior Data Engineer

    Full Remote · Countries of Europe or Ukraine · Product · 5 years of experience · English - None

    About us:
    Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the organization of the first Data Science UA conference, setting the foundation for our growth. Over the past 9 years, we have diligently fostered the largest Data Science Community in Eastern Europe, boasting a network of over 30,000 top AI engineers.

    About the client:
    We are working with a new-generation data service provider specializing in data consulting and data-driven digital marketing, dedicated to transforming data into business impact across the entire value chain of organizations. The company’s data-driven services are built upon the deep AI expertise it has acquired while serving a client base of 1000+ around the globe. The company has 1000 employees across 20 offices who are focused on accelerating digital transformation.

    About the role:
    We are seeking a Senior Data Engineer (Azure) to design and maintain data pipelines and systems for analytics and AI-driven applications. You will work on building reliable ETL/ELT workflows and ensuring data integrity across the organization.

    Required skills:
    - 6+ years of experience as a Data Engineer, preferably in Azure environments.
    - Proficiency in Python, SQL, NoSQL, and Cypher for data manipulation and querying.
    - Hands-on experience with Airflow and Azure Data Services for pipeline orchestration.
    - Strong understanding of data modeling, ETL/ELT workflows, and data warehousing concepts.
    - Experience in implementing DataOps practices for pipeline automation and monitoring.
    - Knowledge of data governance, data security, and metadata management principles.
    - Ability to work collaboratively with data science and analytics teams.
    - Excellent problem-solving and communication skills.

    Responsibilities:
    - Transform data into formats suitable for analysis by developing and maintaining processes for data transformation, structuring, metadata management, and workload management.
    - Design, implement, and maintain scalable data pipelines on Azure.
    - Develop and optimize ETL/ELT processes for various data sources.
    - Collaborate with data scientists and analysts to ensure data readiness.
    - Monitor and improve data quality, performance, and governance.

  • 66 views · 3 applications · 12d

    Data Engineer

    Full Remote · Ukraine · Product · 3 years of experience · English - None

    About us:
    Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.

    About the client:
    Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.

    About the role:
    We are looking for a Data Engineer (NLP-Focused) to build and optimize the data pipelines that fuel the Ukrainian LLM and NLP initiatives. In this role, you will design robust ETL/ELT processes to collect, process, and manage large-scale text and metadata, enabling the Data Scientists and ML Engineers to develop cutting-edge language models.

    You will work at the intersection of data engineering and machine learning, ensuring that the datasets and infrastructure are reliable, scalable, and tailored to the needs of training and evaluating NLP models in a Ukrainian language context.

    Requirements:
    - Education & Experience: 3+ years of experience as a Data Engineer or in a similar role, building data-intensive pipelines or platforms. A Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field is preferred. Experience supporting machine learning or analytics teams with data pipelines is a strong advantage.
    - NLP Domain Experience: Prior experience handling linguistic data or supporting NLP projects (e.g., text normalization, handling different encodings, tokenization strategies). Knowledge of Ukrainian text sources and datasets, or experience with multilingual data processing, can be an advantage given the project’s focus. Understanding of FineWeb2 or a similar processing pipeline approach.
    - Data Pipeline Expertise: Hands-on experience designing ETL/ELT processes, including extracting data from various sources, using transformation tools, and loading into storage systems. Proficiency with orchestration frameworks like Apache Airflow for scheduling workflows. Familiarity with building pipelines for unstructured data (text, logs) as well as structured data.
    - Programming & Scripting: Strong programming skills in Python for data manipulation and pipeline development. Experience with NLP packages (spaCy, NLTK, langdetect, fasttext, etc.). Experience with SQL for querying and transforming data in relational databases. Knowledge of Bash or other scripting for automation tasks. Writing clean, maintainable code and using version control (Git) for collaborative development.
    - Databases & Storage: Experience working with relational databases (e.g., PostgreSQL, MySQL), including schema design and query optimization. Familiarity with NoSQL or document stores (e.g., MongoDB) and big data technologies (HDFS, Hive, Spark) for large-scale data is a plus. Understanding of or experience with vector databases (e.g., Pinecone, FAISS) is beneficial, as the NLP applications may require embedding storage and fast similarity search.
    - Cloud Infrastructure: Practical experience with cloud platforms (AWS, GCP, or Azure) for data storage and processing. Ability to set up services such as S3/Cloud Storage, data warehouses (e.g., BigQuery, Redshift), and use cloud-based ETL tools or serverless functions. Understanding of infrastructure-as-code (Terraform, CloudFormation) to manage resources is a plus.
    - Data Quality & Monitoring: Knowledge of data quality assurance practices. Experience implementing monitoring for data pipelines (logs, alerts) and using CI/CD tools to automate pipeline deployment and testing. An analytical mindset to troubleshoot data discrepancies and optimize performance bottlenecks.
    - Collaboration & Domain Knowledge: Ability to work closely with data scientists and understand the requirements of machine learning projects. Basic understanding of NLP concepts and the data needs for training language models, so you can anticipate and accommodate the specific forms of text data and preprocessing they require. Good communication skills to document data workflows and to coordinate with team members across different functions.

    Nice to have:
    - Advanced Tools & Frameworks: Experience with distributed data processing frameworks (such as Apache Spark or Databricks) for large-scale data transformation, and with message streaming systems (Kafka, Pub/Sub) for real-time data pipelines. Familiarity with data serialization formats (JSON, Parquet) and handling of large text corpora.
    - Web Scraping Expertise: Deep experience in web scraping, using tools like Scrapy, Selenium, or Beautiful Soup, and handling anti-scraping challenges (rotating proxies, rate limiting). Ability to parse and clean raw text data from HTML, PDFs, or scanned documents.
    - CI/CD & DevOps: Knowledge of setting up CI/CD pipelines for data engineering (using GitHub Actions, Jenkins, or GitLab CI) to test and deploy changes to data workflows. Experience with containerization (Docker) to package data jobs and with Kubernetes for scaling them is a plus.
    - Big Data & Analytics: Experience with analytics platforms and BI tools (e.g., Tableau, Looker) used to examine the data prepared by the pipelines. Understanding of how to create and manage data warehouses or data marts for analytical consumption.
    - Problem-Solving: Demonstrated ability to work independently in solving complex data engineering problems, optimizing existing pipelines, and implementing new ones under time constraints. A proactive attitude to explore new data tools or techniques that could improve the workflows.

    Responsibilities:
    - Design, develop, and maintain ETL/ELT pipelines for gathering, transforming, and storing large volumes of text data and related information.
    - Ensure pipelines are efficient and can handle data from diverse sources (e.g., web crawls, public datasets, internal databases) while maintaining data integrity.
    - Implement web scraping and data collection services to automate the ingestion of text and linguistic data from the web and other external sources. This includes writing crawlers or using APIs to continuously collect data relevant to the language modeling efforts.
    - Implement NLP/LLM-specific data processing: text cleaning and normalization, including toxic-content filtering, de-duplication, de-noising, and detection and removal of personal data.
    - Build task-specific SFT/RLHF datasets from existing data, including data augmentation and labeling with an LLM as the teacher.
    - Set up and manage cloud-based data infrastructure for the project. Configure and maintain data storage solutions (data lakes, warehouses) and processing frameworks (e.g., distributed compute on AWS/GCP/Azure) that can scale with growing data needs.
    - Automate data processing workflows and ensure their scalability and reliability.
    - Use workflow orchestration tools like Apache Airflow to schedule and monitor data pipelines, enabling continuous and repeatable model training and evaluation cycles.
    - Maintain and optimize analytical databases and data access layers for both ad-hoc analysis and model training needs.
    - Work with relational databases (e.g., PostgreSQL) and other storage systems to ensure fast query performance and well-structured data schemas.
    - Collaborate with Data Scientists and NLP Engineers to build data features and datasets for machine learning models.
    - Provide data subsets, aggregations, or preprocessing as needed for tasks such as language model training, embedding generation, and evaluation.
    - Implement data quality checks, monitoring, and alerting. Develop scripts or use tools to validate data completeness and correctness (e.g., ensuring no critical data gaps or anomalies in the text corpora), and promptly address any pipeline failures or data issues. Implement data version control.
    - Manage data security, access, and compliance.
    - Control permissions to datasets and ensure adherence to data privacy policies and security standards, especially when dealing with user data or proprietary text sources.

    The company offers:
    - Competitive salary.
    - Equity options in a fast-growing AI company.
    - Remote-friendly work culture.
    - Opportunity to shape a product at the intersection of AI and human productivity.
    - Work with a passionate, senior team building cutting-edge tech for real-world business use.

  • 74 views · 2 applications · 14d

    Computer Vision/Machine Learning Engineer

    Full Remote · Countries of Europe or Ukraine · 1 year of experience · English - B2

    Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with the organization of the first Data Science UA conference, setting the foundation for our growth. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.

    About the role:
    We are looking for a Computer Vision / Machine Learning Engineer to develop offline CV models for industrial visual inspection.


    Your main task will be to design, train, and evaluate models on inspection data in order to:

     

    • Improve discrimination between good vs. defect samples
    • Provide insights into key defect categories (e.g., terminal electrode irregularities, surface chipping)
    • Significantly reduce false-positive rates, optimizing for either precision or recall
    • Prepare the solution for future deployment, scaling, and maintenance

    Key Responsibilities:
    Data Analysis & Preparation
    - Conduct dataset audits, including class balance checks and sample quality reviews
    - Identify low-frequency defect classes and outliers
    - Design and implement augmentation strategies for rare defects and edge cases
    Model Development & Evaluation
    - Train deep-learning models on inspection images for defect detection
    - Use modern computer vision / deep learning frameworks (e.g., PyTorch, TensorFlow)
    - Evaluate models using confusion matrices, ROC curves, precision-recall curves, F1 scores, and other relevant metrics
    - Analyze false positives/false negatives and propose thresholds or model improvements
    Reporting & Communication
    - Prepare clear offline performance reports and model evaluation summaries
    - Explain classifier decisions, limitations, and reliability in simple, non-technical language when needed
    - Provide recommendations for scalable deployment in later phases (e.g., edge / on-prem inference, integration patterns)

    Candidate Requirements:
    Must-have:
    - 1-2 years of hands-on experience with computer vision and deep learning (classification, detection, or segmentation)
    - Strong proficiency in Python and at least one major DL framework (PyTorch or TensorFlow/Keras)
    - Solid understanding of:

    • Image preprocessing and augmentation techniques
    • Classification metrics: accuracy, precision, recall, F1, confusion matrix, ROC, PR curves
    • Handling imbalanced datasets and low-frequency classes

    - Experience training and evaluating offline models on real production or near-production datasets
    - Ability to structure and document experiments, compare baselines, and justify design decisions
    - Strong analytical and problem-solving skills; attention to detail in data quality and labelling
    - Good communication skills in English (written and spoken) to interact with internal and client stakeholders

    Nice-to-have:
    - Experience with industrial / manufacturing computer vision (AOI, quality inspection, defect detection, etc.)
    - Familiarity with ML Ops/deployment concepts (ONNX, TensorRT, Docker, REST APIs, edge devices)
    - Experience working with time-critical or high-throughput inspection systems
    - Background in electronics, semiconductors, or similar domains is an advantage
    - Experience preparing client-facing reports and presenting technical results to non-ML audiences

    We offer:
    - Free English classes with a native speaker and compensation for external courses;
    - PE support by professional accountants;
    - 40 days of PTO;
    - Medical insurance;
    - Team-building events, conferences, meetups, and other activities;
    - Many other benefits that you'll learn about at the interview.

  • 17 views · 0 applications · 19d

    Senior Data Scientist

    Ukraine · Product · 5 years of experience · English - B2

    Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.

    About the client:
    The company is a trailblazer in the world of data-driven advertising, known for its innovative approach to optimizing ad placements and campaign effectiveness through advanced analytics and machine learning techniques. Our mission is to revolutionize the advertising sector by enabling brands to reach their audiences more effectively.

    About the role:
    We are seeking an experienced and motivated Senior Data Scientist to join our dynamic team. The ideal candidate will have deep expertise in supervised learning, reinforcement learning, and optimization techniques. You will play a pivotal role in developing and implementing advanced machine learning models, driving actionable insights, and optimizing our advertising solutions.
    This position is based in Ukraine. The team primarily works remotely, with occasional in-person meetings in the Kyiv or Lviv office.

    Responsibilities:
    - Develop and implement advanced supervised and reinforcement learning models to improve ad targeting and campaign performance.
    - Collaborate with cross-functional teams to identify opportunities for leveraging machine learning and optimization techniques to solve business problems.
    - Conduct extensive data analysis and feature engineering to prepare datasets for machine learning tasks.
    - Apply optimization algorithms to enhance the effectiveness and efficiency of advertising campaigns.
    - Evaluate and refine existing models to enhance their accuracy, efficiency, and scalability.
    - Utilize statistical techniques and machine learning algorithms to analyze large and complex datasets.
    - Communicate findings and recommendations effectively to both technical and non-technical stakeholders.
    - Stay updated with the latest advancements in machine learning, reinforcement learning, and optimization techniques.
    - Work with engineering teams to integrate models into production systems.
    - Monitor, troubleshoot, and improve the performance of deployed models.
    - Mentor junior data scientists and contribute to the continuous improvement of the data science practice within the company.

    Requirements:
    - 5+ years of experience in data science or machine learning roles, with a strong focus on supervised learning, reinforcement learning, and optimization techniques.
    - Technical Skills:
    - Proficiency in Python.
    - Strong understanding of working with relational databases and SQL.
    - Experience with machine learning libraries such as scikit-learn, TensorFlow, PyTorch, or similar.
    - Deep understanding of statistical modeling and supervised learning algorithms (e.g., linear regression, logistic regression, decision trees, random forests, SVMs, gradient boosting, neural networks).
    - Hands-on experience with reinforcement learning algorithms and frameworks like OpenAI Gym.
    - Practical experience with optimization algorithms (linear, non-linear, combinatorial, etc.).
    - Hands-on experience with data manipulation tools and libraries (e.g., pandas, NumPy).
    - Familiarity with cloud services, specifically AWS, is a plus.
    - Practical experience building and managing cloud-based ML pipelines using AWS services (e.g. SageMaker, Bedrock) is a plus.
    - Education:
    - Bachelor's or Master's degree in Computer Science, Statistics, Mathematics, Engineering, or a related field. A PhD is a plus.

    Other Skills:
    - Strong analytical and problem-solving skills.
    - Excellent communication skills, with the ability to clearly articulate complex concepts to diverse audiences.
    - Ability to work in a fast-paced environment and manage multiple priorities.
    - Strong organizational skills and attention to detail.
    - Ability to mentor and guide junior data scientists.
    - Ability to communicate with U.S.-based teams.

    The company offers:
    - An opportunity to be at the forefront of advertising technology, impacting major marketing decisions.
    - A collaborative, innovative environment where your contributions make a difference.
    - The chance to work with a passionate team of data scientists, engineers, product managers, and designers.
    - A culture that values learning, growth, and the pursuit of excellence.

  • 26 views · 2 applications · 26d

    Senior/Middle Data Scientist

    Full Remote · Ukraine · Product · 3 years of experience · English - B1

    Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.

    About the client:
    Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.

    About the role:
    We are looking for an experienced Senior/Middle Data Scientist with a passion for Large Language Models (LLMs) and cutting-edge AI research. In this role, you will design and implement a state-of-the-art evaluation and benchmarking framework to measure and guide model quality, and personally train LLMs with a strong focus on Reinforcement Learning from Human Feedback (RLHF). You will work alongside top AI researchers and engineers, ensuring the models are not only powerful but also aligned with user needs, cultural context, and ethical standards.

    Requirements:
    Education & Experience:
    - 3+ years of experience in Data Science or Machine Learning, preferably with a focus on NLP.
    - Proven experience in machine learning model evaluation and/or NLP benchmarking.
    - Advanced degree (Master's or PhD) in Computer Science, Computational Linguistics, Machine Learning, or a related field is highly preferred.
    NLP Expertise:
    - Good knowledge of natural language processing techniques and algorithms.
    - Hands-on experience with modern NLP approaches, including embedding models, semantic search, text classification, sequence tagging (NER), transformers/LLMs, and RAG.
    - Familiarity with LLM training and fine-tuning techniques.
    ML & Programming Skills:
    - Proficiency in Python and common data science and NLP libraries (pandas, NumPy, scikit-learn, spaCy, NLTK, langdetect, fasttext).
    - Strong experience with deep learning frameworks such as PyTorch or TensorFlow for building NLP models.
    - Solid understanding of RLHF concepts and related techniques (preference modeling, reward modeling, reinforcement learning).
    - Ability to write efficient, clean code and debug complex model issues.
    Data & Analytics:
    - Solid understanding of data analytics and statistics.
    - Experience creating and managing test datasets, including annotation and labeling processes.
    - Experience in experimental design, A/B testing, and statistical hypothesis testing to evaluate model performance.
    - Comfortable working with large datasets, writing complex SQL queries, and using data visualization to inform decisions.
    Deployment & Tools:
    - Experience deploying machine learning models in production (e.g., using REST APIs or batch pipelines) and integrating with real-world applications.
    - Familiarity with MLOps concepts and tools (version control for models/data, CI/CD for ML).
    - Experience with cloud platforms (AWS, GCP, or Azure) and big data technologies (Spark, Hadoop, Ray, Dask) for scaling data processing or model training.
    Communication:
    - Experience working in a collaborative, cross-functional environment.
    - Strong communication skills to convey complex ML results to non-technical stakeholders and to document methodologies clearly.

    Nice to have:
    Advanced NLP/ML Techniques:
    - Prior work on LLM safety, fairness, and bias mitigation.
    - Familiarity with evaluation metrics for language models (perplexity, BLEU, ROUGE, etc.) and with techniques for model optimization (quantization, knowledge distillation) to improve efficiency.
    - Knowledge of data annotation workflows and human feedback collection methods.
    Research & Community:
    - Publications in NLP/ML conferences or contributions to open-source NLP projects.
    - Active participation in the AI community or demonstrated continuous learning (e.g., Kaggle competitions, research collaborations) indicating a passion for staying at the forefront of the field.
    Domain & Language Knowledge:
    - Familiarity with the Ukrainian language and context.
    - Understanding of cultural and linguistic nuances that could inform model training and evaluation in a Ukrainian context.
    - Knowledge of Ukrainian benchmarks, or familiarity with other evaluation datasets and leaderboards for large models, is an advantage given the project's focus.
    MLOps & Infrastructure:
    - Hands-on experience with containerization (Docker) and orchestration (Kubernetes) for ML, as well as ML workflow tools (MLflow, Airflow).
    - Experience in working alongside MLOps engineers to streamline the deployment and monitoring of NLP models.
    Problem-Solving:
    - Innovative mindset with the ability to approach open-ended AI problems creatively.
    - Comfort in a fast-paced R&D environment where you can adapt to new challenges, propose solutions, and drive them to implementation.

    Responsibilities:
    - Analyze benchmarking datasets, define gaps, and design, implement, and maintain a comprehensive benchmarking framework for the Ukrainian language.
    - Research and integrate state-of-the-art evaluation metrics for factual accuracy, reasoning, language fluency, safety, and alignment.
    - Design and maintain testing frameworks to detect hallucinations, biases, and other failure modes in LLM outputs.
    - Develop pipelines for synthetic data generation and adversarial example creation to challenge the model's robustness.
    - Collaborate with human annotators, linguists, and domain experts to define evaluation tasks and collect high-quality feedback.
    - Develop tools and processes for continuous evaluation during model pre-training, fine-tuning, and deployment.
    - Research and develop best practices and novel techniques in LLM training pipelines.
    - Analyze benchmarking results to identify model strengths, weaknesses, and improvement opportunities.
    - Work closely with other data scientists to align training and evaluation pipelines.
    - Document methodologies and share insights with internal teams.

    The company offers:
    - Competitive salary.
    - Equity options in a fast-growing AI company.
    - Remote-friendly work culture.
    - Opportunity to shape a product at the intersection of AI and human productivity.
    - Work with a passionate, senior team building cutting-edge tech for real-world business use.

  • 33 views · 7 applications · 27d

    SMM Manager

    Full Remote · Ukraine · Product · 1 year of experience · English - B2

    Data Science UA is a service company with deep expertise in AI and Data Science. Our story started in 2016 with the first Data Science UA Conference in Kyiv, and since then, we've built one of the largest AI communities in Europe.

    Today, we help businesses worldwide implement AI solutions, grow through data, and connect with top tech talent.

     

    Who we're looking for:
    A content-driven, curious, and creative SMM Manager who gets that social media today is not just about "posting."
    It's about stories that hook, visuals that pop, and insights that make people stop scrolling.
    Your main playgrounds will be LinkedIn, Instagram, Telegram, and Facebook. You'll create content for both business leaders and our AI community, mixing expert storytelling with a human touch.

     

    What you'll do:
    - Come up with ideas that people actually want to read and watch.
    - Run Data Science UA's social media pages (LinkedIn, Instagram, Telegram, Facebook).
    - Write posts in English that sound smart, but not boring.
    - Create visuals and short videos (hello, Canva & CapCut).
    - Keep track of analytics: figure out what works and double down on it.
    - Brainstorm with our Brand Strategy Lead and PR team.
    - Keep an eye on trends and catch them before they go mainstream.

    We'll be a match if you:
    - Have at least 1 year of experience in SMM.
    - Know your way around LinkedIn (personal or company pages).
    - Write English confidently (Upper-Intermediate+).
    - Can create not just posts, but real value.
    - Work with basic design and video tools (Canva, Figma, CapCut, Filmora, or whatever you like).
    - Understand that analytics is not scary; it's your best friend.
    - Have ideas and aren't afraid to pitch them.
    - Are curious to try your hand at other areas of marketing, from PR and events to community and creative campaigns.

    Nice-to-have:
    - Experience with IT or tech brands.
    - Understanding of Meta Ads & targeting.
    - Interest in AI or data-driven marketing.
    - Experience in prompting ChatGPT, Midjourney, or other generative AI tools.
    - Skills in motion design or meme sorcery.

    What you'll get:
    - Real impact: your content will directly attract new clients.
    - Mentorship and growth: you'll work side by side with our Brand Strategy Lead and former SMM Manager.
    - Freedom to experiment and test new formats.
    - Flexible schedule and remote-first culture.
    - Warm team vibes (and an open invite to our Kyiv office for coffee ☕)

    If you can turn complex things into content that makes people say "Damn, that's smart," we'd love to see your portfolio 👀

    Apply now and let's make some noise in the AI world together.

  • 41 views · 3 applications · 28d

    Data Engineer

    Full Remote · Ukraine · Product · 3 years of experience · English - None

    About us:
    Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.

    About the client:
    Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.

    About the role:
    We are looking for a Data Engineer (NLP-Focused) to build and optimize the data pipelines that fuel the Ukrainian LLM and NLP initiatives. In this role, you will design robust ETL/ELT processes to collect, process, and manage large-scale text and metadata, enabling the Data Scientists and ML Engineers to develop cutting-edge language models.

    You will work at the intersection of data engineering and machine learning, ensuring that the datasets and infrastructure are reliable, scalable, and tailored to the needs of training and evaluating NLP models in a Ukrainian language context.

    Requirements:
    - Education & Experience: 3+ years of experience as a Data Engineer or in a similar role, building data-intensive pipelines or platforms. A Bachelor's or Master's degree in Computer Science, Engineering, or a related field is preferred. Experience supporting machine learning or analytics teams with data pipelines is a strong advantage.
    - NLP Domain Experience: Prior experience handling linguistic data or supporting NLP projects (e.g., text normalization, handling different encodings, tokenization strategies). Knowledge of Ukrainian text sources and datasets, or experience with multilingual data processing, is an advantage given the project's focus. Understanding of FineWeb2 or a similar processing pipeline approach.
    - Data Pipeline Expertise: Hands-on experience designing ETL/ELT processes, including extracting data from various sources, using transformation tools, and loading into storage systems. Proficiency with orchestration frameworks like Apache Airflow for scheduling workflows. Familiarity with building pipelines for unstructured data (text, logs) as well as structured data.
    - Programming & Scripting: Strong programming skills in Python for data manipulation and pipeline development. Experience with NLP packages (spaCy, NLTK, langdetect, fasttext, etc.). Experience with SQL for querying and transforming data in relational databases. Knowledge of Bash or other scripting for automation tasks. Writing clean, maintainable code and using version control (Git) for collaborative development.
    - Databases & Storage: Experience working with relational databases (e.g., PostgreSQL, MySQL), including schema design and query optimization. Familiarity with NoSQL or document stores (e.g., MongoDB) and big data technologies (HDFS, Hive, Spark) for large-scale data is a plus. Understanding of or experience with vector databases (e.g., Pinecone, FAISS) is beneficial, as the NLP applications may require embedding storage and fast similarity search.
    - Cloud Infrastructure: Practical experience with cloud platforms (AWS, GCP, or Azure) for data storage and processing. Ability to set up services such as S3/Cloud Storage, data warehouses (e.g., BigQuery, Redshift), and use cloud-based ETL tools or serverless functions. Understanding of infrastructure-as-code (Terraform, CloudFormation) to manage resources is a plus.
    - Data Quality & Monitoring: Knowledge of data quality assurance practices. Experience implementing monitoring for data pipelines (logs, alerts) and using CI/CD tools to automate pipeline deployment and testing. An analytical mindset to troubleshoot data discrepancies and optimize performance bottlenecks.
    - Collaboration & Domain Knowledge: Ability to work closely with data scientists and understand the requirements of machine learning projects. Basic understanding of NLP concepts and the data needs for training language models, so you can anticipate and accommodate the specific forms of text data and preprocessing they require. Good communication skills to document data workflows and to coordinate with team members across different functions.

    Nice to have:
    - Advanced Tools & Frameworks: Experience with distributed data processing frameworks (such as Apache Spark or Databricks) for large-scale data transformation, and with message streaming systems (Kafka, Pub/Sub) for real-time data pipelines. Familiarity with data serialization formats (JSON, Parquet) and handling of large text corpora.
    - Web Scraping Expertise: Deep experience in web scraping, using tools like Scrapy, Selenium, or Beautiful Soup, and handling anti-scraping challenges (rotating proxies, rate limiting). Ability to parse and clean raw text data from HTML, PDFs, or scanned documents.
    - CI/CD & DevOps: Knowledge of setting up CI/CD pipelines for data engineering (using GitHub Actions, Jenkins, or GitLab CI) to test and deploy changes to data workflows. Experience with containerization (Docker) to package data jobs and with Kubernetes for scaling them is a plus.
    - Big Data & Analytics: Experience with analytics platforms and BI tools (e.g., Tableau, Looker) used to examine the data prepared by the pipelines. Understanding of how to create and manage data warehouses or data marts for analytical consumption.
    - Problem-Solving: Demonstrated ability to work independently in solving complex data engineering problems, optimizing existing pipelines, and implementing new ones under time constraints. A proactive attitude to explore new data tools or techniques that could improve the workflows.

    Responsibilities:
    - Design, develop, and maintain ETL/ELT pipelines for gathering, transforming, and storing large volumes of text data and related information.
    - Ensure pipelines are efficient and can handle data from diverse sources (e.g., web crawls, public datasets, internal databases) while maintaining data integrity.
    - Implement web scraping and data collection services to automate the ingestion of text and linguistic data from the web and other external sources. This includes writing crawlers or using APIs to continuously collect data relevant to the language modeling efforts.
    - Implement NLP/LLM-specific data processing: text cleaning and normalization, including toxic-content filtering, de-duplication, de-noising, and detection and removal of personal data.
    - Build task-specific SFT/RLHF datasets from existing data, including data augmentation and labeling with an LLM as the teacher.
    - Set up and manage cloud-based data infrastructure for the project. Configure and maintain data storage solutions (data lakes, warehouses) and processing frameworks (e.g., distributed compute on AWS/GCP/Azure) that can scale with growing data needs.
    - Automate data processing workflows and ensure their scalability and reliability.
    - Use workflow orchestration tools like Apache Airflow to schedule and monitor data pipelines, enabling continuous and repeatable model training and evaluation cycles.
    - Maintain and optimize analytical databases and data access layers for both ad-hoc analysis and model training needs.
    - Work with relational databases (e.g., PostgreSQL) and other storage systems to ensure fast query performance and well-structured data schemas.
    - Collaborate with Data Scientists and NLP Engineers to build data features and datasets for machine learning models.
    - Provide data subsets, aggregations, or preprocessing as needed for tasks such as language model training, embedding generation, and evaluation.
    - Implement data quality checks, monitoring, and alerting. Develop scripts or use tools to validate data completeness and correctness (e.g., ensuring no critical data gaps or anomalies in the text corpora), and promptly address any pipeline failures or data issues. Implement data version control.
    - Manage data security, access, and compliance.
    - Control permissions to datasets and ensure adherence to data privacy policies and security standards, especially when dealing with user data or proprietary text sources.
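
    As an illustration of the text-cleaning responsibility above (normalization, de-duplication, removal of personal data), here is a minimal Python sketch. It is not the team's actual pipeline: the function names, the placeholder tokens, and the two regular expressions are assumptions made for this example, and real personal-data detection needs far broader coverage than an email and a phone pattern.

```python
import hashlib
import re
import unicodedata

# Hypothetical patterns for illustration only; they catch simple emails
# and phone-like digit runs, not personal data in general.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def normalize(text: str) -> str:
    """Apply Unicode NFKC normalization and collapse runs of whitespace."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip()


def mask_pii(text: str) -> str:
    """Replace matched emails and phone-like numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def clean_corpus(docs):
    """Normalize, mask, and drop empty or exact-duplicate documents.

    Duplicates are detected by hashing the lowercased cleaned text, so this
    only removes exact (case-insensitive) repeats, not near-duplicates.
    """
    seen = set()
    cleaned = []
    for doc in docs:
        text = mask_pii(normalize(doc))
        if not text:
            continue
        digest = hashlib.sha256(text.lower().encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        cleaned.append(text)
    return cleaned
```

    Near-duplicate detection (for example MinHash) and toxic-content filtering would be separate steps layered on top of a pass like this one.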

    The company offers:
    - Competitive salary.
    - Equity options in a fast-growing AI company.
    - Remote-friendly work culture.
    - Opportunity to shape a product at the intersection of AI and human productivity.
    - Work with a passionate, senior team building cutting-edge tech for real-world business use.

  • · 3 views · 1 application · 3d

    Sales Manager

    Office Work · Ukraine (Kyiv) · Product · 1 year of experience · English - B2

    Data Science UA is a service company with extensive experience and expertise in Data Science and AI. The company's history began in 2016 with the organization of the first Data Science UA conference, which laid the foundation for our growth. Over the past 9 years, we have built the largest Data Science community in Eastern Europe.

    About the client:
    Our client started out in 1996 manufacturing metal and glass architectural structures. Today the company is a high-tech production complex that designs and manufactures turnstiles, road blockers, bollards, fire-rated doors, gates, and hatches.
    The company's international offices are located in Kyiv, London, and Dubai. It also has 23 official brand representatives worldwide and 30,000+ completed projects in more than 100 countries.

    About the role:
    We are looking for a Sales Manager to join the team at the office in Kyiv (Bortnychi).

    Requirements:
    - Good command of English, both spoken and written.
    - Experience in sales and customer service.
    - Excellent relationship-building and communication skills.
    - Ability to initiate and develop business relationships.
    - Ability to work in a team and collaborate with other departments.
    - High level of confidence in presentations and negotiations.
    - Good understanding of the technical aspects of the products.

    ΠžΡΠ½ΠΎΠ²Π½Ρ– обов’язки:
    - Відносини Π· ΠΊΠ»Ρ–Ρ”Π½Ρ‚Π°ΠΌΠΈ: Π’ΡΡ‚Π°Π½ΠΎΠ²Π»ΡŽΠ²Π°Ρ‚ΠΈ, Ρ€ΠΎΠ·Π²ΠΈΠ²Π°Ρ‚ΠΈ Ρ‚Π° ΠΏΡ–Π΄Ρ‚Ρ€ΠΈΠΌΡƒΠ²Π°Ρ‚ΠΈ Π΄Ρ–Π»ΠΎΠ²Ρ– Π·Π²'язки Π· наявними Ρ‚Π° ΠΏΠΎΡ‚Π΅Π½Ρ†Ρ–ΠΉΠ½ΠΈΠΌΠΈ ΠΊΠ»Ρ–Ρ”Π½Ρ‚Π°ΠΌΠΈ Π² Π΄ΠΎΠ²Ρ–Ρ€Π΅Π½ΠΈΡ… Π³Π΅ΠΎΠ³Ρ€Π°Ρ„Ρ–Ρ‡Π½ΠΈΡ… Ρ€Π΅Π³Ρ–ΠΎΠ½Π°Ρ….
    - Π ΠΎΠ·Π²ΠΈΡ‚ΠΎΠΊ бізнСсу: Активно виявляти Π½ΠΎΠ²Ρ– моТливості для досягнСння особистих Ρ‚Π° Ρ†Ρ–Π»Π΅ΠΉ Π²Ρ–Π΄Π΄Ρ–Π»Ρƒ ΠΏΡ€ΠΎΠ΄Π°ΠΆΡ–Π². Π—Π±Ρ–Π»ΡŒΡˆΠ΅Π½Π½Ρ обсягів ΠΏΡ€ΠΎΠ΄Π°ΠΆΡ–Π² Ρ‚Π° ΠΏΠΎΡˆΡƒΠΊ Π½ΠΎΠ²ΠΈΡ… ΠΊΠ»Ρ–Ρ”Π½Ρ‚Ρ–Π² Ρ‚Π° ΠΏΠ°Ρ€Ρ‚Π½Π΅Ρ€Ρ–Π² Ρƒ Π΄ΠΎΠ²Ρ–Ρ€Π΅Π½ΠΎΠΌΡƒ Π³Π΅ΠΎΠ³Ρ€Π°Ρ„Ρ–Ρ‡Π½ΠΎΠΌΡƒ Ρ€Π΅Π³Ρ–ΠΎΠ½Ρ–.
    - ΠŸΡ€Π΅Π·Π΅Π½Ρ‚Π°Ρ†Ρ–Ρ— Ρ‚Π° ΠŸΡ–Π΄Ρ‚Ρ€ΠΈΠΌΠΊΠ°: ΠŸΡ€ΠΎΠ²ΠΎΠ΄ΠΈΡ‚ΠΈ ΠΏΡ€Π΅Π·Π΅Π½Ρ‚Π°Ρ†Ρ–Ρ— ΠΏΡ€ΠΎ ΠΊΠΎΠΌΠΏΠ°Π½Ρ–ΡŽ Ρ‚Π° Ρ—Ρ— обладнання. Надавати ΠΏΡ–Π΄Ρ‚Ρ€ΠΈΠΌΠΊΡƒ ΠΊΠ»Ρ–Ρ”Π½Ρ‚Π°ΠΌ, Π½Π° всіх Π΅Ρ‚Π°ΠΏΠ°Ρ… співпраці - Π²Ρ–Π΄ ΠΏΠ΅Ρ€ΡˆΠΎΠ³ΠΎ звСрнСння Π΄ΠΎ післяпродаТної ΠΏΡ–Π΄Ρ‚Ρ€ΠΈΠΌΠΊΠΈ.
    - ΠŸΡ€ΠΎΡ”ΠΊΡ‚Π½Π° Ρ€ΠΎΠ±ΠΎΡ‚Π°: ΠŸΡ–Π΄Π³ΠΎΡ‚ΠΎΠ²ΠΊΠ° ΠΏΡ€ΠΎΡ”ΠΊΡ‚Π½ΠΈΡ… ΠΏΡ€ΠΎΠΏΠΎΠ·ΠΈΡ†Ρ–ΠΉ ΡΠΏΡ–Π»ΡŒΠ½ΠΎ Π· Ρ‚Π΅Ρ…Π½Ρ–Ρ‡Π½ΠΈΠΌΠΈ Ρ‚Π° Ρ–Π½ΠΆΠ΅Π½Π΅Ρ€Π½ΠΈΠΌΠΈ ΠΊΠΎΠΌΠ°Π½Π΄Π°ΠΌΠΈ ΠΊΠΎΠΌΠΏΠ°Π½Ρ–Ρ—.
    - УзгодТСння ΡƒΠΌΠΎΠ²: ΠžΠ±Π³ΠΎΠ²ΠΎΡ€Π΅Π½Π½Ρ ΡƒΠΌΠΎΠ² ΠΏΡ€ΠΎΡ”ΠΊΡ‚Ρƒ Ρ‚Π° ΠΊΠΎΠ½Ρ‚Ρ€Π°ΠΊΡ‚Ρ–Π² для максимізації ΠΏΡ€ΠΈΠ±ΡƒΡ‚ΠΊΡƒ.

    The client offers:
    - A very dynamic and friendly working atmosphere.
    - Benefits package: official employment, with vacation and sick leave in accordance with the Labor Code of Ukraine.
