Data Engineer Jobs

  • 594 views · 56 applications · 6d

    Data Engineer

    Countries of Europe or Ukraine · 2 years of experience · B1 - Intermediate

    Dataforest is looking for a Data Engineer to join the team. If you are looking for a friendly team, a healthy working environment, and a flexible schedule, you have found the right place to send your CV.

     

    Skills requirements:
    • 2+ years of experience with Python;
    • 2+ years of experience as a Data Engineer;
    • Experience with Pandas;
    • Experience with SQL and NoSQL databases (Redis, MongoDB, Elasticsearch) and BigQuery;
    • Familiarity with Amazon Web Services;
    • Knowledge of data algorithms and data structures is a MUST;
    • Experience working with high-volume tables (10M+ rows).


    Optional skills (as a plus):
    • Experience with Spark (PySpark);
    • Experience with Airflow;
    • Experience with Kafka;
    • Experience in statistics;
    • Knowledge of data science and machine learning algorithms.

     

    Key responsibilities:
    • Create ETL pipelines and data management solutions (APIs, integration logic);
    • Implement various data processing algorithms;
    • Contribute to the creation of forecasting, recommendation, and classification models.
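
    For illustration, a minimal sketch of the kind of ETL step implied above, assuming a hypothetical CSV export, invented table and column names, and a PostgreSQL target reachable through SQLAlchemy:

        import pandas as pd
        from sqlalchemy import create_engine

        def run_etl(csv_path: str, db_url: str) -> None:
            # Extract: load the raw export (column names are hypothetical)
            raw = pd.read_csv(csv_path, parse_dates=["created_at"])

            # Transform: de-duplicate and aggregate daily order totals
            clean = raw.drop_duplicates(subset="order_id")
            daily = (
                clean.groupby(clean["created_at"].dt.date)["amount"]
                .sum()
                .reset_index()
                .rename(columns={"created_at": "order_date", "amount": "total_amount"})
            )

            # Load: append the aggregate into a warehouse table
            engine = create_engine(db_url)
            daily.to_sql("daily_order_totals", engine, if_exists="append", index=False)

        # Example call (connection string is a placeholder):
        # run_etl("orders.csv", "postgresql+psycopg2://user:password@host:5432/warehouse")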

     

    We offer:

    • Great networking opportunities with international clients, challenging tasks;
    • Building interesting projects from scratch using new technologies;
    • Personal and professional development opportunities;
    • Competitive salary fixed in USD;
    • Paid vacation and sick leaves;
    • Flexible work schedule;
    • Friendly working environment with minimal hierarchy;
    • Team building activities, corporate events.

  • 113 views · 21 applications · 16d

    Data Engineer

    Full Remote · Worldwide · 5 years of experience · B2 - Upper Intermediate

    Lead the development and scaling of our scientific knowledge graph: ingesting, structuring, and enriching massive datasets from research literature and global data sources into meaningful, AI-ready insights.

     

    Requirements:

    - Strong experience with knowledge graph design and implementation (Neo4j, RDFLib, GraphQL, etc.).
    - Advanced Python for data engineering, ETL, and entity processing (Spark/Dask/Polars).
    - Proven track record with large dataset ingestion (tens of millions of records).
    - Familiarity with life-science or biomedical data (ontologies, research metadata, entity linking).
    - Experience with Airflow/Dagster/dbt and data APIs (OpenAlex, ORCID, PubMed).
    - Strong sense of ownership, precision, and delivery mindset.

    Nice to have:

    - Domain knowledge in life sciences, biomedical research, or related data models.
    - Experience integrating vector/semantic embeddings (Pinecone, FAISS, Weaviate).
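
    As a rough, non-authoritative sketch of the ingestion style described above (node labels, properties, and connection details are invented; assumes a running Neo4j instance and a recent version of the official neo4j Python driver):

        from neo4j import GraphDatabase

        # Connection details are placeholders
        driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

        def upsert_paper(tx, paper: dict) -> None:
            # MERGE keeps the load idempotent when the same record is ingested twice
            tx.run(
                """
                MERGE (p:Paper {doi: $doi})
                SET p.title = $title
                MERGE (a:Author {orcid: $orcid})
                MERGE (a)-[:AUTHORED]->(p)
                """,
                doi=paper["doi"],
                title=paper["title"],
                orcid=paper["orcid"],
            )

        with driver.session() as session:
            session.execute_write(
                upsert_paper,
                {"doi": "10.1000/example", "title": "Example paper", "orcid": "0000-0000-0000-0000"},
            )
        driver.close()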

     

    We offer:

    • Attractive financial package
    • Challenging projects
    • Professional & career growth
    • Great atmosphere in a friendly small team

  • 57 views · 15 applications · 17d

    Senior Data Engineer – (PySpark / Data Infrastructure)

    Full Remote · Worldwide · Product · 5 years of experience · C1 - Advanced

    Senior Data Engineer –  (PySpark / Data Infrastructure)

    We're hiring a Senior Data Engineer to help lead the next phase of our data platform’s growth.

    At Forecasa, we provide enriched real estate transaction data and analytics to private lenders and investors. Our platform processes large volumes of public data, standardizes and enriches it, and delivers actionable insights that drive lending decisions.

    We recently completed a migration from a legacy SQL-based ETL stack (PostgreSQL/dbt) to PySpark, and we're now looking for a senior engineer to take ownership of the new pipeline, maintain and optimize it, and develop new data-driven features to support our customers and internal analytics.

    What You’ll Do

    • Own and maintain our PySpark-based data pipeline, ensuring stability, performance, and scalability.
    • Design and build new data ingestion, transformation, and validation workflows.
    • Optimize and monitor data jobs using Airflow, Kubernetes, and S3.
    • Collaborate with data analysts, product owners, and leadership to define data needs and deliver clean, high-quality data.
    • Support and mentor junior engineers working on scrapers, validation tools, and quality monitoring dashboards.
    • Contribute to the evolution of our data infrastructure and architectural decisions.

    Our Tech Stack

    Python • PySpark • PostgreSQL • dbt • Airflow • S3 • Kubernetes • GitLab • Grafana
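
    A minimal sketch of the kind of PySpark job described above; the bucket paths and column names are invented and the aggregation is only illustrative:

        from pyspark.sql import SparkSession, functions as F

        spark = SparkSession.builder.appName("transactions_etl").getOrCreate()

        # Ingest raw transaction records (path and schema are hypothetical)
        raw = spark.read.parquet("s3a://example-bucket/raw/transactions/")

        # Standardize and enrich: de-duplicate, then aggregate lending volume per county and month
        enriched = (
            raw.dropDuplicates(["transaction_id"])
               .withColumn("closing_month", F.date_trunc("month", F.col("closing_date")))
               .groupBy("county", "closing_month")
               .agg(F.sum("loan_amount").alias("total_loan_amount"),
                    F.count("*").alias("transaction_count"))
        )

        # Write results partitioned for downstream analytics
        enriched.write.mode("overwrite").partitionBy("closing_month").parquet(
            "s3a://example-bucket/curated/county_monthly/"
        )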

    What We're Looking For

    • 5+ years of experience in data engineering or backend systems with large-scale data processing.
    • Strong experience with PySpark, including building scalable data pipelines and working with large datasets.
    • Solid command of SQL, data modeling, and performance tuning (especially in PostgreSQL).
    • Experience working with orchestration tools like Airflow, and containers via Docker/Kubernetes.
    • Familiarity with cloud storage (preferably S3) and modern CI/CD workflows.
    • Ability to work independently and communicate clearly in a remote, async-first environment.

    Bonus Points

    • Background in real estate or financial data
    • Experience with data quality frameworks or observability tools (e.g., Great Expectations, Grafana, Prometheus)
    • Experience optimizing PySpark jobs for performance and cost-efficiency
  • 13 views · 1 application · 12d

    Presales Engineer

    Full Remote · Ukraine · Product · 2 years of experience · A2 - Elementary

    Requirements:

    • Knowledge of the core functionality of virtualization platforms;
    • Experience implementing and migrating workloads in virtualized environments;
    • Experience in complex IT solutions and Hybrid Cloud solution projects;
    • Good understanding of IT infrastructure services is a plus;
    • Strong knowledge of troubleshooting complex environments in case of failure;
    • At least basic knowledge of networking and information security is an advantage;
    • Hyper-V, Proxmox, and VMware experience would be an advantage;
    • Experience in services outsourcing (as customer and/or provider) is an advantage;
    • 2+ years of work experience in a similar position;
    • Scripting and programming experience in PowerShell/Bash is an advantage;
    • Strong team communication skills, both verbal and written;
    • Experience in writing and preparing technical documentation;
    • English: intermediate level at minimum, mandatory for communication with global teams;
    • Industry certification focused on the relevant solution area.

    Areas of responsibility include:

    • Participating in deployment and IT infrastructure migration projects, Hybrid Cloud solution projects, and client support;
    • Consulting on the migration of IT workloads in complex infrastructures;
    • Presales support (articulating service value in the sales process; upsell and cross-sell capability);
    • Project documentation: technical concepts;
    • Education and development in the professional area, including necessary certifications.
  • 90 views · 4 applications · 30d

    Data Engineer

    Full Remote · Countries of Europe or Ukraine · 5 years of experience · B2 - Upper Intermediate · MilTech 🪖

    Who We Are
     

    OpenMinds is a cognitive defence tech company countering authoritarian influence in the battle for free and open societies. We work with over 30 governments and organisations worldwide, including Ukraine, the UK, and NATO member governments, leading StratCom agencies, and research institutions.

    Our expertise lies in accessing restricted and high-risk environments, including conflict zones and closed platforms.

    We combine ML technologies with deep local expertise. Our team, based in Kyiv, Lviv, London, Ottawa, and Washington, DC, includes behavioural scientists, ML/AI engineers, data journalists, communications experts, and regional specialists.

    Our core values are: speed, experimentation, elegance and focus. We are expanding the team and welcome passionate, proactive, and resourceful professionals who are eager to contribute to the global fight in cognitive warfare.
     

    Who we're looking for

    OpenMinds is seeking a skilled and curious Data Engineer who's excited to design and build data systems that power meaningful insight. You'll work closely with a passionate team of behavioral scientists and ML engineers on creating a robust data infrastructure that supports everything from large-scale narrative tracking to sentiment analysis.
     

    In the position you will:

    • Take ownership of our multi-terabyte data infrastructure, from data ingestion and orchestration to transformation, storage, and lifecycle management
    • Collaborate with data scientists, analysts, ML engineers, and domain experts to develop impactful data solutions
    • Optimize and troubleshoot data infrastructure to ensure high performance, cost-efficiency, scalability, and resilience
    • Stay up-to-date with trends in data engineering and apply modern tools and practices
    • Define and implement best practices for data processing, storage, and governance
    • Translate complex requirements into efficient data workflows that support threat detection and response
       

    We are a perfect match if you have:

    • 5+ years of hands-on experience as a Data Engineer, with a proven track record of leading complex data projects from design to production
    • Highly skilled in SQL and Python for advanced data processing, pipeline development, and optimization
    • Deep understanding of software engineering best practices, including SOLID, error handling, observability, performance tuning, and modular architecture
    • Ability to write, test and deploy production-ready code
    • Extensive experience in database design, data modeling, and modern data warehousing, including ETL orchestration using Airflow or equivalent
    • Familiarity with Google Cloud Platform (GCP) and its data ecosystem (BigQuery, GCS, Pub/Sub, Cloud Run, Cloud Functions, Looker)
    • Open-minded, capable of coming up with creative solutions and adapting to frequently changing circumstances and technological advances
    • Experience in DevOps (Docker/K8s, IaC, CI/CD) and MLOps
    • Fluent in English with excellent communication and cross-functional collaboration skills
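
    As a hedged illustration of the Airflow-style orchestration mentioned above (the DAG name and task are invented; syntax follows recent Airflow 2.x releases):

        from datetime import datetime

        from airflow import DAG
        from airflow.operators.python import PythonOperator

        def ingest_batch(**context):
            # Placeholder for an ingestion step, e.g. pulling a daily batch into GCS/BigQuery
            print("ingesting batch for", context["ds"])

        with DAG(
            dag_id="narrative_ingestion",      # hypothetical pipeline name
            start_date=datetime(2024, 1, 1),
            schedule="@daily",
            catchup=False,
        ) as dag:
            PythonOperator(task_id="ingest_batch", python_callable=ingest_batch)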
       

    We offer:

    • Work in a fast-growing company with proprietary AI technologies, solving the most difficult problems in the domains of social behaviour analytics and national security
    • Competitive market salary
    • Opportunity to present your work on tier 1 conferences, panels, and briefings behind closed doors
    • Work face-to-face with world-leading experts in their fields, who are our partners and friends
    • Flexible work arrangements, including adjustable hours, location, and remote/hybrid options
    • Unlimited vacation and leave policies
    • Opportunities for professional development within a multidisciplinary team, boasting experience from academia, tech, and intelligence sectors
    • A work culture that values resourcefulness, proactivity, and independence, with a firm stance against micromanagement
  • 24 views · 5 applications · 30d

    Senior ML/GenAI Engineer

    Full Remote · Ukraine · Product · 5 years of experience · B2 - Upper Intermediate

    Senior ML Engineer 

    Full-time / Remote 

     

    About Us

    ExpoPlatform is a UK-based company founded in 2013, delivering advanced technology for online, hybrid, and in-person events across 30+ countries. Our platform provides end-to-end solutions for event organizers, including registration, attendee management, event websites, and networking tools.

     

    Role Responsibilities:

    • Develop AI Agents, tools for AI Agents, API as a service
    • Prepare development and deployment documentation
    • Participate in R&D activities of the Data Science team

     

    Required Skills & Experience:

    • 5+ years of experience with DL frameworks (PyTorch and/or TensorFlow)
    • 5+ years of experience in software development in Python
    • Hands-on experience with LLM, RAG, and AI Agent development
    • Experience with Amazon SageMaker, Amazon Bedrock, LangChain, LangGraph, LangSmith, LlamaIndex, Hugging Face, OpenAI
    • Hands-on experience using AI tools in software development to increase efficiency and code quality, including AI-assisted code review
    • Knowledge of SQL, NoSQL, and vector databases
    • Understanding of embedding vectors and semantic search
    • Proficiency in Git (Bitbucket) and Docker
    • Upper-Intermediate (B2+) or higher level of English
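
    To illustrate the embedding and semantic-search point above, a minimal sketch using plain NumPy; the vectors stand in for the output of any embedding model, and all sizes are arbitrary:

        import numpy as np

        def cosine_scores(query: np.ndarray, docs: np.ndarray) -> np.ndarray:
            # Normalize the query and each document vector, then take dot products
            q = query / np.linalg.norm(query)
            d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
            return d @ q

        # Hypothetical precomputed document embeddings (n_docs x dim) and a query embedding
        doc_vectors = np.random.rand(1000, 384).astype("float32")
        query_vector = np.random.rand(384).astype("float32")

        scores = cosine_scores(query_vector, doc_vectors)
        top_k = np.argsort(scores)[::-1][:5]   # indices of the 5 most similar documents
        print(top_k, scores[top_k])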

     

    Would be a plus:

    • Hands-on experience with SLM and LLM fine-tuning
    • Education in Data Science, Computer Science, Applied Math or similar
    • AWS certifications (AWS Certified ML or equivalent)
    • Experience with TypeSense
    • Experience with speech recognition, speech-to-text ML models

     

    What We Offer:

    • Career growth with an international team.
    • Competitive salary and financial stability.
    • Flexible working hours (Mon-Fri, 8 hours).
    • Free English courses and a budget for education


     

  • 63 views · 12 applications · 12d

    Data Engineer

    Full Remote · Countries of Europe or Ukraine · 4 years of experience · B2 - Upper Intermediate

    We are seeking a talented and experienced Data Engineer to join our professional services team of 50+ engineers on a full-time basis. This remote-first position requires in-depth expertise in data engineering, with a preference for experience in cloud platforms such as AWS and Google Cloud. You will play a vital role in ensuring the performance, efficiency, and integrity of our customers' data pipelines while contributing to insightful data analysis and utilization.


    About us: 

    Opsfleet is a boutique services company that specializes in cloud infrastructure, data, AI, and human-behavior analytics to help organizations make smarter decisions and boost performance.

    Our experts provide end-to-end solutions, from data engineering and advanced analytics to DevOps, ensuring scalable, secure, and AI-ready platforms that turn insights into action.

     

    Role Overview

    As a Data Engineer at Opsfleet, you will lead the entire data lifecycle: gathering and translating business requirements, ingesting and integrating diverse data sources, and designing, building, and orchestrating robust ETL/ELT pipelines with built-in quality checks, governance, and observability. You'll partner with data scientists to prepare, deploy, and monitor ML/AI models in production, and work closely with analysts and stakeholders to transform raw data into actionable insights and scalable intelligence.

     

    What You'll Do

    * E2E Solution Delivery: Lead the full spectrum of data projects – requirements gathering, data ingestion, modeling, validation, and production deployment.

    * Data Modeling: Develop and maintain robust logical and physical data models, such as star and snowflake schemas, to support analytics, reporting, and scalable data architectures.

    * Data Analysis & BI: Transform complex datasets into clear, actionable insights; develop dashboards and reports that drive operational efficiency and revenue growth.

    * ML Engineering: Implement and manage model-serving pipelines using the cloud provider's MLOps toolchain, ensuring reliability and monitoring in production.

    * Collaboration & Research: Partner with cross-functional teams to prototype solutions, identify new opportunities, and drive continuous improvement.

     

    What We're Looking For

    Experience: 4+ years in a data‑focused role (Data Engineer, BI Developer, or similar)

    Technical Skills: Proficient in SQL and Python for data manipulation, cleaning, transformation, and ETL workflows. Strong understanding of statistical methods and data modeling concepts.

    Soft Skills: Excellent problem-solving ability, critical thinking, and attention to detail. Outstanding written and verbal communication.

    Education: BSc or higher in Mathematics, Statistics, Engineering, Computer Science, Life Science, or a related quantitative discipline.

     

    Nice to Have

    Cloud & Data Warehousing: Hands‑on experience with cloud platforms (GCP, AWS or others) and modern data warehouses such as BigQuery and Snowflake.


     

  • 24 views · 2 applications · 18d

    Infrastructure Engineer

    Full Remote · Countries of Europe or Ukraine · 5 years of experience · C1 - Advanced

    We are looking for a Senior Infrastructure Engineer to manage and improve our IT systems and cloud environments. You'll work closely with DevOps and security teams to ensure system availability and reliability.

     

    Details
    Experience: 5 years 
    Schedule: Full time, remote
    Start: ASAP
    English: Fluent
    Employment: B2B Contract

     

    Responsibilities:

    • Design, deploy, and manage infrastructure environments
    • Automate deployments using Terraform, Ansible, etc.
    • Monitor and improve system performance and availability
    • Implement disaster recovery plans
    • Support troubleshooting across environments

     

    Requirements:

    • Strong Linux administration background
    • Experience with AWS, GCP, or Azure
    • Proficiency with containerization tools (Docker, Kubernetes)
    • Infrastructure as Code (IaC) using Terraform or similar
    • Scripting skills in Python, Bash, etc.
  • 8 views · 0 applications · 24d

    IT Infrastructure Administrator

    Office Work · Ukraine (Dnipro) · Product · 1 year of experience

    Biosphere Corporation is one of the largest producers and distributors of household, hygiene, and professional products in Eastern Europe and Central Asia (TM Freken BOK, Smile, Selpak, Vortex, Novita, PRO service, and many others). We are inviting an IT Infrastructure Administrator to join our team.

    Key responsibilities:

    • Administration of Active Directory
    • Managing group policies
    • Managing services via PowerShell
    • Administration of the VMware platform
    • Administration of Azure Active Directory
    • Administration of Exchange 2016/2019 mail servers
    • Administration of Exchange Online
    • Administration of VMware Horizon View

    Required professional knowledge and skills:

    • Experience in writing automation scripts (PowerShell, Python, etc.)
    • Skills in working with Azure Active Directory (user and group creation, report generation, configuring synchronization between on-premise and cloud AD)
    • Skills in Exchange PowerShell (mailbox creation, search and removal of emails based on criteria, DAG creation and management)
    • Experience with Veeam Backup & Replication, VMware vSphere (vCenter, DRS, vMotion, HA), and VMware Horizon View
    • Windows Server 2019/2025 (installation, configuration, and adaptation)
    • Diagnostics and troubleshooting
    • Working with anti-spam systems
    • Managing mail transport systems (Exim) and monitoring systems (Zabbix)

    We offer:

    • Interesting projects and tasks
    • Competitive salary (discussed during the interview)
    • Convenient work schedule: Mon–Fri, 9:00–18:00; partial remote work possible
    • Official employment, paid vacation, and sick leave
    • Probation period: 2 months
    • Professional growth and training (internal training, reimbursement for external training programs)
    • Discounts on Biosphere Corporation products
    • Financial assistance (in cases of childbirth, medical treatment, force majeure, or circumstances caused by wartime events, etc.)

    Office address: Dnipro, Zaporizke Highway 37 (Right Bank, Topol-1 district).

    Learn more about Biosphere Corporation, our strategy, mission, and values at:
    http://biosphere-corp.com/
    https://www.facebook.com/biosphere.corporation/

    Join our team of professionals!

    By submitting your CV for this vacancy, you consent to the use of your personal data in accordance with the current legislation of Ukraine.
    If your application is successful, we will contact you within 1–2 business days.

  • 18 views · 0 applications · 5d

    Data Engineer (NLP-Focused)

    Full Remote · Ukraine · Product · 3 years of experience · B1 - Intermediate

    About us:
    Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.

    About the client:
    Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.

    About the role:
    We are looking for a Data Engineer (NLP-Focused) to build and optimize the data pipelines that fuel the Ukrainian LLM and NLP initiatives. In this role, you will design robust ETL/ELT processes to collect, process, and manage large-scale text and metadata, enabling the Data Scientists and ML Engineers to develop cutting-edge language models.

    You will work at the intersection of data engineering and machine learning, ensuring that the datasets and infrastructure are reliable, scalable, and tailored to the needs of training and evaluating NLP models in a Ukrainian language context.

    Requirements:
    - Education & Experience: 3+ years of experience as a Data Engineer or in a similar role, building data-intensive pipelines or platforms. A Bachelor's or Master's degree in Computer Science, Engineering, or a related field is preferred. Experience supporting machine learning or analytics teams with data pipelines is a strong advantage.
    - NLP Domain Experience: Prior experience handling linguistic data or supporting NLP projects (e.g., text normalization, handling different encodings, tokenization strategies). Knowledge of Ukrainian text sources and data sets, or experience with multilingual data processing, can be an advantage given the project's focus. Understanding of FineWeb2 or a similar processing pipeline approach.
    - Data Pipeline Expertise: Hands-on experience designing ETL/ELT processes, including extracting data from various sources, using transformation tools, and loading into storage systems. Proficiency with orchestration frameworks like Apache Airflow for scheduling workflows. Familiarity with building pipelines for unstructured data (text, logs) as well as structured data.
    - Programming & Scripting: Strong programming skills in Python for data manipulation and pipeline development. Experience with NLP packages (spaCy, NLTK, langdetect, fasttext, etc.). Experience with SQL for querying and transforming data in relational databases. Knowledge of Bash or other scripting for automation tasks. Writing clean, maintainable code and using version control (Git) for collaborative development.
    - Databases & Storage: Experience working with relational databases (e.g., PostgreSQL, MySQL), including schema design and query optimization. Familiarity with NoSQL or document stores (e.g., MongoDB) and big data technologies (HDFS, Hive, Spark) for large-scale data is a plus. Understanding of or experience with vector databases (e.g., Pinecone, FAISS) is beneficial, as the NLP applications may require embedding storage and fast similarity search.
    - Cloud Infrastructure: Practical experience with cloud platforms (AWS, GCP, or Azure) for data storage and processing. Ability to set up services such as S3/Cloud Storage, data warehouses (e.g., BigQuery, Redshift), and use cloud-based ETL tools or serverless functions. Understanding of infrastructure-as-code (Terraform, CloudFormation) to manage resources is a plus.
    - Data Quality & Monitoring: Knowledge of data quality assurance practices. Experience implementing monitoring for data pipelines (logs, alerts) and using CI/CD tools to automate pipeline deployment and testing. An analytical mindset to troubleshoot data discrepancies and optimize performance bottlenecks.
    - Collaboration & Domain Knowledge: Ability to work closely with data scientists and understand the requirements of machine learning projects. Basic understanding of NLP concepts and the data needs for training language models, so you can anticipate and accommodate the specific forms of text data and preprocessing they require. Good communication skills to document data workflows and to coordinate with team members across different functions.

    Responsibilities:
    - Design, develop, and maintain ETL/ELT pipelines for gathering, transforming, and storing large volumes of text data and related information.
    - Ensure pipelines are efficient and can handle data from diverse sources (e.g., web crawls, public datasets, internal databases) while maintaining data integrity.
    - Implement web scraping and data collection services to automate the ingestion of text and linguistic data from the web and other external sources. This includes writing crawlers or using APIs to continuously collect data relevant to the language modeling efforts.
    - Implementation of NLP/LLM-specific data processing: cleaning and normalization of text, such as filtering toxic content, de-duplication, de-noising, and detection and removal of personal data.
    - Formation of specific SFT/RLHF datasets from existing data, including data augmentation/labeling with an LLM as teacher.
    - Set up and manage cloud-based data infrastructure for the project. Configure and maintain data storage solutions (data lakes, warehouses) and processing frameworks (e.g., distributed compute on AWS/GCP/Azure) that can scale with growing data needs.
    - Automate data processing workflows and ensure their scalability and reliability.
    - Use workflow orchestration tools like Apache Airflow to schedule and monitor data pipelines, enabling continuous and repeatable model training and evaluation cycles.
    - Maintain and optimize analytical databases and data access layers for both ad-hoc analysis and model training needs.
    - Work with relational databases (e.g., PostgreSQL) and other storage systems to ensure fast query performance and well-structured data schemas.
    - Collaborate with Data Scientists and NLP Engineers to build data features and datasets for machine learning models.
    - Provide data subsets, aggregations, or preprocessing as needed for tasks such as language model training, embedding generation, and evaluation.
    - Implement data quality checks, monitoring, and alerting. Develop scripts or use tools to validate data completeness and correctness (e.g., ensuring no critical data gaps or anomalies in the text corpora), and promptly address any pipeline failures or data issues. Implement data version control.
    - Manage data security, access, and compliance.
    - Control permissions to datasets and ensure adherence to data privacy policies and security standards, especially when dealing with user data or proprietary text sources.
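
    A rough sketch of the cleaning and exact de-duplication step described above; the rules are illustrative only, and near-duplicate detection (e.g., MinHash) would sit on top of this:

        import hashlib
        import re
        import unicodedata

        def normalize(text: str) -> str:
            # Unicode normalization plus whitespace collapsing
            text = unicodedata.normalize("NFC", text)
            return re.sub(r"\s+", " ", text).strip()

        def deduplicate(docs: list[str]) -> list[str]:
            # Exact de-duplication keyed on a hash of the normalized, lower-cased text
            seen, unique = set(), []
            for doc in docs:
                key = hashlib.sha256(normalize(doc).lower().encode("utf-8")).hexdigest()
                if key not in seen:
                    seen.add(key)
                    unique.append(doc)
            return unique

        corpus = ["Привіт,  світе!", "Привіт, світе!", "Інший документ."]
        print(deduplicate(corpus))   # the first two normalize to the same text and collapse to one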

    The company offers:
    - Competitive salary.
    - Equity options in a fast-growing AI company.
    - Remote-friendly work culture.
    - Opportunity to shape a product at the intersection of AI and human productivity.
    - Work with a passionate, senior team building cutting-edge tech for real-world business use.

  • 44 views · 0 applications · 17d

    Sales Executive (Google Cloud+Google Workspace)

    Full Remote · Czechia · Product · 2 years of experience · B2 - Upper Intermediate

    Cloudfresh ⛅️ is a Global Google Cloud Premier Partner, Zendesk Premier Partner, Asana Solutions Partner, GitLab Select Partner, Hubspot Platinum Partner, Okta Activate Partner, and Microsoft Partner.

    Since 2017, we've been specializing in the implementation, migration, integration, audit, administration, support, and training for top-tier cloud solutions. Our products focus on cutting-edge cloud computing, advanced location and mapping, seamless collaboration from anywhere, unparalleled customer service, and innovative DevSecOps.

    We are seeking a dynamic Sales Executive to lead our sales efforts for GCP and GWS solutions across the EMEA and CEE regions. The ideal candidate will be a high-performing A-player with experience in SaaS sales, adept at navigating complex sales environments, and driven to exceed targets through strategic sales initiatives.

    Requirements:

    • Fluency in English and native Czech is essential;
    • 2+ years of proven sales experience in SaaS/IaaS fields, with a documented history of achieving and exceeding sales targets, particularly in enterprise sales;
    • Sales experience on GCP and/or GWS specifically;
    • Sales or technical certifications related to Cloud Solutions are advantageous;
    • Experience in expanding new markets with outbound activities;
    • Excellent communication, negotiation, and strategic planning abilities;
    • Proficient in managing CRM systems and understanding their strategic importance in sales and customer relationship management.

    Responsibilities:

    • Develop and execute sales strategies for GCP and GWS solutions, targeting enterprise clients within the Cloud markets across EMEA and CEE;
    • Identify and penetrate new enterprise market segments, leveraging GCP and GWS to improve client outcomes;
    • Conduct high-level negotiations and presentations with major companies across Europe, focusing on the strategic benefits of adopting GCP and GWS solutions;
    • Work closely with marketing and business development teams to align sales strategies with broader company goals;
    • Continuously assess the competitive landscape and customer needs, adapting sales strategies to meet market demands and drive revenue growth.

    Work conditions:

    • Competitive Salary & Transparent Motivation: Receive a competitive base salary with commission on sales and performance-based bonuses, providing clear financial rewards for your success.
    • Flexible Work Format: Work remotely with flexible hours, allowing you to balance your professional and personal life efficiently.
    • Freedom to Innovate: Utilize multiple channels and approaches for sales, allowing you the freedom to find the best strategies for success.
    • Training with Leading Cloud Products: Access in-depth training on cutting-edge cloud solutions, enhancing your expertise and equipping you with the tools to succeed in an ever-evolving industry.
    • International Collaboration: Work alongside A-players and seasoned professionals in the cloud industry. Expand your expertise by engaging with international markets across the EMEA and CEE regions.
    • Vibrant Team Environment: Be part of an innovative, dynamic team that fosters both personal and professional growth, creating opportunities for you to advance in your career.
    • When applying to this position, you consent to the processing of your personal data by CLOUDFRESH for the purposes necessary to conduct the recruitment process, in accordance with Regulation (EU) 2016/679 of the European Parliament and of the Council of April 27, 2016 (GDPR).
    • Additionally, you agree that CLOUDFRESH may process your personal data for future recruitment processes.
  • 37 views · 3 applications · 6d

    CloudOps Engineer

    Full Remote · EU · Product · 4 years of experience · B1 - Intermediate

    We are looking for a CloudOps Engineer to join our teams!
     

    Requirements:

    - 4+ years of experience with DevOps practices
    - 3+ years of experience with public cloud platforms (AWS, GCP, GCore, etc.)
    - Strong knowledge of Linux architecture and systems implementation
    - Strong knowledge of the IaC approach (Ansible, Terraform)
    - Strong scripting skills in Bash, Python, or other automation languages
    - Strong knowledge of cloud-based approaches
    - Knowledge of Kubernetes management
    - Good understanding of networking concepts and protocols
    - Experience in microservices architecture, distributed systems, and scaling production environments.
    - Experience/awareness of automated DevOps activities, concepts, and toolsets.
    - Experience with AWS Control Tower, Config, IAM and other technologies that enable high-level administration
    - Experience building and maintaining CI/CD pipelines using tools like GitLab/GitHub CI
    - Experience with AWS CloudWatch, GCP Cloud Monitoring, Prometheus, Grafana for monitoring and log aggregation
    - Problem-solving and troubleshooting skills, ability to analyze complex systems and identify the causes of problems
    - Preferably, experience with GCP cloud resource management, IAM, organization policies, and other technologies that enable high-level administration

     

    Would be a plus:
    - AWS Certified SysOps Administrator
    - AWS Certified DevOps Engineer
    - GCP Certified Cloud Engineer
    - GCP Certified Cloud DevOps Engineer
    - Similar Public Cloud certificates

     

    Soft Skills:
    - Team player
    - Critical Thinking
    - Good communicator
    - Open to challenges and new opportunities
    - Thirst for knowledge
    - Time Management

     

    Responsibilities:
    - Support and evolution of the current public cloud infrastructure
    - Automating repetitive tasks and processes in public cloud infrastructure
    - Automation and improvement of current processes related to the administration and support of public clouds
    - Implementation of new providers of public cloud services
    - Collaborate with cross-functional teams to define cloud strategies, governance, and best practices.
    - Conduct architectural assessments and provide recommendations for optimizing existing public cloud environments
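
    As a small, hypothetical example of the routine automation mentioned above (the region and tag key are invented; uses the standard boto3 EC2 API):

        import boto3

        def find_untagged_instances(region: str = "eu-central-1") -> list[str]:
            # List EC2 instances that are missing an "owner" tag, a typical governance check
            ec2 = boto3.client("ec2", region_name=region)
            untagged = []
            for page in ec2.get_paginator("describe_instances").paginate():
                for reservation in page["Reservations"]:
                    for instance in reservation["Instances"]:
                        tags = {t["Key"] for t in instance.get("Tags", [])}
                        if "owner" not in tags:
                            untagged.append(instance["InstanceId"])
            return untagged

        if __name__ == "__main__":
            print(find_untagged_instances())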

     

    Our benefits to you:
    ☘️ An exciting and challenging job in a fast-growing holding, the opportunity to be part of a multicultural team of top professionals in Development, Architecture, Management, Operations, Marketing, Legal, Finance and more
    🤝 Great working atmosphere with passionate experts and leaders, sharing a friendly culture and a success-driven mindset is guaranteed
    🧑‍💻 Modern corporate equipment based on macOS or Windows and additional equipment are provided
    🏖 Paid vacations, sick leave, personal events days, days off
    💵 Referral program – enjoy cooperation with your colleagues and get the bonus
    📚 Educational programs: regular internal training sessions, compensation for external education, attendance of specialized global conferences
    🎯 Rewards program for mentoring and coaching colleagues
    🗣 Free internal English courses
    ✈️ In-house Travel Service
    🦄 Multiple internal activities: an online platform for employees with quests, gamification, presents and news, PIN-UP clubs for movie / book / pets lovers and more
    🎳 Other benefits could be added based on your location

  • 42 views · 1 application · 19d

    Senior Data Engineer

    Full Remote · Ukraine · 4 years of experience · B1 - Intermediate

    TJHelpers is committed to building a new generation of data specialists by combining mentorship, practical experience, and structured development through our "Helpers as a Service" model.

    We're looking for a Senior Data Engineer to join our growing data team and help design, build, and optimize scalable data pipelines and infrastructure. You will work with cross-functional teams to ensure high-quality, reliable, and efficient data solutions that empower analytics, AI models, and business decision-making.
     

    Responsibilities

    • Design, implement, and maintain robust ETL/ELT pipelines for structured and unstructured data.
    • Build scalable data architectures using modern tools and cloud platforms (e.g., AWS, GCP, Azure).
    • Collaborate with data analysts, scientists, and engineers to deliver reliable data solutions.
    • Ensure data quality, lineage, and observability across all pipelines.
    • Optimize performance, scalability, and cost efficiency of data systems.
    • Mentor junior engineers and contribute to establishing best practices.
       

    Requirements

    • Strong proficiency in one or more programming languages for data engineering: Python, Java, Scala, or SQL.
    • Solid understanding of data modeling, warehousing, and distributed systems.
    • Experience with modern data frameworks (e.g., Apache Spark, Flink, Kafka, Airflow, dbt).
    • Familiarity with relational and NoSQL databases.
    • Good understanding of CI/CD, DevOps practices, and agile workflows.
    • Strong problem-solving skills and ability to work in cross-functional teams.
       

    Nice to Have

    • Experience with cloud data services (e.g., BigQuery, Snowflake, Redshift, Databricks).
    • Knowledge of containerization and orchestration (Docker, Kubernetes).
    • Exposure to data governance, security, and compliance frameworks.
    • Familiarity with ML/AI pipelines and MLOps practices.
       

    We Offer

    • Mentorship and collaboration with senior data architects and engineers.
    • Hands-on experience in designing and scaling data platforms.
    • Personal learning plan, internal workshops, and peer reviews.
    • Projects with real clients across fintech, healthcare, and AI-driven industries.
    • Clear growth path toward Lead Data Engineer and Data Architect roles.
  • 28 views · 0 applications · 12d

    Big Data Engineer

    Full Remote · Ukraine · Product · 3 years of experience · B2 - Upper Intermediate

    We are looking for a Data Engineer to build and optimize the data pipelines that fuel our Ukrainian LLM and Kyivstar's NLP initiatives. In this role, you will design robust ETL/ELT processes to collect, process, and manage large-scale text and metadata, enabling our data scientists and ML engineers to develop cutting-edge language models. You will work at the intersection of data engineering and machine learning, ensuring that our datasets and infrastructure are reliable, scalable, and tailored to the needs of training and evaluating NLP models in a Ukrainian language context. This is a unique opportunity to shape the data foundation of a pioneering AI project in Ukraine, working alongside NLP experts and leveraging modern big data technologies.

     

    What you will do

    • Design, develop, and maintain ETL/ELT pipelines for gathering, transforming, and storing large volumes of text data and related information. Ensure pipelines are efficient and can handle data from diverse sources (e.g., web crawls, public datasets, internal databases) while maintaining data integrity.
    • Implement web scraping and data collection services to automate the ingestion of text and linguistic data from the web and other external sources. This includes writing crawlers or using APIs to continuously collect data relevant to our language modeling efforts.
    • Implementation of NLP/LLM-specific data processing: cleaning and normalization of text, like filtering of toxic content, de-duplication, de-noising, detection, and deletion of personal data.
    • Formation of specific SFT/RLHF datasets from existing data, including data augmentation/labeling with LLM as teacher.
    • Set up and manage cloud-based data infrastructure for the project. Configure and maintain data storage solutions (data lakes, warehouses) and processing frameworks (e.g., distributed compute on AWS/GCP/Azure) that can scale with growing data needs.
    • Automate data processing workflows and ensure their scalability and reliability. Use workflow orchestration tools like Apache Airflow to schedule and monitor data pipelines, enabling continuous and repeatable model training and evaluation cycles.
    • Maintain and optimize analytical databases and data access layers for both ad-hoc analysis and model training needs. Work with relational databases (e.g., PostgreSQL) and other storage systems to ensure fast query performance and well-structured data schemas.
    • Collaborate with Data Scientists and NLP Engineers to build data features and datasets for machine learning models. Provide data subsets, aggregations, or preprocessing as needed for tasks such as language model training, embedding generation, and evaluation.
    • Implement data quality checks, monitoring, and alerting. Develop scripts or use tools to validate data completeness and correctness (e.g., ensuring no critical data gaps or anomalies in the text corpora), and promptly address any pipeline failures or data issues. Implement data version control.
    • Manage data security, access, and compliance. Control permissions to datasets and ensure adherence to data privacy policies and security standards, especially when dealing with user data or proprietary text sources.

     

    Qualifications and experience needed

    • Education & Experience: 3+ years of experience as a Data Engineer or in a similar role, building data-intensive pipelines or platforms. A Bachelor's or Master's degree in Computer Science, Engineering, or a related field is preferred. Experience supporting machine learning or analytics teams with data pipelines is a strong advantage.
    • NLP Domain Experience: Prior experience handling linguistic data or supporting NLP projects (e.g., text normalization, handling different encodings, tokenization strategies). Knowledge of Ukrainian text sources and data sets, or experience with multilingual data processing, can be an advantage given our project's focus. Understanding of FineWeb2 or a similar processing pipeline approach.
    • Data Pipeline Expertise: Hands-on experience designing ETL/ELT processes, including extracting data from various sources, using transformation tools, and loading into storage systems. Proficiency with orchestration frameworks like Apache Airflow for scheduling workflows. Familiarity with building pipelines for unstructured data (text, logs) as well as structured data.
    • Programming & Scripting: Strong programming skills in Python for data manipulation and pipeline development. Experience with NLP packages (spaCy, NLTK, langdetect, fasttext, etc.). Experience with SQL for querying and transforming data in relational databases. Knowledge of Bash or other scripting for automation tasks. Writing clean, maintainable code and using version control (Git) for collaborative development.
    • Databases & Storage: Experience working with relational databases (e.g., PostgreSQL, MySQL), including schema design and query optimization. Familiarity with NoSQL or document stores (e.g., MongoDB) and big data technologies (HDFS, Hive, Spark) for large-scale data is a plus. Understanding of or experience with vector databases (e.g., Pinecone, FAISS) is beneficial, as our NLP applications may require embedding storage and fast similarity search.
    • Cloud Infrastructure: Practical experience with cloud platforms (AWS, GCP, or Azure) for data storage and processing. Ability to set up services such as S3/Cloud Storage, data warehouses (e.g., BigQuery, Redshift), and use cloud-based ETL tools or serverless functions. Understanding of infrastructure-as-code (Terraform, CloudFormation) to manage resources is a plus.
    • Data Quality & Monitoring: Knowledge of data quality assurance practices. Experience implementing monitoring for data pipelines (logs, alerts) and using CI/CD tools to automate pipeline deployment and testing. An analytical mindset to troubleshoot data discrepancies and optimize performance bottlenecks.
    • Collaboration & Domain Knowledge: Ability to work closely with data scientists and understand the requirements of machine learning projects. Basic understanding of NLP concepts and the data needs for training language models, so you can anticipate and accommodate the specific forms of text data and preprocessing they require. Good communication skills to document data workflows and to coordinate with team members across different functions.

     

    A plus would be

    • Advanced Tools & Frameworks: Experience with distributed data processing frameworks (such as Apache Spark or Databricks) for large-scale data transformation, and with message streaming systems (Kafka, Pub/Sub) for real-time data pipelines. Familiarity with data serialization formats (JSON, Parquet) and handling of large text corpora.
    • Web Scraping Expertise: Deep experience in web scraping, using tools like Scrapy, Selenium, or Beautiful Soup, and handling anti-scraping challenges (rotating proxies, rate limiting). Ability to parse and clean raw text data from HTML, PDFs, or scanned documents.
    • CI/CD & DevOps: Knowledge of setting up CI/CD pipelines for data engineering (using GitHub Actions, Jenkins, or GitLab CI) to test and deploy changes to data workflows. Experience with containerization (Docker) to package data jobs and with Kubernetes for scaling them is a plus.
    • Big Data & Analytics: Experience with analytics platforms and BI tools (e.g., Tableau, Looker) used to examine the data prepared by the pipelines. Understanding of how to create and manage data warehouses or data marts for analytical consumption.
    • Problem-Solving: Demonstrated ability to work independently in solving complex data engineering problems, optimising existing pipelines, and implementing new ones under time constraints. A proactive attitude to explore new data tools or techniques that could improve our workflows.
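
    A minimal, hypothetical sketch of the web-scraping style of collection mentioned above; the URL is a placeholder, and a real crawler would also need politeness controls (rate limiting, robots.txt handling):

        import requests
        from bs4 import BeautifulSoup

        def fetch_article_text(url: str) -> str:
            # Download a page and keep only paragraph text for the corpus
            response = requests.get(url, timeout=10, headers={"User-Agent": "corpus-crawler/0.1"})
            response.raise_for_status()
            soup = BeautifulSoup(response.text, "html.parser")
            paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]
            return "\n".join(paragraphs)

        # print(fetch_article_text("https://example.com/news/article"))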

     

    What we offer

    • Office or remote – it's up to you. You can work from anywhere, and we will arrange your workplace.
    • Remote onboarding.
    • Performance bonuses.
    • We train employees, with the opportunity to learn through the company's library, internal resources, and programs from partners.
    • Health and life insurance.
    • Wellbeing program and corporate psychologist.
    • Reimbursement of expenses for Kyivstar mobile communication.
  • 35 views · 3 applications · 24d

    Data Solutions Architect

    Full Remote · Countries of Europe or Ukraine · 5 years of experience · C1 - Advanced

    We are looking for you!

    We are seeking a Data Solutions Architect with deep expertise in data platform design, AdTech systems integration, and data pipeline development for the advertising and media industry. This role requires strong technical knowledge in both real-time and batch data processing, with hands-on experience in building scalable, high-performance data architectures across demand-side and sell-side platforms. 

    As a client-facing technical expert, you will play a key role in project delivery, presales process, technical workshops, and project kick-offs, ensuring that our clients receive best-in-class solutions tailored to their business needs. 

    Contract type: Gig contract.

    Skills and experience you can bring to this role

    Qualifications & experience:

    • 5+ years of experience in designing and implementing data architectures and pipelines for the media and advertising industries that align with business goals and ensure scalability, security, and performance;
    • Hands-on expertise with cloud-native and enterprise data platforms, including Snowflake, Databricks, and cloud-native warehousing solutions like AWS Redshift, Azure Synapse, or Google BigQuery;
    • Proficiency in Python, Scala, or Java for building data pipelines and ETL workflows;
    • Hands-on experience with data engineering tools and frameworks such as Apache Kafka, Apache Spark, Airflow, dbt, or Flink. Batch and stream processing architecture;
    • Experience working with, and a good understanding of, relational and non-relational databases: SQL and NoSQL (document-oriented, key-value, columnar stores, etc.);
    • Experience in data modelling: Ability to create conceptual, logical, and physical data models;
    • Experience designing solutions for one or more cloud providers (AWS, GCP, Azure) and their data engineering services;
    • Experience in client-facing technical roles, including presales, workshops, and solutioning discussions;
    • Strong ability to communicate complex technical concepts to both technical and non-technical stakeholders. 

    Nice to have:

    • Experience working with AI and machine learning teams, integration of ML models into enterprise data pipelines: model fine-tuning, RAG, MLOps, LLMOps;
    • Knowledge of privacy-first architectures and data compliance standards in advertising (e.g., GDPR, CCPA);
    • Knowledge of data integration tools such as Apache Airflow, Talend, Informatica, and MuleSoft for connecting disparate systems;
    • Exposure to real-time bidding (RTB) systems and audience segmentation strategies.

    What impact you'll make

    • Architect and implement end-to-end data solutions for advertising and media clients, integrating with DSPs, SSPs, DMPs, CDPs, and other AdTech systems;
    • Design and optimize data platforms, ensuring efficient data ingestion, transformation, and storage for both batch and real-time processing;
    • Build scalable, secure, and high-performance data pipelines that handle large-scale structured and unstructured data from multiple sources;
    • Work closely with client stakeholders to define technical requirements, guide solution designs, and align data strategies with business goals;
    • Lead technical discovery sessions, workshops, and presales engagements, acting as a trusted technical advisor to clients;
    • Ensure data governance, security, and compliance best practices are implemented within the data architecture;
    • Collaborate with data science and machine learning teams, designing data pipelines that support model training, feature engineering, and analytics workflows.
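
    As a hedged illustration of the batch-and-stream architectures referenced above (the topic, brokers, and event schema are invented, and the job assumes the Spark Kafka connector package is available):

        from pyspark.sql import SparkSession, functions as F
        from pyspark.sql.types import StructType, StringType, LongType

        spark = SparkSession.builder.appName("ad_events_stream").getOrCreate()

        event_schema = (
            StructType()
            .add("campaign_id", StringType())
            .add("impressions", LongType())
        )

        events = (
            spark.readStream.format("kafka")
            .option("kafka.bootstrap.servers", "broker:9092")   # placeholder brokers
            .option("subscribe", "ad-events")                   # placeholder topic
            .load()
            .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
            .select("e.*")
        )

        # Aggregate impressions per campaign; the console sink is for demonstration only
        query = (
            events.groupBy("campaign_id")
            .agg(F.sum("impressions").alias("total_impressions"))
            .writeStream.outputMode("complete")
            .format("console")
            .start()
        )
        query.awaitTermination()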