Data Engineer Jobs
· 594 views · 56 applications · 6d
Data Engineer
Countries of Europe or Ukraine · 2 years of experience · B1 - Intermediate
Looking for a Data Engineer to join the Dataforest team. If you are looking for a friendly team, a healthy working environment, and a flexible schedule, you have found the right place to send your CV.
Skills requirements:
• 2+ years of experience with Python;
• 2+ years of experience as a Data Engineer;
• Experience with Pandas;
• Experience with SQL and NoSQL databases (Redis, MongoDB, Elasticsearch) / BigQuery;
• Familiarity with Amazon Web Services;
• Knowledge of data algorithms and data structures is a MUST;
• Experience working with high-volume tables (10M+ rows).
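For illustration only, a minimal sketch of the kind of task these requirements describe: aggregating a 10M+ row table with Pandas without loading it into memory at once. The connection string, table, and column names are hypothetical.
```python
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical connection string and table; the point is the chunked read.
engine = create_engine("postgresql://user:password@host:5432/analytics")

partial_sums = []
# chunksize streams the result set, so a 10M+ row table never has to fit in memory at once
for chunk in pd.read_sql_query("SELECT user_id, amount FROM events", engine, chunksize=500_000):
    partial_sums.append(chunk.groupby("user_id")["amount"].sum())

# combine the per-chunk aggregates into the final per-user totals
totals = pd.concat(partial_sums).groupby(level=0).sum()
print(totals.head())
```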
Optional skills (as a plus):
• Experience with Spark (PySpark);
• Experience with Airflow;
• Experience with Kafka;
• Experience in statistics;
• Knowledge of DS and machine learning algorithms.
Key responsibilities:
• Create ETL pipelines and data management solutions (APIs, integration logic);
• Implement various data processing algorithms;
• Take part in building forecasting, recommendation, and classification models.
We offer:
• Great networking opportunities with international clients, challenging tasks;
• Building interesting projects from scratch using new technologies;
• Personal and professional development opportunities;
• Competitive salary fixed in USD;
• Paid vacation and sick leaves;
• Flexible work schedule;
• Friendly working environment with minimal hierarchy;
• Team building activities, corporate events.
-
· 113 views · 21 applications · 16d
Data Engineer
Full Remote · Worldwide · 5 years of experience · B2 - Upper Intermediate
Lead the development and scaling of our scientific knowledge graph: ingesting, structuring, and enriching massive datasets from research literature and global data sources into meaningful, AI-ready insights.
Requirements:
- Strong experience with knowledge graph design and implementation (Neo4j, RDFLib, GraphQL, etc.).
- Advanced Python for data engineering, ETL, and entity processing (Spark/Dask/Polars).
- Proven track record with large dataset ingestion (tens of millions of records).
- Familiarity with life-science or biomedical data (ontologies, research metadata, entity linking).
- Experience with Airflow/Dagster/dbt, and data APIs (OpenAlex, ORCID, PubMed).
- Strong sense of ownership, precision, and a delivery mindset.
Nice to Have:
- Domain knowledge in life sciences, biomedical research, or related data models.
- Experience integrating vector/semantic embeddings (Pinecone, FAISS, Weaviate).
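As a rough, non-authoritative sketch of the graph-ingestion work described above, here is a minimal idempotent load into Neo4j using the official Python driver (assuming the 5.x driver); the URI, credentials, node labels, and sample record are all hypothetical.
```python
from neo4j import GraphDatabase

# Hypothetical connection details and sample record
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

papers = [
    {"doi": "10.1000/example-1", "title": "An example paper", "author_orcid": "0000-0001-2345-6789"},
]

def load_papers(tx, rows):
    # MERGE keeps the load idempotent: re-running it will not create duplicate nodes or edges
    query = (
        "UNWIND $rows AS row "
        "MERGE (p:Paper {doi: row.doi}) SET p.title = row.title "
        "MERGE (a:Author {orcid: row.author_orcid}) "
        "MERGE (a)-[:AUTHORED]->(p)"
    )
    tx.run(query, rows=rows)

with driver.session() as session:
    session.execute_write(load_papers, papers)
driver.close()
```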
We offer:
• Attractive financial package
• Challenging projects
• Professional & career growth
• Great atmosphere in a friendly small team
-
· 57 views · 15 applications · 17d
Senior Data Engineer (PySpark / Data Infrastructure)
Full Remote · Worldwide · Product · 5 years of experience · C1 - Advanced
We're hiring a Senior Data Engineer to help lead the next phase of our data platform's growth.
At Forecasa, we provide enriched real estate transaction data and analytics to private lenders and investors. Our platform processes large volumes of public data, standardizes and enriches it, and delivers actionable insights that drive lending decisions.
We recently completed a migration from a legacy SQL-based ETL stack (PostgreSQL/dbt) to PySpark, and we're now looking for a senior engineer to take ownership of the new pipeline, maintain and optimize it, and develop new data-driven features to support our customers and internal analytics.
What You'll Do
- Own and maintain our PySpark-based data pipeline, ensuring stability, performance, and scalability.
- Design and build new data ingestion, transformation, and validation workflows.
- Optimize and monitor data jobs using Airflow, Kubernetes, and S3.
- Collaborate with data analysts, product owners, and leadership to define data needs and deliver clean, high-quality data.
- Support and mentor junior engineers working on scrapers, validation tools, and quality monitoring dashboards.
- Contribute to the evolution of our data infrastructure and architectural decisions.
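Purely as an illustration of the kind of PySpark pipeline step this role involves (not Forecasa's actual code), a minimal read-clean-write job; bucket paths and column names are hypothetical.
```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("transactions-etl").getOrCreate()

raw = spark.read.parquet("s3a://example-bucket/raw/transactions/")

clean = (
    raw.withColumn("county", F.upper(F.trim(F.col("county"))))   # standardize a text field
       .withColumn("amount", F.col("amount").cast("double"))     # enforce a numeric type
       .dropDuplicates(["transaction_id"])                       # de-duplicate on the business key
       .filter(F.col("recorded_date").isNotNull())
       .withColumn("recorded_year", F.year("recorded_date"))     # derive the partition column
)

(clean.write
      .mode("overwrite")
      .partitionBy("recorded_year")
      .parquet("s3a://example-bucket/curated/transactions/"))
```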
Our Tech Stack
Python • PySpark • PostgreSQL • dbt • Airflow • S3 • Kubernetes • GitLab • Grafana
What We're Looking For
- 5+ years of experience in data engineering or backend systems with large-scale data processing.
- Strong experience with PySpark, including building scalable data pipelines and working with large datasets.
- Solid command of SQL, data modeling, and performance tuning (especially in PostgreSQL).
- Experience working with orchestration tools like Airflow, and containers via Docker/Kubernetes.
- Familiarity with cloud storage (preferably S3) and modern CI/CD workflows.
- Ability to work independently and communicate clearly in a remote, async-first environment.
Bonus Points
- Background in real estate or financial data
- Experience with data quality frameworks or observability tools (e.g., Great Expectations, Grafana, Prometheus)
- Experience optimizing PySpark jobs for performance and cost-efficiency
-
· 13 views · 1 application · 12d
Presales Engineer
Full Remote · Ukraine · Product · 2 years of experience · A2 - Elementary
Requirements:
- Knowledge of the core functionality of virtualization platforms;
- Experience implementing and migrating workloads in virtualized environments;
- Experience in complex IT solutions and Hybrid Cloud solution projects;
- Good understanding of IT infrastructure services is a plus;
- Strong troubleshooting skills for complex environments in failure scenarios;
- At least basic knowledge of networking and information security is an advantage;
- Hyper-V, Proxmox, and VMware experience would be an advantage;
- Experience in services outsourcing (as a customer and/or provider) is an advantage;
- 2+ years of work experience in a similar position;
- Scripting and programming experience in PowerShell/Bash is an advantage;
- Strong team communication skills, both verbal and written;
- Experience writing and preparing technical documentation;
- English: intermediate level is the minimum and is mandatory for communication with global teams;
- Industry certification focused on the relevant solution area.
Areas of responsibility include:
- Participating in deployment and IT infrastructure migration projects and Hybrid Cloud solution projects; client support;
- Consulting on migrating IT workloads in complex infrastructures;
- Presales support: articulating service value in the sales process, up-sell and cross-sell capability;
- Project documentation: technical concepts;
- Education and development in the professional area, including necessary certifications.
-
· 90 views · 4 applications · 30d
Data Engineer
Full Remote · Countries of Europe or Ukraine · 5 years of experience · B2 - Upper Intermediate · MilTech
Who We Are
OpenMinds is a cognitive defence tech company countering authoritarian influence in the battle for free and open societies. We work with over 30 governments and organisations worldwide, including Ukraine, the UK, and NATO member governments, leading StratCom agencies, and research institutions.
Our expertise lies in accessing restricted and high-risk environments, including conflict zones and closed platforms.
We combine ML technologies with deep local expertise. Our team, based in Kyiv, Lviv, London, Ottawa, and Washington, DC, includes behavioural scientists, ML/AI engineers, data journalists, communications experts, and regional specialists.
Our core values are: speed, experimentation, elegance and focus. We are expanding the team and welcome passionate, proactive, and resourceful professionals who are eager to contribute to the global fight in cognitive warfare.
Who we're looking for
OpenMinds is seeking a skilled and curious Data Engineer who's excited to design and build data systems that power meaningful insight. You'll work closely with a passionate team of behavioural scientists and ML engineers on creating a robust data infrastructure that supports everything from large-scale narrative tracking to sentiment analysis.
In the position you will:
- Take ownership of our multi-terabyte data infrastructure, from data ingestion and orchestration to transformation, storage, and lifecycle management
- Collaborate with data scientists, analysts, ML engineers, and domain experts to develop impactful data solutions
- Optimize and troubleshoot data infrastructure to ensure high performance, cost-efficiency, scalability, and resilience
- Stay up-to-date with trends in data engineering and apply modern tools and practices
- Define and implement best practices for data processing, storage, and governance
- Translate complex requirements into efficient data workflows that support threat detection and response
We are a perfect match if you have:
- 5+ years of hands-on experience as a Data Engineer, with a proven track record of leading complex data projects from design to production
- Highly skilled in SQL and Python for advanced data processing, pipeline development, and optimization
- Deep understanding of software engineering best practices, including SOLID, error handling, observability, performance tuning, and modular architecture
- Ability to write, test and deploy production-ready code
- Extensive experience in database design, data modeling, and modern data warehousing, including ETL orchestration using Airflow or equivalent
- Familiarity with Google Cloud Platform (GCP) and its data ecosystem (BigQuery, GCS, Pub/Sub, Cloud Run, Cloud Functions, Looker)
- Open-minded, capable of coming up with creative solutions and adapting to frequently changing circumstances and technological advances
- Experience in DevOps (Docker/Kubernetes, IaC, CI/CD) and MLOps
- Fluent in English with excellent communication and cross-functional collaboration skills
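To make the GCP and Airflow items above concrete, here is a minimal illustrative DAG (assuming Airflow 2.x with the Google provider installed) that loads newline-delimited JSON from GCS into BigQuery; the bucket, dataset, and table names are hypothetical.
```python
from datetime import datetime
from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator

with DAG(
    dag_id="daily_posts_load",            # hypothetical DAG and table names
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    load_posts = GCSToBigQueryOperator(
        task_id="load_posts_to_bigquery",
        bucket="example-raw-data",
        source_objects=["posts/{{ ds }}/*.json"],
        destination_project_dataset_table="analytics.posts_raw",
        source_format="NEWLINE_DELIMITED_JSON",
        write_disposition="WRITE_APPEND",
        autodetect=True,                   # let BigQuery infer the schema for this sketch
    )
```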
We offer:
- Work in a fast-growing company with proprietary AI technologies, solving the most difficult problems in the domains of social behaviour analytics and national security
- Competitive market salary
- Opportunity to present your work on tier 1 conferences, panels, and briefings behind closed doors
- Work face-to-face with world-leading experts in their fields, who are our partners and friends
- Flexible work arrangements, including adjustable hours, location, and remote/hybrid options
- Unlimited vacation and leave policies
- Opportunities for professional development within a multidisciplinary team, boasting experience from academia, tech, and intelligence sectors
- A work culture that values resourcefulness, proactivity, and independence, with a firm stance against micromanagement
-
· 24 views · 5 applications · 30d
Senior ML/GenAI Engineer
Full Remote · Ukraine · Product · 5 years of experience · B2 - Upper Intermediate
Senior ML Engineer
Full-time / Remote
About Us
ExpoPlatform is a UK-based company founded in 2013, delivering advanced technology for online, hybrid, and in-person events across 30+ countries. Our platform provides end-to-end solutions for event organizers, including registration, attendee management, event websites, and networking tools.
Role Responsibilities:
- Develop AI Agents, tools for AI Agents, and APIs as a service
- Prepare development and deployment documentation
- Participate in R&D activities of the Data Science team
Required Skills & Experience:
- 5+ years of experience with DL frameworks (PyTorch and/or TensorFlow)
- 5+ years of experience in software development in Python
- Hands-on experience with LLM, RAG, and AI Agent development
- Experience with Amazon SageMaker, Amazon Bedrock, LangChain, LangGraph, LangSmith, LlamaIndex, Hugging Face, OpenAI
- Hands-on experience using AI tools for software development to increase efficiency and code quality, and using AI tools for code review
- Knowledge of SQL, NoSQL, and vector databases
- Understanding of embedding vectors and semantic search
- Proficiency in Git (Bitbucket) and Docker
- Upper-Intermediate (B2+) or higher level of English
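As a small illustration of the embedding and semantic-search knowledge listed above, a minimal FAISS example; random vectors stand in for real model embeddings, and the dimension and corpus size are arbitrary.
```python
import numpy as np
import faiss

dim = 384
rng = np.random.default_rng(42)

# Stand-ins for document embeddings produced by a real embedding model
doc_vectors = rng.random((1000, dim), dtype=np.float32)
faiss.normalize_L2(doc_vectors)        # normalize so inner product behaves like cosine similarity

index = faiss.IndexFlatIP(dim)
index.add(doc_vectors)

# Stand-in for the embedding of an incoming query
query = rng.random((1, dim), dtype=np.float32)
faiss.normalize_L2(query)

scores, ids = index.search(query, 5)   # top-5 nearest documents and their similarity scores
print(ids[0], scores[0])
```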
Would be a plus:
- Hands-on experience with SLM and LLM fine-tuning
- Education in Data Science, Computer Science, Applied Math or similar
- AWS certifications (AWS Certified ML or equivalent)
- Experience with TypeSense
- Experience with speech recognition, speech-to-text ML models
What We Offer:
- Career growth with an international team.
- Competitive salary and financial stability.
- Flexible working hours (Mon-Fri, 8 hours).
- Free English courses and a budget for education
-
· 63 views · 12 applications · 12d
Data Engineer
Full Remote · Countries of Europe or Ukraine · 4 years of experience · B2 - Upper Intermediate
We are seeking a talented and experienced Data Engineer to join our professional services team of 50+ engineers on a full-time basis. This remote-first position requires in-depth expertise in data engineering, with a preference for experience in cloud platforms like AWS and Google Cloud. You will play a vital role in ensuring the performance, efficiency, and integrity of our customers' data pipelines while contributing to insightful data analysis and utilization.
About us: Opsfleet is a boutique services company that specializes in cloud infrastructure, data, AI, and human-behavior analytics to help organizations make smarter decisions and boost performance.
Our experts provide end-to-end solutions, from data engineering and advanced analytics to DevOps, ensuring scalable, secure, and AI-ready platforms that turn insights into action.
Role Overview
As a Data Engineer at Opsfleet, you will lead the entire data lifecycle: gathering and translating business requirements, ingesting and integrating diverse data sources, and designing, building, and orchestrating robust ETL/ELT pipelines with built-in quality checks, governance, and observability. You'll partner with data scientists to prepare, deploy, and monitor ML/AI models in production, and work closely with analysts and stakeholders to transform raw data into actionable insights and scalable intelligence.
What You'll Do
* E2E Solution Delivery: Lead the full spectrum of data projects, from requirements gathering and data ingestion through modeling, validation, and production deployment.
* Data Modeling: Develop and maintain robust logical and physical data models, such as star and snowflake schemas, to support analytics, reporting, and scalable data architectures.
* Data Analysis & BI: Transform complex datasets into clear, actionable insights; develop dashboards and reports that drive operational efficiency and revenue growth.
* ML Engineering: Implement and manage model-serving pipelines using the cloud's MLOps toolchain, ensuring reliability and monitoring in production.
* Collaboration & Research: Partner with cross-functional teams to prototype solutions, identify new opportunities, and drive continuous improvement.
What We're Looking For
Experience: 4+ years in a data-focused role (Data Engineer, BI Developer, or similar)
Technical Skills: Proficient in SQL and Python for data manipulation, cleaning, transformation, and ETL workflows. Strong understanding of statistical methods and data modeling concepts.
Soft Skills: Excellent problem-solving ability, critical thinking, and attention to detail. Outstanding written and verbal communication.
Education: BSc or higher in Mathematics, Statistics, Engineering, Computer Science, Life Science, or a related quantitative discipline.
Nice to Have
Cloud & Data Warehousing: Hands-on experience with cloud platforms (GCP, AWS, or others) and modern data warehouses such as BigQuery and Snowflake.
-
· 24 views · 2 applications · 18d
Infrastructure Engineer
Full Remote · Countries of Europe or Ukraine · 5 years of experience · C1 - Advanced
We are looking for a Senior Infrastructure Engineer to manage and improve our IT systems and cloud environments. You'll work closely with DevOps and security teams to ensure system availability and reliability.
Details:
Experience: 5 years
Schedule: Full time, remote
Start: ASAP
English: Fluent
Employment: B2B Contract
Responsibilities:
- Design, deploy, and manage infrastructure environments
- Automate deployments using Terraform, Ansible, etc.
- Monitor and improve system performance and availability
- Implement disaster recovery plans
- Support troubleshooting across environments
Requirements:
- Strong Linux administration background
- Experience with AWS, GCP, or Azure
- Proficiency with containerization tools (Docker, Kubernetes)
- Infrastructure as Code (IaC) using Terraform or similar
- Scripting skills in Python, Bash, etc.
-
· 18 views · 0 applications · 5d
Data Engineer (NLP-Focused)
Full Remote · Ukraine · Product · 3 years of experience · B1 - Intermediate
About us:
Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.
About the client:
Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.
About the role:
We are looking for a Data Engineer (NLP-Focused) to build and optimize the data pipelines that fuel the Ukrainian LLM and NLP initiatives. In this role, you will design robust ETL/ELT processes to collect, process, and manage large-scale text and metadata, enabling the Data Scientists and ML Engineers to develop cutting-edge language models.
You will work at the intersection of data engineering and machine learning, ensuring that the datasets and infrastructure are reliable, scalable, and tailored to the needs of training and evaluating NLP models in a Ukrainian language context.
Requirements:
- Education & Experience: 3+ years of experience as a Data Engineer or in a similar role, building data-intensive pipelines or platforms. A Bachelor's or Master's degree in Computer Science, Engineering, or a related field is preferred. Experience supporting machine learning or analytics teams with data pipelines is a strong advantage.
- NLP Domain Experience: Prior experience handling linguistic data or supporting NLP projects (e.g., text normalization, handling different encodings, tokenization strategies). Knowledge of Ukrainian text sources and data sets, or experience with multilingual data processing, can be an advantage given the project's focus.
Understanding of FineWeb2 or a similar processing pipeline approach.
- Data Pipeline Expertise: Hands-on experience designing ETL/ELT processes, including extracting data from various sources, using transformation tools, and loading into storage systems. Proficiency with orchestration frameworks like Apache Airflow for scheduling workflows. Familiarity with building pipelines for unstructured data (text, logs) as well as structured data.
- Programming & Scripting: Strong programming skills in Python for data manipulation and pipeline development. Experience with NLP packages (spaCy, NLTK, langdetect, fasttext, etc.). Experience with SQL for querying and transforming data in relational databases. Knowledge of Bash or other scripting for automation tasks. Writing clean, maintainable code and using version control (Git) for collaborative development.
- Databases & Storage: Experience working with relational databases (e.g., PostgreSQL, MySQL), including schema design and query optimization. Familiarity with NoSQL or document stores (e.g., MongoDB) and big data technologies (HDFS, Hive, Spark) for large-scale data is a plus. Understanding of or experience with vector databases (e.g., Pinecone, FAISS) is beneficial, as the NLP applications may require embedding storage and fast similarity search.
- Cloud Infrastructure: Practical experience with cloud platforms (AWS, GCP, or Azure) for data storage and processing. Ability to set up services such as S3/Cloud Storage, data warehouses (e.g., BigQuery, Redshift), and use cloud-based ETL tools or serverless functions. Understanding of infrastructure-as-code (Terraform, CloudFormation) to manage resources is a plus.
- Data Quality & Monitoring: Knowledge of data quality assurance practices. Experience implementing monitoring for data pipelines (logs, alerts) and using CI/CD tools to automate pipeline deployment and testing. An analytical mindset to troubleshoot data discrepancies and optimize performance bottlenecks.
- Collaboration & Domain Knowledge: Ability to work closely with data scientists and understand the requirements of machine learning projects. Basic understanding of NLP concepts and the data needs for training language models, so you can anticipate and accommodate the specific forms of text data and preprocessing they require. Good communication skills to document data workflows and to coordinate with team members across different functions.
Responsibilities:
- Design, develop, and maintain ETL/ELT pipelines for gathering, transforming, and storing large volumes of text data and related information.
- Ensure pipelines are efficient and can handle data from diverse sources (e.g., web crawls, public datasets, internal databases) while maintaining data integrity.
- Implement web scraping and data collection services to automate the ingestion of text and linguistic data from the web and other external sources. This includes writing crawlers or using APIs to continuously collect data relevant to the language modeling efforts.
- Implementation of NLP/LLM-specific data processing: cleaning and normalization of text, like filtering of toxic content, de-duplication, de-noising, detection, and deletion of personal data.
- Formation of specific SFT/RLHF datasets from existing data, including data augmentation/labeling with LLM as teacher.
- Set up and manage cloud-based data infrastructure for the project. Configure and maintain data storage solutions (data lakes, warehouses) and processing frameworks (e.g., distributed compute on AWS/GCP/Azure) that can scale with growing data needs.
- Automate data processing workflows and ensure their scalability and reliability.
- Use workflow orchestration tools like Apache Airflow to schedule and monitor data pipelines, enabling continuous and repeatable model training and evaluation cycles.
- Maintain and optimize analytical databases and data access layers for both ad-hoc analysis and model training needs.
- Work with relational databases (e.g., PostgreSQL) and other storage systems to ensure fast query performance and well-structured data schemas.
- Collaborate with Data Scientists and NLP Engineers to build data features and datasets for machine learning models.
- Provide data subsets, aggregations, or preprocessing as needed for tasks such as language model training, embedding generation, and evaluation.
- Implement data quality checks, monitoring, and alerting. Develop scripts or use tools to validate data completeness and correctness (e.g., ensuring no critical data gaps or anomalies in the text corpora), and promptly address any pipeline failures or data issues. Implement data version control.
- Manage data security, access, and compliance.
- Control permissions to datasets and ensure adherence to data privacy policies and security standards, especially when dealing with user data or proprietary text sources.
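A minimal sketch of the corpus-cleaning step described above (whitespace normalization, language filtering, exact de-duplication), assuming the langdetect package mentioned in the requirements; the sample texts are hypothetical.
```python
import hashlib
import re
from langdetect import detect
from langdetect.lang_detect_exception import LangDetectException

raw_texts = [
    "Приклад   тексту із зайвими    пробілами.",
    "Приклад тексту із зайвими пробілами.",
    "An English sentence that should be filtered out.",
]

seen_hashes = set()
clean_corpus = []

for text in raw_texts:
    normalized = re.sub(r"\s+", " ", text).strip()   # collapse repeated whitespace
    try:
        if detect(normalized) != "uk":               # keep Ukrainian-language texts only
            continue
    except LangDetectException:
        continue                                     # skip snippets the detector cannot handle
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    if digest in seen_hashes:                        # drop exact duplicates by content hash
        continue
    seen_hashes.add(digest)
    clean_corpus.append(normalized)

print(clean_corpus)
```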
The company offers:
- Competitive salary.
- Equity options in a fast-growing AI company.
- Remote-friendly work culture.
- Opportunity to shape a product at the intersection of AI and human productivity.
- Work with a passionate, senior team building cutting-edge tech for real-world business use.
-
· 44 views · 0 applications · 17d
Sales Executive (Google Cloud + Google Workspace)
Full Remote · Czechia · Product · 2 years of experience · B2 - Upper Intermediate
Cloudfresh is a Global Google Cloud Premier Partner, Zendesk Premier Partner, Asana Solutions Partner, GitLab Select Partner, Hubspot Platinum Partner, Okta Activate Partner, and Microsoft Partner.
Since 2017, we've been specializing in the implementation, migration, integration, audit, administration, support, and training for top-tier cloud solutions. Our products focus on cutting-edge cloud computing, advanced location and mapping, seamless collaboration from anywhere, unparalleled customer service, and innovative DevSecOps.
We are seeking a dynamic Sales Executive to lead our sales efforts for GCP and GWS solutions across the EMEA and CEE regions. The ideal candidate will be a high-performing A-player with experience in SaaS sales, adept at navigating complex sales environments, and driven to exceed targets through strategic sales initiatives.
Requirements:
- Fluency in English and native Czech is essential;
- 2+ years of proven sales experience in SaaS/IaaS fields, with a documented history of achieving and exceeding sales targets, particularly in enterprise sales;
- Sales experience on GCP and/or GWS specifically;
- Sales or technical certifications related to Cloud Solutions are advantageous;
- Experience in expanding new markets with outbound activities;
- Excellent communication, negotiation, and strategic planning abilities;
- Proficient in managing CRM systems and understanding their strategic importance in sales and customer relationship management.
Responsibilities:
- Develop and execute sales strategies for GCP and GWS solutions, targeting enterprise clients within the Cloud markets across EMEA and CEE;
- Identify and penetrate new enterprise market segments, leveraging GCP and GWS to improve client outcomes;
- Conduct high-level negotiations and presentations with major companies across Europe, focusing on the strategic benefits of adopting GCP and GWS solutions;
- Work closely with marketing and business development teams to align sales strategies with broader company goals;
- Continuously assess the competitive landscape and customer needs, adapting sales strategies to meet market demands and drive revenue growth.
Work conditions:
- Competitive Salary & Transparent Motivation: Receive a competitive base salary with commission on sales and performance-based bonuses, providing clear financial rewards for your success.
- Flexible Work Format: Work remotely with flexible hours, allowing you to balance your professional and personal life efficiently.
- Freedom to Innovate: Utilize multiple channels and approaches for sales, allowing you the freedom to find the best strategies for success.
- Training with Leading Cloud Products: Access in-depth training on cutting-edge cloud solutions, enhancing your expertise and equipping you with the tools to succeed in an ever-evolving industry.
- International Collaboration: Work alongside A-players and seasoned professionals in the cloud industry. Expand your expertise by engaging with international markets across the EMEA and CEE regions.
- Vibrant Team Environment: Be part of an innovative, dynamic team that fosters both personal and professional growth, creating opportunities for you to advance in your career.
- When applying to this position, you consent to the processing of your personal data by CLOUDFRESH for the purposes necessary to conduct the recruitment process, in accordance with Regulation (EU) 2016/679 of the European Parliament and of the Council of April 27, 2016 (GDPR).
- Additionally, you agree that CLOUDFRESH may process your personal data for future recruitment processes.
-
· 37 views · 3 applications · 6d
CloudOps Engineer
Full Remote · EU · Product · 4 years of experience · B1 - Intermediate
We are looking for a CloudOps Engineer to join our teams!
Requirements:
- 4+ years of experience with DevOps practices
- 3+ years of experience in public cloud platforms (AWS, GCP, GCore etc)
- Strong knowledge of Linux architecture and systems implementation
- Strong knowledge of IaC approach (Ansible, Terraform)
- Strong scripting skills in Bash, Python, or other automation languages
- Strong knowledge of cloud-based approaches
- Knowledge of Kubernetes management
- Good understanding of networking concepts and protocols
- Experience in microservices architecture, distributed systems, and scaling production environments.
- Experience/awareness of automated DevOps activities, concepts, and toolsets.
- Experience with AWS Control Tower, Config, IAM and other technologies that enable high-level administration
- Experience building and maintaining CI/CD pipelines using tools like GitLab/GitHub CI
- Experience with AWS CloudWatch, GCP Cloud Monitoring, Prometheus, Grafana for monitoring and log aggregation
- Problem-solving and troubleshooting skills, ability to analyze complex systems and identify the causes of problems
- Preferable: experience with GCP Cloud Resource management, IAM, Organization policies, and other technologies that enable high-level administration
Will be a plus:
- AWS Certified SysOps Administrator
- AWS Certified DevOps Engineer
- GCP Certified Cloud Engineer
- GCP Certified Cloud DevOps Engineer
- Similar public cloud certificates
Soft Skills:
- Team player
- Critical Thinking
- Good communicator
- Open to challenges and new opportunities
- Thirst for knowledge
- Time Management
Responsibilities:
- Support and evolution of the current public cloud infrastructure
- Automating repetitive tasks and processes in public cloud infrastructure
- Automation and improvement of current processes related to the administration and support of public clouds
- Implementation of new providers of public cloud services
- Collaborate with cross-functional teams to define cloud strategies, governance, and best practices.
- Conduct architectural assessments and provide recommendations for optimizing existing public cloud environments
Our benefits to you:
- An exciting and challenging job in a fast-growing holding, the opportunity to be part of a multicultural team of top professionals in Development, Architecture, Management, Operations, Marketing, Legal, Finance and more
- Great working atmosphere with passionate experts and leaders, sharing a friendly culture and a success-driven mindset is guaranteed
- Modern corporate equipment based on macOS or Windows and additional equipment are provided
- Paid vacations, sick leave, personal events days, days off
- Referral program: enjoy cooperation with your colleagues and get the bonus
- Educational programs: regular internal training sessions, compensation for external education, attendance of specialized global conferences
- Rewards program for mentoring and coaching colleagues
- Free internal English courses
- In-house Travel Service
- Multiple internal activities: online platform for employees with quests, gamification, presents and news, PIN-UP clubs for movie / book / pets lovers and more
- Other benefits could be added based on your location
-
· 42 views · 1 application · 19d
Senior Data Engineer
Full Remote · Ukraine · 4 years of experience · B1 - Intermediate
TJHelpers is committed to building a new generation of data specialists by combining mentorship, practical experience, and structured development through our "Helpers as a Service" model.
We're looking for a Senior Data Engineer to join our growing data team and help design, build, and optimize scalable data pipelines and infrastructure. You will work with cross-functional teams to ensure high-quality, reliable, and efficient data solutions that empower analytics, AI models, and business decision-making.
Responsibilities
- Design, implement, and maintain robust ETL/ELT pipelines for structured and unstructured data.
- Build scalable data architectures using modern tools and cloud platforms (e.g., AWS, GCP, Azure).
- Collaborate with data analysts, scientists, and engineers to deliver reliable data solutions.
- Ensure data quality, lineage, and observability across all pipelines.
- Optimize performance, scalability, and cost efficiency of data systems.
- Mentor junior engineers and contribute to establishing best practices.
Requirements
- Strong proficiency in one or more programming languages for data engineering: Python, Java, Scala, or SQL.
- Solid understanding of data modeling, warehousing, and distributed systems.
- Experience with modern data frameworks (e.g., Apache Spark, Flink, Kafka, Airflow, dbt).
- Familiarity with relational and NoSQL databases.
- Good understanding of CI/CD, DevOps practices, and agile workflows.
- Strong problem-solving skills and ability to work in cross-functional teams.
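For illustration, a minimal streaming-ingestion sketch with kafka-python and Pandas (assuming a Parquet engine such as pyarrow is installed); the topic, broker address, and event schema are hypothetical.
```python
import json
import pandas as pd
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "user-events",                                   # hypothetical topic
    bootstrap_servers=["localhost:9092"],
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    auto_offset_reset="earliest",
)

batch, batch_size, file_no = [], 1000, 0
for message in consumer:                             # runs until interrupted
    batch.append(message.value)
    if len(batch) >= batch_size:
        # flush each full batch to its own Parquet file
        pd.DataFrame(batch).to_parquet(f"events_batch_{file_no}.parquet", index=False)
        batch, file_no = [], file_no + 1
```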
Nice to Have
- Experience with cloud data services (e.g., BigQuery, Snowflake, Redshift, Databricks).
- Knowledge of containerization and orchestration (Docker, Kubernetes).
- Exposure to data governance, security, and compliance frameworks.
- Familiarity with ML/AI pipelines and MLOps practices.
We Offer
- Mentorship and collaboration with senior data architects and engineers.
- Hands-on experience in designing and scaling data platforms.
- Personal learning plan, internal workshops, and peer reviews.
- Projects with real clients across fintech, healthcare, and AI-driven industries.
- Clear growth path toward Lead Data Engineer and Data Architect roles.
-
· 28 views · 0 applications · 12d
Big Data Engineer
Full Remote · Ukraine · Product · 3 years of experience · B2 - Upper Intermediate
We are looking for a Data Engineer to build and optimize the data pipelines that fuel our Ukrainian LLM and Kyivstar's NLP initiatives. In this role, you will design robust ETL/ELT processes to collect, process, and manage large-scale text and metadata, enabling our data scientists and ML engineers to develop cutting-edge language models. You will work at the intersection of data engineering and machine learning, ensuring that our datasets and infrastructure are reliable, scalable, and tailored to the needs of training and evaluating NLP models in a Ukrainian language context. This is a unique opportunity to shape the data foundation of a pioneering AI project in Ukraine, working alongside NLP experts and leveraging modern big data technologies.
What you will do
- Design, develop, and maintain ETL/ELT pipelines for gathering, transforming, and storing large volumes of text data and related information. Ensure pipelines are efficient and can handle data from diverse sources (e.g., web crawls, public datasets, internal databases) while maintaining data integrity.
- Implement web scraping and data collection services to automate the ingestion of text and linguistic data from the web and other external sources. This includes writing crawlers or using APIs to continuously collect data relevant to our language modeling efforts.
- Implementation of NLP/LLM-specific data processing: cleaning and normalization of text, like filtering of toxic content, de-duplication, de-noising, detection, and deletion of personal data.
- Formation of specific SFT/RLHF datasets from existing data, including data augmentation/labeling with LLM as teacher.
- Set up and manage cloud-based data infrastructure for the project. Configure and maintain data storage solutions (data lakes, warehouses) and processing frameworks (e.g., distributed compute on AWS/GCP/Azure) that can scale with growing data needs.
- Automate data processing workflows and ensure their scalability and reliability. Use workflow orchestration tools like Apache Airflow to schedule and monitor data pipelines, enabling continuous and repeatable model training and evaluation cycles.
- Maintain and optimize analytical databases and data access layers for both ad-hoc analysis and model training needs. Work with relational databases (e.g., PostgreSQL) and other storage systems to ensure fast query performance and well-structured data schemas.
- Collaborate with Data Scientists and NLP Engineers to build data features and datasets for machine learning models. Provide data subsets, aggregations, or preprocessing as needed for tasks such as language model training, embedding generation, and evaluation.
- Implement data quality checks, monitoring, and alerting. Develop scripts or use tools to validate data completeness and correctness (e.g., ensuring no critical data gaps or anomalies in the text corpora), and promptly address any pipeline failures or data issues. Implement data version control.
- Manage data security, access, and compliance. Control permissions to datasets and ensure adherence to data privacy policies and security standards, especially when dealing with user data or proprietary text sources.
Qualifications and experience needed
- Education & Experience: 3+ years of experience as a Data Engineer or in a similar role, building data-intensive pipelines or platforms. A Bachelor's or Master's degree in Computer Science, Engineering, or a related field is preferred. Experience supporting machine learning or analytics teams with data pipelines is a strong advantage.
- NLP Domain Experience: Prior experience handling linguistic data or supporting NLP projects (e.g., text normalization, handling different encodings, tokenization strategies). Knowledge of Ukrainian text sources and data sets, or experience with multilingual data processing, can be an advantage given our project's focus. Understanding of FineWeb2 or a similar processing pipeline approach.
- Data Pipeline Expertise: Hands-on experience designing ETL/ELT processes, including extracting data from various sources, using transformation tools, and loading into storage systems. Proficiency with orchestration frameworks like Apache Airflow for scheduling workflows. Familiarity with building pipelines for unstructured data (text, logs) as well as structured data.
- Programming & Scripting: Strong programming skills in Python for data manipulation and pipeline development. Experience with NLP packages (spaCy, NLTK, langdetect, fasttext, etc.). Experience with SQL for querying and transforming data in relational databases. Knowledge of Bash or other scripting for automation tasks. Writing clean, maintainable code and using version control (Git) for collaborative development.
- Databases & Storage: Experience working with relational databases (e.g., PostgreSQL, MySQL), including schema design and query optimization. Familiarity with NoSQL or document stores (e.g., MongoDB) and big data technologies (HDFS, Hive, Spark) for large-scale data is a plus. Understanding of or experience with vector databases (e.g., Pinecone, FAISS) is beneficial, as our NLP applications may require embedding storage and fast similarity search.
- Cloud Infrastructure: Practical experience with cloud platforms (AWS, GCP, or Azure) for data storage and processing. Ability to set up services such as S3/Cloud Storage, data warehouses (e.g., BigQuery, Redshift), and use cloud-based ETL tools or serverless functions. Understanding of infrastructure-as-code (Terraform, CloudFormation) to manage resources is a plus.
- Data Quality & Monitoring: Knowledge of data quality assurance practices. Experience implementing monitoring for data pipelines (logs, alerts) and using CI/CD tools to automate pipeline deployment and testing. An analytical mindset to troubleshoot data discrepancies and optimize performance bottlenecks.
- Collaboration & Domain Knowledge: Ability to work closely with data scientists and understand the requirements of machine learning projects. Basic understanding of NLP concepts and the data needs for training language models, so you can anticipate and accommodate the specific forms of text data and preprocessing they require. Good communication skills to document data workflows and to coordinate with team members across different functions.
A plus would be
- Advanced Tools & Frameworks: Experience with distributed data processing frameworks (such as Apache Spark or Databricks) for large-scale data transformation, and with message streaming systems (Kafka, Pub/Sub) for real-time data pipelines. Familiarity with data serialization formats (JSON, Parquet) and handling of large text corpora.
- Web Scraping Expertise: Deep experience in web scraping, using tools like Scrapy, Selenium, or Beautiful Soup, and handling anti-scraping challenges (rotating proxies, rate limiting). Ability to parse and clean raw text data from HTML, PDFs, or scanned documents.
- CI/CD & DevOps: Knowledge of setting up CI/CD pipelines for data engineering (using GitHub Actions, Jenkins, or GitLab CI) to test and deploy changes to data workflows. Experience with containerization (Docker) to package data jobs and with Kubernetes for scaling them is a plus.
- Big Data & Analytics: Experience with analytics platforms and BI tools (e.g., Tableau, Looker) used to examine the data prepared by the pipelines. Understanding of how to create and manage data warehouses or data marts for analytical consumption.
- Problem-Solving: Demonstrated ability to work independently in solving complex data engineering problems, optimising existing pipelines, and implementing new ones under time constraints. A proactive attitude to explore new data tools or techniques that could improve our workflows.
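As a small, hedged illustration of the scraping skills mentioned above, a polite fetch-and-parse step with requests and Beautiful Soup; the URLs are hypothetical placeholders.
```python
import time
import requests
from bs4 import BeautifulSoup

def fetch_paragraphs(url: str) -> list[str]:
    response = requests.get(url, headers={"User-Agent": "corpus-crawler/0.1"}, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    # keep only non-empty visible paragraph texts
    return [p.get_text(strip=True) for p in soup.find_all("p") if p.get_text(strip=True)]

for page in ["https://example.org/articles/1", "https://example.org/articles/2"]:
    print(fetch_paragraphs(page)[:3])
    time.sleep(1)   # simple rate limiting between requests
```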
What we offer
- Office or remote: it's up to you. You can work from anywhere, and we will arrange your workplace.
- Remote onboarding.
- Performance bonuses.
- We train employees with the opportunity to learn through the company's library, internal resources, and programs from partners.
- Health and life insurance.
- Wellbeing program and corporate psychologist.
- Reimbursement of expenses for Kyivstar mobile communication.
-
· 35 views · 3 applications · 24d
Data Solutions Architect
Full Remote · Countries of Europe or Ukraine · 5 years of experience · C1 - Advanced
We are looking for you!
We are seeking a Data Solutions Architect with deep expertise in data platform design, AdTech systems integration, and data pipeline development for the advertising and media industry. This role requires strong technical knowledge in both real-time and batch data processing, with hands-on experience in building scalable, high-performance data architectures across demand-side and sell-side platforms.
As a client-facing technical expert, you will play a key role in project delivery, presales process, technical workshops, and project kick-offs, ensuring that our clients receive best-in-class solutions tailored to their business needs.
Contract type: Gig contract.
Skills and experience you can bring to this role
Qualifications & experience:
- 5+ years of experience in designing and implementing data architectures and pipelines for media and advertising industries that align with business goals, ensure scalability, security, and performance;
- Hands-on expertise with cloud-native and enterprise data platforms, including Snowflake, Databricks, and cloud-native warehousing solutions like AWS Redshift, Azure Synapse, or Google BigQuery;
- Proficiency in Python, Scala, or Java for building data pipelines and ETL workflows;
- Hands-on experience with data engineering tools and frameworks such as Apache Kafka, Apache Spark, Airflow, dbt, or Flink. Batch and stream processing architecture;
- Experience working with, and a good understanding of, relational and non-relational databases: SQL and NoSQL (document-oriented, key-value, columnar stores, etc.);
- Experience in data modelling: Ability to create conceptual, logical, and physical data models;
- Experience designing solutions for one or more cloud providers (AWS, GCP, Azure) and their data engineering services;
- Experience in client-facing technical roles, including presales, workshops, and solutioning discussions;
- Strong ability to communicate complex technical concepts to both technical and non-technical stakeholders.
Nice to have:
- Experience working with AI and machine learning teams, integration of ML models into enterprise data pipelines: model fine-tuning, RAG, MLOps, LLMOps;
- Knowledge of privacy-first architectures and data compliance standards in advertising (e.g., GDPR, CCPA);
- Knowledge of data integration tools such as Apache Airflow, Talend, Informatica, and MuleSoft for connecting disparate systems;
- Exposure to real-time bidding (RTB) systems and audience segmentation strategies.
What impact you'll make
- Architect and implement end-to-end data solutions for advertising and media clients, integrating with DSPs, SSPs, DMPs, CDPs, and other AdTech systems;
- Design and optimize data platforms, ensuring efficient data ingestion, transformation, and storage for both batch and real-time processing;
- Build scalable, secure, and high-performance data pipelines that handle large-scale structured and unstructured data from multiple sources;
- Work closely with client stakeholders to define technical requirements, guide solution designs, and align data strategies with business goals;
- Lead technical discovery sessions, workshops, and presales engagements, acting as a trusted technical advisor to clients;
- Ensure data governance, security, and compliance best practices are implemented within the data architecture;
- Collaborate with data science and machine learning teams, designing data pipelines that support model training, feature engineering, and analytics workflows.
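To illustrate the real-time side of such an architecture, a minimal Spark Structured Streaming sketch reading events from Kafka and appending them to object storage (assumes the Spark-Kafka connector package is available); the broker, topic, schema, and paths are hypothetical.
```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("adtech-stream").getOrCreate()

event_schema = StructType([
    StructField("campaign_id", StringType()),
    StructField("impression_cost", DoubleType()),
])

events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "localhost:9092")
         .option("subscribe", "impressions")
         .load()
         # Kafka delivers raw bytes; decode the value and parse the JSON payload
         .select(F.from_json(F.col("value").cast("string"), event_schema).alias("event"))
         .select("event.*")
)

query = (
    events.writeStream.format("parquet")
          .option("path", "s3a://example-bucket/streams/impressions/")
          .option("checkpointLocation", "s3a://example-bucket/checkpoints/impressions/")
          .outputMode("append")
          .start()
)
query.awaitTermination()
```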
-
· 23 views · 1 application · 26d
Senior Data Engineer (IRC278988)
Full Remote · Ukraine · 5 years of experience · B2 - Upper Intermediate
Job Description
- Strong hands-on experience with Azure Databricks (DLT Pipelines, Lakeflow Connect, Delta Live Tables, Unity Catalog, Time Travel, Delta Share) for large-scale data processing and analytics
- Proficiency in data engineering with Apache Spark, using PySpark, Scala, or Java for data ingestion, transformation, and processing
- Proven expertise in the Azure data ecosystem: Databricks, ADLS Gen2, Azure SQL, Azure Blob Storage, Azure Key Vault, Azure Service Bus/Event Hub, Azure Functions, Azure Data Factory, and Azure CosmosDB
- Solid understanding of Lakehouse architecture, Modern Data Warehousing, and Delta Lake concepts
- Experience designing and maintaining config-driven ETL/ELT pipelines with support for Change Data Capture (CDC) and event/stream-based processing
- Proficiency with RDBMS (MS SQL, MySQL, PostgreSQL) and NoSQL databases
- Strong understanding of data modeling, schema design, and database performance optimization
- Practical experience working with various file formats, including JSON, Parquet, and ORC
- Familiarity with machine learning and AI integration within the data platform context
- Hands-on experience building and maintaining CI/CD pipelines (Azure DevOps, GitLab) and automating data workflow deployments
- Solid understanding of data governance, lineage, and cloud security (Unity Catalog, encryption, access control)
- Strong analytical and problem-solving skills with attention to detail
- Excellent teamwork and communication skills
- Upper-Intermediate English (spoken and written)
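As a brief illustration of the Delta Lake concepts listed above (assuming a Spark runtime with Delta support, e.g. Databricks), a minimal write-then-time-travel example; paths and columns are hypothetical.
```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("claims-delta").getOrCreate()

claims = spark.read.json("/mnt/raw/claims/2024/")                 # hypothetical raw landing path

curated = (
    claims.withColumn("claim_amount", F.col("claim_amount").cast("double"))
          .withColumn("ingested_at", F.current_timestamp())
          .dropDuplicates(["claim_id"])
)

# Delta keeps a transaction log, which is what enables Time Travel queries later
curated.write.format("delta").mode("overwrite").save("/mnt/curated/claims/")

# read an earlier version of the same table (Time Travel)
previous = spark.read.format("delta").option("versionAsOf", 0).load("/mnt/curated/claims/")
print(previous.count())
```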
Job Responsibilities
- Design, implement, and optimize scalable and reliable data pipelines using Databricks, Spark, and Azure data services
- Develop and maintain config-driven ETL/ELT solutions for both batch and streaming data
- Ensure data governance, lineage, and compliance using Unity Catalog and Azure Key Vault
- Work with Delta tables, Delta Lake, and Lakehouse architecture to ensure efficient, reliable, and performant data processing
- Collaborate with developers, analysts, and data scientists to deliver trusted datasets for reporting, analytics, and machine learning use cases
- Integrate data pipelines with event-based and microservice architectures leveraging Service Bus, Event Hub, and Functions
- Design and maintain data models and schemas optimized for analytical and operational workloads
- Identify and resolve performance bottlenecks, ensuring cost efficiency and maintainability of data workflows
- Participate in architecture discussions, backlog refinement, estimation, and sprint planning
- Contribute to defining and maintaining best practices, coding standards, and quality guidelines for data engineering
- Perform code reviews, provide technical mentorship, and foster knowledge sharing within the team
- Continuously evaluate and enhance data engineering tools, frameworks, and processes in the Azure environment
Department/Project Description
GlobalLogic is searching for a motivated, results-driven, and innovative software engineer to join our project team at a dynamic startup specializing in pet insurance. Our client is a leading global holding company that is dedicated to developing an advanced pet insurance claims clearing solution designed to expedite and simplify the veterinary invoice reimbursement process for pet owners.
You will be working on a cutting-edge system built from scratch, leveraging Azure cloud services and adopting a low-code paradigm. The project adheres to industry best practices in quality assurance and project management, aiming to deliver exceptional results.
We are looking for an engineer who thrives in collaborative, supportive environments and is passionate about making a meaningful impact on people's lives. If you are enthusiastic about building innovative solutions and contributing to a cause that matters, this role could be an excellent fit for you.