Jobs

120
  • 27 views · 3 applications · 13d

    Senior Data Platform Engineer (Python, AWS)

    Full Remote · Poland, Ukraine · Product · 5 years of experience · B2 - Upper Intermediate

    About the Product: 

    Our client, Finaloop, is reshaping bookkeeping to fit e-commerce needs, building a fully automated, real-time accounting platform that replaces traditional bookkeeping for e-commerce and DTC brands. That means handling vast volumes of financial data with precision, scale, and zero margin for error.

    To support this, we’re investing in our platform’s core infrastructure, the foundation that powers real-time financial insight across thousands of businesses globally.

     

    About the Role

    We're seeking an outstanding and passionate Senior Data Platform Engineer to help shape Finaloop's data infrastructure at the forefront of fintech and AI.

    You'll join a high-impact R&D team in a fast-paced startup environment, building scalable pipelines and robust data systems that empower eCommerce businesses to make smarter decisions. 

     

    Key Responsibilities: 

    • Designing, building, and maintaining scalable data pipelines and ETL processes for our financial data platform
    • Developing and optimizing data infrastructure to support real-time analytics and reporting
    • Implementing data governance, security, and privacy controls to ensure data quality and compliance
    • Creating and maintaining documentation for data platforms and processes
    • Collaborating with data scientists and analysts to deliver actionable insights to our customers
    • Troubleshooting and resolving data infrastructure issues efficiently
    • Monitoring system performance and implementing optimizations
    • Staying current with emerging technologies and implementing innovative solutions

     

    Required Competence and Skills:

    • 5+ years of experience in Data Engineering or Platform Engineering roles
    • Strong programming skills in Python and SQL
    • Experience with orchestration platforms and tools (Airflow, Dagster, Temporal or similar)
    • Experience with MPP platforms (e.g., Snowflake, Redshift, Databricks)
    • Hands-on experience with cloud platforms (AWS) and their data services
    • Understanding of data modeling, data warehousing, and data lake concepts
    • Ability to optimize data infrastructure for performance and reliability
    • Experience working with containerization (Docker) environments
    • Familiarity with CI/CD concepts and principles 
    • Fluent English (written and spoken)

     

    Nice-to-have skills:

    • Experience with big data processing frameworks (Apache Spark, Hadoop)
    • Experience with stream processing technologies (Flink, Kafka, Kinesis)
    • Knowledge of infrastructure as code (Terraform) and Kubernetes
    • Experience building analytics platforms or clickstream pipelines
    • Familiarity with ML workflows and MLOps
    • Experience working in a startup environment or fintech industry

     

    The main components of our current technology stack:

    • AWS Serverless, Python, Airflow, Airbyte, Temporal, PostgreSQL, Snowflake, Kubernetes, Terraform, Docker.
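    For illustration only, here is a minimal sketch of how a daily pipeline could be orchestrated with Airflow in a stack like this; the DAG id, task names, and the extract/load callables are hypothetical placeholders, not part of the role description.

        # Minimal Airflow DAG sketch (hypothetical DAG id, tasks, and callables).
        from datetime import datetime

        from airflow import DAG
        from airflow.operators.python import PythonOperator


        def extract_orders(**context):
            # Placeholder: pull raw order data from a source system for the run date.
            print("extracting orders for", context["ds"])


        def load_to_snowflake(**context):
            # Placeholder: load the transformed batch into the warehouse.
            print("loading batch for", context["ds"])


        with DAG(
            dag_id="daily_orders_pipeline",
            start_date=datetime(2024, 1, 1),
            schedule="@daily",  # "schedule_interval" on Airflow versions before 2.4
            catchup=False,
        ) as dag:
            extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
            load = PythonOperator(task_id="load_to_snowflake", python_callable=load_to_snowflake)
            extract >> load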

     

  • 19 views · 0 applications · 19 September

    Lead Big Data Engineer

    Full Remote · Ukraine · 6 years of experience · B2 - Upper Intermediate

    Role Overview:
    As a Lead Big Data Engineer, you will combine hands-on engineering with technical leadership. You’ll be responsible for designing, developing, and optimizing Spark-based big data pipelines in Palantir Foundry, ensuring high performance, scalability, and reliability. You will also mentor and manage a team of engineers, driving best practices in big data engineering, ensuring delivery excellence, and collaborating with stakeholders to meet business needs. While our project uses Palantir Foundry, prior experience with it is a plus, but not mandatory.

    Key Responsibilities:

    • Lead the design, development, and optimization of large-scale, Spark-based (PySpark) data processing pipelines.
    • Build and maintain big data solutions using Palantir Foundry.
    • Ensure Spark workloads are tuned for performance and cost efficiency (a short PySpark sketch follows this list).
    • Oversee and participate in code reviews, architecture discussions, and best practice implementation.
    • Maintain high standards for data quality, security, and governance.
    • Manage and mentor a team of Big Data Engineers, providing technical direction.
    • Drive continuous improvement in processes, tools, and development practices.
    • Foster collaboration across engineering, data science, and product teams to align on priorities and solutions.
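    As a rough illustration of the kind of PySpark pipeline and tuning work described above, here is a minimal sketch; the S3 paths, column names, and broadcast join are assumptions for the example, not project specifics.

        # Minimal PySpark sketch (hypothetical paths and columns).
        from pyspark.sql import SparkSession, functions as F

        spark = (
            SparkSession.builder
            .appName("orders_aggregation")
            .config("spark.sql.shuffle.partitions", "200")  # tune to actual data volume
            .getOrCreate()
        )

        orders = spark.read.parquet("s3://example-bucket/raw/orders/")        # large fact table
        merchants = spark.read.parquet("s3://example-bucket/raw/merchants/")  # small dimension table

        daily_revenue = (
            orders
            .join(F.broadcast(merchants), "merchant_id")  # broadcast the small side to avoid a shuffle
            .groupBy("merchant_id", "order_date")
            .agg(F.sum("amount").alias("revenue"), F.count("*").alias("orders"))
        )

        # Partition the output by date so downstream reads can prune files.
        daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(
            "s3://example-bucket/curated/daily_revenue/"
        )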

    Requirements:

    • Bachelor’s or Master’s degree in Computer Science, Engineering, or related field.
    • 6+ years in Big Data Engineering, with at least 1-2 years in a lead (tech/team lead) role.
    • Deep hands-on expertise in Apache Spark (PySpark) for large-scale data processing.
    • Proficiency in Python and distributed computing principles.
    • Experience designing, implementing, and optimizing high-volume, low-latency data pipelines.
    • Strong leadership, communication, and stakeholder management skills.
    • Experience with Palantir Foundry is a plus, but not required.
    • Familiarity with CI/CD and infrastructure as code (Terraform, CloudFormation) is desirable.

       

    We offer*:

    • Flexible working format: remote, office-based, or a mix of both
    • A competitive salary and good compensation package
    • Personalized career growth
    • Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
    • Active tech communities with regular knowledge sharing
    • Education reimbursement
    • Memorable anniversary presents
    • Corporate events and team buildings
    • Other location-specific benefits

    *not applicable for freelancers

  • 127 views · 31 applications · 13d

    Senior Data Engineer

    Full Remote · Ukraine · 5 years of experience · B2 - Upper Intermediate

    Automat-it is where high-growth startups turn when they need to move faster, scale smarter, and make the most of the cloud. As an AWS Premier Partner and Strategic Partner, we deliver hands-on DevOps, FinOps, and GenAI support that drives real results.

     

    We work across EMEA and the US, fueling innovation and solving complex challenges daily. Join us to grow your skills, shape bold ideas, and help build the future of tech.

     

    We’re looking for a Senior Data Engineer to play a key role in building our Data & Analytics practice and delivering modern data solutions on AWS for our clients. In this role, you'll be a customer-facing, hands-on technical engineer who designs and implements end-to-end data pipelines and analytics platforms using AWS services like AWS Glue, Amazon OpenSearch Service, Amazon Redshift, and Amazon QuickSight. From migrating legacy ETL workflows to AWS Glue to building scalable data lakes for AI/ML training, you'll ensure our customers can unlock the full value of their data. You’ll work closely with client stakeholders (from startup founders and CTOs to data engineers) to create secure, cost-efficient architectures that drive real business impact.

     

    πŸ“ Work location - remote from Ukraine

    If you are interested in this opportunity, please submit your CV in English.

     

    Responsibilities

    • Design, develop, and deploy AWS-based data and analytics solutions to meet customer requirements. Ensure architectures are highly available, scalable, and cost-efficient.
    • Develop dashboards and analytics reports using Amazon QuickSight or equivalent BI tools.
    • Migrate and modernize existing data workflows to AWS. Re-architect legacy ETL pipelines to AWS Glue (a short Glue job sketch follows this list) and move on-premises data systems to Amazon OpenSearch/Redshift for improved scalability and insights.
    • Build and manage multi-modal data lakes and data warehouses for analytics and AI. Integrate structured and unstructured data on AWS (e.g. S3, Redshift) to enable advanced analytics and generative AI model training using tools like SageMaker.
    • Implement infrastructure automation and CI/CD for data projects. Use Infrastructure as Code (Terraform) and DevOps best practices to provision AWS resources and continuously integrate/deploy data pipeline code.
    • Lead customer workshops and proof-of-concepts (POCs) to demonstrate proposed solutions. Run technical sessions (architecture whiteboards, Well-Architected reviews) to validate designs and accelerate customer adoption.
    • Collaborate with engineering teams (Data Scientist, DevOps and MLOps teams) and stakeholders to deliver projects successfully. Ensure solutions follow AWS best practices and security guidelines, and guide client teams in implementing according to the plan.
    • Stay up-to-date on emerging data technologies and mentor team members. Continuously learn new AWS services (e.g. AWS Bedrock, Lake Formation) and industry trends, and share knowledge to improve our delivery as we grow the Data & Analytics practice.
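    For illustration, a minimal AWS Glue (PySpark) job skeleton of the kind such a migration typically produces might look like this; the catalog database, table name, and output path are hypothetical, and the script only runs inside a Glue job environment.

        # Minimal AWS Glue job skeleton (hypothetical catalog names and S3 path).
        import sys

        from awsglue.context import GlueContext
        from awsglue.dynamicframe import DynamicFrame
        from awsglue.job import Job
        from awsglue.utils import getResolvedOptions
        from pyspark.context import SparkContext

        args = getResolvedOptions(sys.argv, ["JOB_NAME"])
        glue_context = GlueContext(SparkContext.getOrCreate())
        job = Job(glue_context)
        job.init(args["JOB_NAME"], args)

        # Read a raw table registered in the Glue Data Catalog.
        raw = glue_context.create_dynamic_frame.from_catalog(database="raw_db", table_name="events")

        # Simple cleanup with plain Spark, then convert back to a DynamicFrame.
        cleaned_df = raw.toDF().filter("event_id IS NOT NULL").dropDuplicates(["event_id"])
        cleaned = DynamicFrame.fromDF(cleaned_df, glue_context, "cleaned")

        # Write curated Parquet to S3 so Athena / Redshift Spectrum can query it.
        glue_context.write_dynamic_frame.from_options(
            frame=cleaned,
            connection_type="s3",
            connection_options={"path": "s3://example-bucket/curated/events/"},
            format="parquet",
        )
        job.commit()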

       

    Requirements

    • 5+ years of experience in data engineering, data analytics, or a related field, including 3+ years of hands-on AWS experience (designing, building, and maintaining data solutions on AWS).
    • Production experience with AWS cloud and data services, including building solutions at scale with tools like AWS Glue, Amazon Redshift, Amazon S3, Amazon Kinesis, Amazon OpenSearch Service, etc.
    • Skilled in AWS analytics and dashboard tools – hands-on expertise with services such as Amazon QuickSight or other BI tools (Tableau, Power BI) and Amazon Athena.
    • Experience with ETL pipelines – ability to build ETL/ELT workflows (using AWS Glue, Spark, Python, SQL).
    • Experience with data warehousing and data lakes - ability to design and optimize data lakes (on S3), Amazon Redshift for data warehousing, and Amazon OpenSearch for log/search analytics.
    • Proficiency in programming (Python/PySpark) and SQL skills for data processing and analysis.
    • Understanding of cloud security and data governance best practices (encryption, IAM, data privacy).
    • Excellent communication skills with an ability to explain complex data concepts in clear terms. Comfortable working directly with clients and guiding technical discussions.
    • Proven ability to lead end-to-end technical engagements and work effectively in fast-paced, Agile environments.
    • AWS certifications – especially in Data Analytics or Machine Learning – are a plus.
    • DevOps/MLOps knowledge – experience with Infrastructure as Code (Terraform), CI/CD pipelines, containerization, and AWS AI/ML services (SageMaker, Bedrock) is a plus.

       

    Benefits

    • Professional training and certifications covered by the company (AWS, FinOps, Kubernetes, etc.)
    • International work environment
    • Referral program – enjoy cooperation with your colleagues and get a bonus 
    • Company events and social gatherings (happy hours, team events, knowledge sharing, etc.)
    • English classes
    • Soft skills training

    Country-specific benefits will be discussed during the hiring process.

     

    Automat-it is committed to fostering a workplace that promotes equal opportunities for all and believes that a diverse workforce is crucial to our success. Our recruitment decisions are based on your experience and skills, recognising the value you bring to our team.

  • 62 views · 4 applications · 6d

    Middle Data Engineer

    Full Remote · Countries of Europe or Ukraine · 4 years of experience · B1 - Intermediate

    We are helping to find a Data Engineer (Middle, Middle+) for our client, a startup: a performance marketing and traffic arbitrage team focused on scaling marketing campaigns using AI automation.
     

    About the Role:
     

    We are expanding our Data & AI team and looking for a skilled Data Engineer with a strong Python backend background who has transitioned into data engineering. This role is ideal for someone who started as a backend developer (Python) and has at least 1+ year of hands-on data engineering experience, now aiming to grow further in this domain.

    You will work closely with our current Data Engineer and AI Engineer to build scalable data platforms, pipelines, and services. This is a high-autonomy position within a young team where you’ll influence data infrastructure decisions, design systems from scratch, and help shape our data-driven foundation.
     

    Key Responsibilities: 
     

    • Design, build, and maintain data pipelines and services to support analytics, ML, and AI solutions.
    • Work with distributed systems, optimize data processing, and handle large-scale data workloads.
    • Collaborate with AI Engineers to support model integration (backend support for ML models, not full deployment responsibility).
    • Design solutions for vague or high-level business requirements with strong problem-solving skills.
    • Contribute to building a scalable data platform and help set best practices for data engineering in the company.
    • Participate in rapid prototyping (PoCs and MVPs), deploying early solutions, and iterating quickly.
       

    Requirements:
     

    • 4 years of professional experience (with at least 1 year dedicated to data engineering).
    • Strong Python backend development experience (service creation, APIs).
    • Good understanding of data processing concepts, distributed systems, and system evolution.
    • Experience with cloud platforms (AWS preferred, GCP acceptable).
    • Familiarity with Docker and containerized environments.
    • Experience with Spark, Kubernetes, and optimization of high-load systems.
    • Ability to handle loosely defined requirements, propose solutions, and work independently.
    • A proactive mindset - technical initiatives tied to business impact are highly valued.
    • English sufficient to read technical documentation (working language: Ukrainian/Russian).
       

    Nice-to-Haves:
     

    • Exposure to front-end development (JavaScript/TypeScript) - not required, but a plus.
    • Experience with scalable data architectures, stream processing, and data modeling.
    • Understanding of the business impact of technical optimizations.
       

    Team & Process:
     

    • You'll join a growing Data & AI department responsible for data infrastructure, AI agents, and analytics systems.
    • Two interview stages: a Technical Interview (Python & Data Engineering focus) and a Cultural Fit Interview (expectations, career growth, alignment).
    • Autonomy and decision-making freedom in a small, fast-moving team.
  • 106 views · 6 applications · 18d

    Middle/Senior Data Engineer

    Countries of Europe or Ukraine · Product · 3 years of experience · A2 - Elementary

    Our Mission and Vision
    At Solidgate, our mission is clear: to empower outstanding entrepreneurs to build exceptional internet companies. We exist to fuel the builders - the ones shaping the digital economy - with the financial infrastructure they deserve. We're on an ambitious journey to become the #1 payments orchestration platform in the world.
     

    Solidgate is part of Endeavor - a global community of the world's most impactful entrepreneurs. We're proud to be the first payment orchestrator from Europe to join - and to share our expertise within a network of outstanding global companies.
     

    As our processing volume is skyrocketing, the number of engineering teams is growing too - we're already at 14. This gives our Data Engineering function a whole new scale of challenges: not just building data-driven solutions, but creating products and infrastructure that empowers other teams to build them autonomously.

    That’s why we’re launching the Data Platform direction and looking for a Senior Data Engineer who will own the end-to-end construction of our Data Platform. The mission of the role is to build products that allow other teams to quickly launch, scale, and manage their own data-driven solutions independently.

    You can check out the overall tech stack of the product here https://solidgate-tech.github.io/

     

    What you’ll own


    • Build the Data Platform from scratch (architecture, design, implementation, scaling)
    • Implement a Data Lake approach and Layered Architecture (bronze → silver data layers); a short sketch of this layering follows this list
    • Integrate streaming processing into data engineering practices
    • Foster a strong engineering culture with the team and drive best practices in data quality, observability, and reliability
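    As a small, hedged illustration of the bronze → silver layering mentioned above, the sketch below reads raw JSON events from a bronze prefix, deduplicates and types them, and writes Parquet to a silver prefix; the bucket, prefixes, and schema are hypothetical (writing to s3:// with pandas also assumes s3fs and pyarrow are installed).

        # Minimal bronze -> silver sketch (hypothetical bucket, prefixes, and schema).
        import boto3
        import pandas as pd

        s3 = boto3.client("s3")
        BUCKET = "example-data-lake"

        # Bronze layer: raw, append-only JSON events kept exactly as ingested.
        listing = s3.list_objects_v2(Bucket=BUCKET, Prefix="bronze/payments/2024-06-01/")
        frames = []
        for obj in listing.get("Contents", []):
            body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"]
            frames.append(pd.read_json(body, lines=True))

        bronze = pd.concat(frames, ignore_index=True)

        # Silver layer: cleaned, deduplicated, typed records ready for analytics.
        silver = (
            bronze.dropna(subset=["payment_id", "amount"])
            .drop_duplicates(subset=["payment_id"])
            .assign(amount=lambda df: df["amount"].astype("float64"))
        )

        # Writing to s3:// via pandas assumes s3fs and pyarrow are available.
        silver.to_parquet(f"s3://{BUCKET}/silver/payments/date=2024-06-01/part-000.parquet")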

     

    You’re a great fit if you have


    • 3+ years of commercial experience as a Data Engineer
    • Strong hands-on experience building data solutions in Python
    • Confident SQL skills
    • Experience with Airflow or similar tools
    • Experience building and running DWH (BigQuery / Snowflake / Redshift)
    • Expertise in streaming stacks (Kafka / AWS Kinesis)
    • Experience with AWS infrastructure: S3, Glue, Athena
    • High attention to detail
    • Proactive, self-driven mindset
    • Continuous-learning mentality
    • Strong delivery focus and ownership in a changing environment

     

    Nice to have


    • Background as an analyst or Python developer
    • Experience with DBT, Grafana, Docker, LakeHouse approaches
     

    Why Join Solidgate?
    High-impact role. You're not inheriting a perfect system - you're building one.
    Great product. We've built a fintech powerhouse that scales fast. Solidgate isn't just an orchestration player - it's the financial infrastructure for modern Internet businesses. From subscriptions to chargeback management, fraud prevention, and indirect tax - we've got it covered.
    Massive growth opportunity. Solidgate is scaling rapidly β€” this role will be a career-defining move.
    Top-tier tech team. Work alongside our driving force β€” a proven, results-driven engineering team that delivers. We’re also early adopters of cutting-edge fraud and chargeback prevention technologies from the Schemes.
    Modern engineering culture. TBDs, code reviews, solid testing practices, metrics, alerts, and fully automated CI/CD.

    The Extras: 30+ days off, unlimited sick leave, free office meals, health coverage, and Apple gear to keep you productive. Courses, conferences, sports and wellness benefits - all designed for ideas, focus, and fun.

    Tomorrow’s fintech needs your mindset. Come build it with us.

  • 61 views · 2 applications · 7d

    Data Engineer (NLP-Focused)

    Full Remote · Ukraine · Product · 3 years of experience · B1 - Intermediate

    About us:
    Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.

    About the client:
    Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.

    About the role:
    We are looking for a Data Engineer (NLP-Focused) to build and optimize the data pipelines that fuel the Ukrainian LLM and NLP initiatives. In this role, you will design robust ETL/ELT processes to collect, process, and manage large-scale text and metadata, enabling the Data Scientists and ML Engineers to develop cutting-edge language models.

    You will work at the intersection of data engineering and machine learning, ensuring that the datasets and infrastructure are reliable, scalable, and tailored to the needs of training and evaluating NLP models in a Ukrainian language context.

    Requirements:
    - Education & Experience: 3+ years of experience as a Data Engineer or in a similar role, building data-intensive pipelines or platforms. A Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field is preferred. Experience supporting machine learning or analytics teams with data pipelines is a strong advantage.
    - NLP Domain Experience: Prior experience handling linguistic data or supporting NLP projects (e.g., text normalization, handling different encodings, tokenization strategies). Knowledge of Ukrainian text sources and data sets, or experience with multilingual data processing, can be an advantage given the project’s focus.
    Understanding of FineWeb2 or a similar processing pipeline approach.
    - Data Pipeline Expertise: Hands-on experience designing ETL/ELT processes, including extracting data from various sources, using transformation tools, and loading into storage systems. Proficiency with orchestration frameworks like Apache Airflow for scheduling workflows. Familiarity with building pipelines for unstructured data (text, logs) as well as structured data.
    - Programming & Scripting: Strong programming skills in Python for data manipulation and pipeline development. Experience with NLP packages (spaCy, NLTK, langdetect, fasttext, etc.). Experience with SQL for querying and transforming data in relational databases. Knowledge of Bash or other scripting for automation tasks. Writing clean, maintainable code and using version control (Git) for collaborative development.
    - Databases & Storage: Experience working with relational databases (e.g., PostgreSQL, MySQL), including schema design and query optimization. Familiarity with NoSQL or document stores (e.g., MongoDB) and big data technologies (HDFS, Hive, Spark) for large-scale data is a plus. Understanding of or experience with vector databases (e.g., Pinecone, FAISS) is beneficial, as the NLP applications may require embedding storage and fast similarity search.
    - Cloud Infrastructure: Practical experience with cloud platforms (AWS, GCP, or Azure) for data storage and processing. Ability to set up services such as S3/Cloud Storage, data warehouses (e.g., BigQuery, Redshift), and use cloud-based ETL tools or serverless functions. Understanding of infrastructure-as-code (Terraform, CloudFormation) to manage resources is a plus.
    - Data Quality & Monitoring: Knowledge of data quality assurance practices. Experience implementing monitoring for data pipelines (logs, alerts) and using CI/CD tools to automate pipeline deployment and testing. An analytical mindset to troubleshoot data discrepancies and optimize performance bottlenecks.
    - Collaboration & Domain Knowledge: Ability to work closely with data scientists and understand the requirements of machine learning projects. Basic understanding of NLP concepts and the data needs for training language models, so you can anticipate and accommodate the specific forms of text data and preprocessing they require. Good communication skills to document data workflows and to coordinate with team members across different functions.

    Responsibilities:
    - Design, develop, and maintain ETL/ELT pipelines for gathering, transforming, and storing large volumes of text data and related information.
    - Ensure pipelines are efficient and can handle data from diverse sources (e.g., web crawls, public datasets, internal databases) while maintaining data integrity.
    - Implement web scraping and data collection services to automate the ingestion of text and linguistic data from the web and other external sources. This includes writing crawlers or using APIs to continuously collect data relevant to the language modeling efforts.
    - Implement NLP/LLM-specific data processing: cleaning and normalization of text, such as filtering toxic content, de-duplication, de-noising, and detection and removal of personal data (a short cleaning sketch follows this list).
    - Form specific SFT/RLHF datasets from existing data, including data augmentation/labeling with an LLM as a teacher.
    - Set up and manage cloud-based data infrastructure for the project. Configure and maintain data storage solutions (data lakes, warehouses) and processing frameworks (e.g., distributed compute on AWS/GCP/Azure) that can scale with growing data needs.
    - Automate data processing workflows and ensure their scalability and reliability.
    - Use workflow orchestration tools like Apache Airflow to schedule and monitor data pipelines, enabling continuous and repeatable model training and evaluation cycles.
    - Maintain and optimize analytical databases and data access layers for both ad-hoc analysis and model training needs.
    - Work with relational databases (e.g., PostgreSQL) and other storage systems to ensure fast query performance and well-structured data schemas.
    - Collaborate with Data Scientists and NLP Engineers to build data features and datasets for machine learning models.
    - Provide data subsets, aggregations, or preprocessing as needed for tasks such as language model training, embedding generation, and evaluation.
    - Implement data quality checks, monitoring, and alerting. Develop scripts or use tools to validate data completeness and correctness (e.g., ensuring no critical data gaps or anomalies in the text corpora), and promptly address any pipeline failures or data issues. Implement data version control.
    - Manage data security, access, and compliance.
    - Control permissions to datasets and ensure adherence to data privacy policies and security standards, especially when dealing with user data or proprietary text sources.
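    For illustration, a minimal text-cleaning and deduplication sketch along the lines described above might look like this; the regexes and masking rules are deliberately simplistic assumptions, and real pipelines (e.g., FineWeb2-style processing) use much richer filtering and quality scoring.

        # Minimal corpus cleaning sketch: normalization, simple PII masking, exact dedup.
        import hashlib
        import re
        import unicodedata

        EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
        PHONE_RE = re.compile(r"\+?\d[\d\s()-]{8,}\d")


        def normalize(text: str) -> str:
            """Unicode-normalize, mask simple PII patterns, and collapse whitespace."""
            text = unicodedata.normalize("NFC", text)
            text = EMAIL_RE.sub("<EMAIL>", text)
            text = PHONE_RE.sub("<PHONE>", text)
            return re.sub(r"\s+", " ", text).strip()


        def deduplicate(documents):
            """Exact deduplication by hash of the normalized text."""
            seen, unique = set(), []
            for doc in documents:
                cleaned = normalize(doc)
                digest = hashlib.sha1(cleaned.encode("utf-8")).hexdigest()
                if cleaned and digest not in seen:
                    seen.add(digest)
                    unique.append(cleaned)
            return unique


        if __name__ == "__main__":
            corpus = ["Пишіть на test@example.com ", "Пишіть на  test@example.com"]
            print(deduplicate(corpus))  # both normalize to the same masked text -> one document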

    The company offers:
    - Competitive salary.
    - Equity options in a fast-growing AI company.
    - Remote-friendly work culture.
    - Opportunity to shape a product at the intersection of AI and human productivity.
    - Work with a passionate, senior team building cutting-edge tech for real-world business use.

  • 84 views · 2 applications · 13d

    Data Engineer (NLP-Focused)

    Full Remote · Ukraine · Product · 3 years of experience · B1 - Intermediate

    About us:
    Data Science UA is a service company with strong data science and AI expertise. Our journey began in 2016 with uniting top AI talents and organizing the first Data Science tech conference in Kyiv. Over the past 9 years, we have diligently fostered one of the largest Data Science & AI communities in Europe.

    About the client:
    Our client is an IT company that develops technological solutions and products to help companies reach their full potential and meet the needs of their users. The team comprises over 600 specialists in IT and Digital, with solid expertise in various technology stacks necessary for creating complex solutions.

    About the role:
    We are looking for a Data Engineer (NLP-Focused) to build and optimize the data pipelines that fuel the Ukrainian LLM and NLP initiatives. In this role, you will design robust ETL/ELT processes to collect, process, and manage large-scale text and metadata, enabling the Data Scientists and ML Engineers to develop cutting-edge language models.

    You will work at the intersection of data engineering and machine learning, ensuring that the datasets and infrastructure are reliable, scalable, and tailored to the needs of training and evaluating NLP models in a Ukrainian language context.

    Requirements:
    - Education & Experience: 3+ years of experience as a Data Engineer or in a similar role, building data-intensive pipelines or platforms. A Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field is preferred. Experience supporting machine learning or analytics teams with data pipelines is a strong advantage.
    - NLP Domain Experience: Prior experience handling linguistic data or supporting NLP projects (e.g., text normalization, handling different encodings, tokenization strategies). Knowledge of Ukrainian text sources and data sets, or experience with multilingual data processing, can be an advantage given the project’s focus.
    Understanding of FineWeb2 or a similar processing pipeline approach.
    - Data Pipeline Expertise: Hands-on experience designing ETL/ELT processes, including extracting data from various sources, using transformation tools, and loading into storage systems. Proficiency with orchestration frameworks like Apache Airflow for scheduling workflows. Familiarity with building pipelines for unstructured data (text, logs) as well as structured data.
    - Programming & Scripting: Strong programming skills in Python for data manipulation and pipeline development. Experience with NLP packages (spaCy, NLTK, langdetect, fasttext, etc.). Experience with SQL for querying and transforming data in relational databases. Knowledge of Bash or other scripting for automation tasks. Writing clean, maintainable code and using version control (Git) for collaborative development.
    - Databases & Storage: Experience working with relational databases (e.g., PostgreSQL, MySQL), including schema design and query optimization. Familiarity with NoSQL or document stores (e.g., MongoDB) and big data technologies (HDFS, Hive, Spark) for large-scale data is a plus. Understanding of or experience with vector databases (e.g., Pinecone, FAISS) is beneficial, as the NLP applications may require embedding storage and fast similarity search.
    - Cloud Infrastructure: Practical experience with cloud platforms (AWS, GCP, or Azure) for data storage and processing. Ability to set up services such as S3/Cloud Storage, data warehouses (e.g., BigQuery, Redshift), and use cloud-based ETL tools or serverless functions. Understanding of infrastructure-as-code (Terraform, CloudFormation) to manage resources is a plus.
    - Data Quality & Monitoring: Knowledge of data quality assurance practices. Experience implementing monitoring for data pipelines (logs, alerts) and using CI/CD tools to automate pipeline deployment and testing. An analytical mindset to troubleshoot data discrepancies and optimize performance bottlenecks.
    - Collaboration & Domain Knowledge: Ability to work closely with data scientists and understand the requirements of machine learning projects. Basic understanding of NLP concepts and the data needs for training language models, so you can anticipate and accommodate the specific forms of text data and preprocessing they require. Good communication skills to document data workflows and to coordinate with team members across different functions.

    Responsibilities:
    - Design, develop, and maintain ETL/ELT pipelines for gathering, transforming, and storing large volumes of text data and related information.
    - Ensure pipelines are efficient and can handle data from diverse sources (e.g., web crawls, public datasets, internal databases) while maintaining data integrity.
    - Implement web scraping and data collection services to automate the ingestion of text and linguistic data from the web and other external sources. This includes writing crawlers or using APIs to continuously collect data relevant to the language modeling efforts.
    - Implement NLP/LLM-specific data processing: cleaning and normalization of text, such as filtering toxic content, de-duplication, de-noising, and detection and removal of personal data.
    - Form specific SFT/RLHF datasets from existing data, including data augmentation/labeling with an LLM as a teacher.
    - Set up and manage cloud-based data infrastructure for the project. Configure and maintain data storage solutions (data lakes, warehouses) and processing frameworks (e.g., distributed compute on AWS/GCP/Azure) that can scale with growing data needs.
    - Automate data processing workflows and ensure their scalability and reliability.
    - Use workflow orchestration tools like Apache Airflow to schedule and monitor data pipelines, enabling continuous and repeatable model training and evaluation cycles.
    - Maintain and optimize analytical databases and data access layers for both ad-hoc analysis and model training needs.
    - Work with relational databases (e.g., PostgreSQL) and other storage systems to ensure fast query performance and well-structured data schemas.
    - Collaborate with Data Scientists and NLP Engineers to build data features and datasets for machine learning models.
    - Provide data subsets, aggregations, or preprocessing as needed for tasks such as language model training, embedding generation, and evaluation.
    - Implement data quality checks, monitoring, and alerting. Develop scripts or use tools to validate data completeness and correctness (e.g., ensuring no critical data gaps or anomalies in the text corpora), and promptly address any pipeline failures or data issues. Implement data version control.
    - Manage data security, access, and compliance.
    - Control permissions to datasets and ensure adherence to data privacy policies and security standards, especially when dealing with user data or proprietary text sources.

    The company offers:
    - Competitive salary.
    - Equity options in a fast-growing AI company.
    - Remote-friendly work culture.
    - Opportunity to shape a product at the intersection of AI and human productivity.
    - Work with a passionate, senior team building cutting-edge tech for real-world business use.

  • 130 views · 8 applications · 3d

    Data Solutions Architect

    Full Remote · Ukraine · 7 years of experience · B2 - Upper Intermediate

    We are currently seeking a Solution Architect who specializes in data-driven projects to become a part of our Data Practice team in Ukraine.

     

    Responsibilities

    • Architect data analytics solutions by leveraging the big data technology stack
    • Develop and present detailed technical solution architecture documents
    • Collaborate with business stakeholders to define solution requirements and explore case studies/scenarios for future solutions
    • Perform solution architecture reviews/audits, compute and present ROI
    • Manage the implementation of solutions from setting project requirements and objectives to the solution β€œgo-live”
    • Engage in the entire spectrum of pre-sale activities, including direct communication with customers, RFP processing, crafting implementation proposals, and solution architecture presentations to clients, as well as participation in technical discussions with client representatives
    • Construct and adhere to a personal education plan in technology stack and solution architecture
    • Develop a robust understanding of industry trends and best practices
    • Participate in the acquisition of new clients to expand EPAM’s business in the big data sector

     

    Requirements

    • Minimum of 7 years' experience required
    • Proficiency in hands-on roles as a Big Data Architect with a strong design/development background in Java, Scala, or Python
    • Background in delivering data analytics projects and architecture guidelines
    • Skills in big data solutions, both on-premises and on cloud platforms such as Amazon Web Services, Microsoft Azure, and Google Cloud
    • Production project experience with at least one big data technology is essential
    • Batch processing expertise: Hadoop, MapReduce/Spark/Hive
    • Understanding of NoSQL databases: Cassandra, HBase, Accumulo, Kudu
    • Familiarity with Agile development methodology, particularly Scrum
    • Competency in client communication and pre-sales business consulting for large enterprise environments
    • Experience within a consulting firm and pre-sales backgrounds are highly desirable
    • Upper-Intermediate level in English, both spoken and written (B2+)

     

  • 79 views · 13 applications · 27d

    Lead Data Engineer

    Full Remote · Worldwide · 5 years of experience · C1 - Advanced

    Mindy Support is a global provider of data collection, annotation, and curation services, partnering with leading global technology companies. 

    Our mission is to deliver high-quality, ethically sourced data that fuels the next generation of AI/ML products and solutions. We combine people, process, and technology to deliver reliable data solutions at a large scale.

    Role Overview
     

    We are seeking an experienced Data Engineer to design, build, and optimize data pipelines and infrastructure, supporting our annotation and curation workflows. 

    Beyond engineering excellence, this role requires strong client-facing skills. You will engage directly with enterprise-grade clients to gather requirements, analyze use cases, and ensure project success. You will also contribute to our internal innovation initiatives, helping shape and extend our data services offering.
     

    This role is ideal for a technical leader who can, and wants to, work at the intersection of data engineering, client engagement, and innovation.

    For the right candidate, this position offers a clear growth path to a technical leadership role at the organizational level.

     

    Key Responsibilities
     

    • Data Engineering & Infrastructure

       
      • Design, build, and maintain robust ETL/ELT pipelines for diverse data sources (text, image, audio, video) both on our and clients’ infrastructures. 
      • Develop scalable data processing systems to support data collection/generation, curation, and labeling workflows.
      • Ensure data quality, security, and compliance across projects.
      • Optimize storage, retrieval, and transformation processes for performance and cost efficiency.

         
    • Client Engagement & Project Coordination

       
      • Participate in requirements elicitation, translating client needs into technical solutions.
      • Collaborate with our project managers and operations team to align engineering solutions with project goals.
      • Help clients build proper Data Governance frameworks.
      • Communicate technical aspects, risks, and dependencies to clients and internal stakeholders.

         
    • Leadership & Innovation

       
      • Provide technical leadership to our cross-functional team.
      • Drive innovation initiatives to extend our portfolio of data services (e.g., automation tools, quality assurance workflows, synthetic data pipelines, advanced analytics, etc.).
      • Stay up to date with industry trends in Data Engineering and Data Science and with overall developments in Generative AI, Agentic AI, and Physical AI.

     

    Qualifications

    • 5+ years in data engineering, with at least 2 years in a senior or lead role;
    • Strong proficiency in Python, SQL, and one or more big data frameworks (Spark, Beam, Flink, etc.);
    • Experience with cloud platforms (preferably AWS) and data warehouse solutions (BigQuery, Snowflake, Redshift or similar);
    • Knowledge of data modeling, pipeline orchestration (Airflow, Prefect, Dagster), and API integration;
    • Familiarity with unstructured data processing (text, image, audio, video);
    • Advanced to Fluent English;
    • Excellent communication and stakeholder management abilities;
    • Strong analytical and problem-solving mindset;
    • Proven track record in collaborating with clients and cross-functional teams.

       

    Would be an advantage:

    • Experience in data annotation field
    • Prior involvement in innovation or R&D initiatives
    • Relevant professional certifications
    • Experience working with American Big Tech (Apple / Google / Amazon / Meta / Microsoft / NVIDIA / OpenAI / Anthropic / Palantir or similar companies)

       

    What We Offer

    • Opportunity to work with leading global technology companies shaping the future of AI
    • A dynamic environment that values innovation, experimentation, mutual trust and respect
    • Career growth pathways into company-wide technical leadership
    • Competitive compensation and benefits package
    • Flexible remote work arrangement
       
  • 93 views · 6 applications · 6d

    Senior/Lead Data Engineer

    Full Remote · Bulgaria, Poland, Romania, Ukraine · 5 years of experience · B2 - Upper Intermediate

    We are looking for a Senior/Lead Data Engineer who is passionate about building scalable, cloud-native data solutions and eager to grow into a broader architecture role. You will lead the technical delivery of modern data platform solutions while engaging with clients and internal stakeholders to design, implement, and optimize enterprise-grade data systems.

     

    Responsibilities

    • Lead end-to-end development and implementation of scalable data pipelines, data lakes, and cloud-native data platforms 
    • Ensure performance, security, and reliability of data solutions through CI/CD, testing, and automation best practices
    • Mentor and guide a team of engineers; review code, enforce standards, and foster knowledge sharing
    • Participate in solution design with architects and senior stakeholders - contribute to data architecture, system integrations, and technology selection
    • Translate business needs into data models, transformation logic, and orchestration workflows
    • Gradually take on architecture ownership for projects and support pre-sales with technical scoping and effort estimation
    • Work closely with business and technical stakeholders to clarify requirements, demonstrate value, and refine deliverables
    • Participate in pre-sales activities: help shape proposals, provide technical insights, and create solution roadmaps
    • Stay on top of modern data platform trends (e.g., Data Mesh, Lakehouse, real-time analytics, observability)
    • Explore new tools and techniques - contribute to N-iX’s internal Data & Analytics accelerators and reference architectures

     

    Requirements

    • 5+ years of hands-on experience in data engineering
    • Strong SQL, Python, and Spark skills
    • Strong expertise in AWS or Azure and modern data platforms (e.g., Snowflake, Databricks)
    • Knowledge of Airflow, dbt, Kafka, or similar tools preferred
    • Proven experience building data lakes, lakehouses, data warehouses, and streaming pipelines
    • Familiarity with architectural concepts like Data Mesh, Data Fabric, and cloud-native data architectures
    • Experience leading delivery teams or mentoring other engineers
    • Excellent verbal and written communication in English
    • Willingness to travel for business (when needed) and to engage directly with clients
  • 47 views · 11 applications · 10d

    Data Engineer

    Full Remote · Countries of Europe or Ukraine · Product · 3 years of experience · B2 - Upper Intermediate Ukrainian Product 🇺🇦

    We are Boosta - a holding IT company that creates, scales, and invests in digital businesses with global potential.
    - Founded in 2014
    - 600+ professionals
    - Hundreds of thousands of users worldwide
    Boosta’s portfolio includes a wide range of successful IT products, as well as projects focused on performance marketing.
    Since 2022, the company’s ecosystem has included its own investment fund, Burner, which provides funding in the formats of Private Equity and Venture Builder.

    We’re looking for a Data Engineer to join our team in the iGaming industry, where real-time insights, affiliate performance, and marketing analytics are at the center of decision-making. In this role, you’ll own and scale our data infrastructure, working across affiliate integrations, product analytics, and experimentation workflows. 
    Your primary responsibilities will include building and maintaining data pipelines, implementing automated data validation, integrating external data sources via APIs, and creating dashboards to monitor data quality, consistency, and reliability. 
    You’ll collaborate daily with the Affiliate Management team, Product Analysts, and Data Scientists to ensure the data powering our reports and models is clean, consistent, and reliable. 

    Key Responsibilities 
    ● Design, develop, and maintain ETL/ELT pipelines to transform raw, multi-source data into clean, analytics-ready tables in Google BigQuery, using tools such as dbt for modular SQL transformations, testing, and documentation 
    ● Integrate and automate affiliate data workflows, replacing manual processes in collaboration with the related stakeholders 
    ● Proactively monitor and manage data pipelines using tools such as Airflow, with proper alerting and retry mechanisms in place 
    ● Emphasize data quality, consistency, and reliability by implementing robust validation checks (a short freshness-check sketch follows this list)
    ● Build a Data Consistency Dashboard (in Looker Studio, Power BI, Tableau or Grafana) to track schema mismatches, partner anomalies, and source freshness, with built-in alerts and escalation logic 
    ● Ensure timely availability and freshness of all critical datasets, resolving latency and reliability issues quickly and sustainably 
    ● Control access to cloud resources, implement data governance policies, and ensure secure, structured access across internal teams 
    ● Monitor and optimize data infrastructure costs, particularly related to BigQuery usage, storage, and API-based ingestion 
    ● Document all pipelines, dataset structures, transformation logic, and data contracts clearly to support internal alignment and knowledge sharing 
    ● Build and maintain postback-based ingestion pipelines to support event-level tracking and attribution across the affiliate ecosystem 
    ● Collaborate closely with Data Scientists and Product Analysts to deliver high-quality, structured datasets for modeling, experimentation, and KPI reporting 
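    As a small illustration of the kind of validation/freshness check described above, the sketch below queries BigQuery for the latest event timestamp and raises if the table looks stale; the project, dataset, table, and threshold are hypothetical.

        # Minimal freshness check sketch (hypothetical project, table, and threshold).
        import datetime as dt

        from google.cloud import bigquery

        client = bigquery.Client(project="example-project")

        QUERY = """
        SELECT MAX(event_timestamp) AS latest_event
        FROM `example-project.affiliate.postbacks`
        """

        row = next(iter(client.query(QUERY).result()))
        # Note: latest_event is None for an empty table; a real check would handle that case.
        lag = dt.datetime.now(dt.timezone.utc) - row.latest_event

        if lag > dt.timedelta(hours=6):
            # In a real pipeline this would raise an Airflow alert or page the on-call channel.
            raise RuntimeError(f"postbacks table is stale: last event {lag} ago")
        print(f"postbacks table is fresh (lag: {lag})")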

    Skills & Experience 
    ● Strong proficiency in SQL and Python
    ● Experience with Google BigQuery and other GCP tools (e.g., Cloud Storage, Cloud Functions, Composer) 
    ● Proven ability to design, deploy, and scale ETL/ELT pipelines 
    ● Hands-on experience integrating and automating data from various platforms 
    ● Familiarity with postback tracking, attribution logic, and affiliate data reconciliation 
    ● Skilled in orchestration tools like Airflow or similar 
    ● Experience with visualization tools like Looker Studio, Power BI, Tableau, or Grafana for building dashboards for data quality monitoring and business needs 
    ● Experience with Git for version control and Docker 
    ● Exposure to iGaming data structures and KPIs is a strong advantage 
    ● Strong sense of data ownership, documentation, and operational excellence 
    ● Good communication skills with different stakeholders 
    ● Upper-intermediate English language proficiency 

    HOW IT WORKS
    Stage 1: CV and a short questionnaire
    Stage 2: pre-screen with a recruiter
    Stage 3: test task
    Stage 4: interview
    Stage 5: final interview
    Stage 6: reference check & offer!

    WHAT WE OFFER

    • 28 business days of paid time off
    • Flexible hours and the possibility to work remotely
    • Medical insurance and mental health care
    • Compensation for courses and trainings
    • English classes and speaking clubs
    • Internal library, educational events
    • Outstanding corporate parties and team buildings
  • 14 views · 3 applications · 7d

    ETL/RAID developer

    Full Remote · Countries of Europe or Ukraine · Product · 4 years of experience

    Kyivstar.Tech team is looking for a new colleague for the role of ETL/RAID developer

     

    What you will do

     

    Development of functionality and ensuring the operation of processes:

    • Creation of orders and aggregation processes using ETL/WEDO RAID
    • Development of processes related to data processing, interaction with systems and support of existing processes 
    • Testing processes and logic developed in streams
    • Writing and correcting Batch, JavaScript, and Python scripts; working with APIs and CSV, TXT, XML, and JSON formats
    • Administration of test environments + provision of recommendations for process changes

       

    Qualifications and experience needed

     

    • At least 4 years of experience with SQL programming and development using ETL/WEDO RAID tools
    • Knowledge of Python, Java, or similar programming languages will be an advantage

       

    What we offer

     

    • Office or remote - it's up to you: you can work from anywhere, and we will arrange your workplace
    • Remote onboarding
    • Performance bonuses for everyone (annual or quarterly - depends on the role)
    • We train employees: with the opportunity to learn through the company’s library, internal resources, and programs from partners
    • Health and life insurance
    • Wellbeing program and corporate psychologist
    • Reimbursement of expenses for Kyivstar mobile communication
  • 26 views · 0 applications · 6d

    Sales Executive (Google Cloud+Google Workspace)

    Full Remote · Czechia · Product · 2 years of experience · B2 - Upper Intermediate

    Cloudfresh ⛅️ is a Global Google Cloud Premier Partner, Zendesk Premier Partner, Asana Solutions Partner, GitLab Select Partner, Hubspot Platinum Partner, Okta Activate Partner, and Microsoft Partner.

    Since 2017, we’ve been specializing in the implementation, migration, integration, audit, administration, support, and training for top-tier cloud solutions. Our products focus on cutting-edge cloud computing, advanced location and mapping, seamless collaboration from anywhere, unparalleled customer service, and innovative DevSecOps.

    We are seeking a dynamic Sales Executive to lead our sales efforts for GCP and GWS solutions across the EMEA and CEE regions. The ideal candidate will be a high-performing A-player with experience in SaaS sales, adept at navigating complex sales environments, and driven to exceed targets through strategic sales initiatives.

    Requirements:

    • Fluency in English and native Czech is essential;
    • At least 2 years of proven sales experience in SaaS/IaaS fields, with a documented history of achieving and exceeding sales targets, particularly in enterprise sales;
    • Sales experience on GCP and/or GWS specifically;
    • Sales or technical certifications related to Cloud Solutions are advantageous;
    • Experience in expanding new markets with outbound activities;
    • Excellent communication, negotiation, and strategic planning abilities;
    • Proficient in managing CRM systems and understanding their strategic importance in sales and customer relationship management.

    Responsibilities:

    • Develop and execute sales strategies for GCP and GWS solutions, targeting enterprise clients within the Cloud markets across EMEA and CEE;
    • Identify and penetrate new enterprise market segments, leveraging GCP and GWS to improve client outcomes;
    • Conduct high-level negotiations and presentations with major companies across Europe, focusing on the strategic benefits of adopting GCP and GWS solutions;
    • Work closely with marketing and business development teams to align sales strategies with broader company goals;
    • Continuously assess the competitive landscape and customer needs, adapting sales strategies to meet market demands and drive revenue growth.

    Work conditions:

    • Competitive Salary & Transparent Motivation: Receive a competitive base salary with commission on sales and performance-based bonuses, providing clear financial rewards for your success.
    • Flexible Work Format: Work remotely with flexible hours, allowing you to balance your professional and personal life efficiently.
    • Freedom to Innovate: Utilize multiple channels and approaches for sales, allowing you the freedom to find the best strategies for success.
    • Training with Leading Cloud Products: Access in-depth training on cutting-edge cloud solutions, enhancing your expertise and equipping you with the tools to succeed in an ever-evolving industry.
    • International Collaboration: Work alongside A-players and seasoned professionals in the cloud industry. Expand your expertise by engaging with international markets across the EMEA and CEE regions.
    • Vibrant Team Environment: Be part of an innovative, dynamic team that fosters both personal and professional growth, creating opportunities for you to advance in your career.
    • When applying to this position, you consent to the processing of your personal data by CLOUDFRESH for the purposes necessary to conduct the recruitment process, in accordance with Regulation (EU) 2016/679 of the European Parliament and of the Council of April 27, 2016 (GDPR).
    • Additionally, you agree that CLOUDFRESH may process your personal data for future recruitment processes.
  • 47 views · 8 applications · 19 September

    Data Engineer

    Full Remote · EU · Product · 7 years of experience · B2 - Upper Intermediate

    Hello, fellow data engineers! We are Stellartech - an educational technology product company, and we believe in inspiration but heavily rely on data. And we are looking for a true pipeline detective and zombie process hunter!

     

    Why? Because we trust our Data Platform for daily business decisions. From “What ad platform presents us faster? Which creative media presents our value to customers in the most touching way?” to “What would our customers like to learn the most about? What can make education more enjoyable?”, we rely on numbers, metrics and stuff. But as we are open and curious, there’s a lot to collect and measure! That’s why we need to extend, improve and speed up our data platform.

     

    That’s why we need you to:

    • Build and maintain scalable data pipelines using Python and Airflow to provide data ingestion, transformation, and delivery.
    • Develop and optimize ETL/ELT workflows to ensure data quality, reliability, and performance.
    • Bring your vision and opinion to define data requirements and shape solutions to business needs.
    • Smartly monitor, relentlessly troubleshoot, and bravely resolve issues in data workflows, striving for high availability and fault tolerance.
    • Propose, advocate, and implement best practices for data storage and querying using AWS services such as S3 and Athena (a short Athena query sketch follows this list).
    • Document data workflows and processes, ensuring you don’t have to say it twice and have time for creative experiments. Sure, it’s about clarity and maintainability across the team as well.
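    For illustration, querying S3 data through Athena from Python might look roughly like the sketch below; the database, table, and results bucket are hypothetical, and production code would add backoff, timeouts, and error handling.

        # Minimal Athena query sketch via boto3 (hypothetical database, table, and output bucket).
        import time

        import boto3

        athena = boto3.client("athena", region_name="eu-central-1")

        execution = athena.start_query_execution(
            QueryString=(
                "SELECT event_date, COUNT(*) AS events "
                "FROM analytics.raw_events "
                "WHERE event_date = DATE '2024-06-01' "
                "GROUP BY event_date"
            ),
            QueryExecutionContext={"Database": "analytics"},
            ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
        )
        query_id = execution["QueryExecutionId"]

        # Poll until the query finishes; production code would add backoff and a timeout.
        while True:
            status = athena.get_query_execution(QueryExecutionId=query_id)
            state = status["QueryExecution"]["Status"]["State"]
            if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
                break
            time.sleep(1)

        if state == "SUCCEEDED":
            rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
            print(rows)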

     

    For that, we suppose you’d be keen on

    • AWS services such as S3, Kinesis, Athena, and others.
    • dbt and Airflow for data pipeline and workflow management.
    • Application of data architecture, ETL/ELT processes, and data modeling.
    • Advanced SQL and Python programming.
    • Monitoring tools and practices to ensure data pipeline reliability.
    • CI/CD pipelines and DevOps practices for data platforms.
    • Monitoring and optimizing platform performance at scale.

     

    Will be nice to 

    • Understand cloud services (we use AWS), advances, trade-offs, and perspectives.
    • Apply an analytical approach and consider future perspectives in system design in your daily practice and technical decisions

     

    Why You'll Love Working With Us:

    • Impactful Work: Your contributions will directly shape the future of our company.
    • Innovative Environment: We're all about trying new things and pushing the envelope in EdTech.
    • Freedom: a flexible role, either fully remote or hybrid from one of our offices in Cyprus or Poland.
    • Health: we offer a Health Insurance package for hybrid mode (Cyprus, Poland) and a health corner in the Cyprus office.
    • AI solutions - GPT chatbot / ChatGPT subscription and other tools.
    • Wealth: we offer a competitive salary.
    • Balance: flexible paid time off, you get 21 days of annual leave + 10 bank holidays.
    • Collaborative Culture: Work alongside passionate professionals who are as driven as you are.

     

  • 44 views · 7 applications · 19 September

    Senior Data Engineer

    Full Remote · Countries of Europe or Ukraine · 5 years of experience · B2 - Upper Intermediate

    At Uvik Software, we assemble high-performing product teams and ship scalable solutions for global brands. We're hiring an experienced Data Engineer for a long-term B2C product, with a core focus on Zero-ETL data flows - moving data where it needs to be with minimal friction and maximum reliability.

    What you’ll do

    • Design & run Zero-ETL pipelines that are resilient, observable, and cost-efficient
    • Model and optimize data lakes/warehouses on AWS (Glue, Firehose, Lambda, SageMaker); a short ingestion sketch follows this list
    • Work with structured & unstructured data; enforce data quality, lineage, and consistency
    • Tune Spark/SQL/Python jobs for speed, scalability, and lower compute spend
    • Partner with engineers, analysts, and stakeholders to deliver data products that drive decisions
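    As a hedged illustration of a Zero-ETL-style ingestion path, the sketch below shows a Lambda handler pushing events straight into a Kinesis Data Firehose delivery stream that lands in the lake; the stream name and event shape are assumptions, not details of this role.

        # Minimal ingestion sketch: Lambda handler forwarding events to Kinesis Data Firehose
        # (hypothetical delivery stream name and event shape).
        import json

        import boto3

        firehose = boto3.client("firehose")
        STREAM = "example-events-to-s3"


        def handler(event, context):
            """Forward incoming events to Firehose without an intermediate ETL hop."""
            records = [
                {"Data": (json.dumps(item) + "\n").encode("utf-8")}
                for item in event.get("items", [])
            ]
            if not records:
                return {"forwarded": 0, "failed": 0}
            response = firehose.put_record_batch(DeliveryStreamName=STREAM, Records=records)
            # FailedPutCount > 0 means some records should be retried.
            return {"forwarded": len(records), "failed": response["FailedPutCount"]}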

    What we’re looking for

    • 5+ years in Data Engineering
    • Advanced Spark, Python, SQL skills
    • Hands-on with AWS Glue, Kinesis Firehose, Lambda, SageMaker
    • Experience with ETL/ELT tooling (dbt, Airflow, etc.)
    • B2C domain background is a strong plus
    • Bonus: JavaScript familiarity and/or Data Science exposure
    • Degree in CS (nice to have, not required)

    Why Uvik

    • Remote-first: work from anywhere
    • Supportive senior team and a culture of ownership
    • Flexible schedule with clear growth paths
    • Access to courses, conferences, and certifications

    If you’re passionate about building fast, reliable, Zero-ETL data platforms that power real user experiences, we’d love to meet you.

    Apply now - let's build something impactful together.

     
