Jobs
· 19 views · 0 applications · 6d
Big Data Engineer
Full Remote · Ukraine · Product · 3 years of experience · B2 - Upper Intermediate
We are looking for a Data Engineer to build and optimize the data pipelines that fuel our Ukrainian LLM and Kyivstar's NLP initiatives. In this role, you will design robust ETL/ELT processes to collect, process, and manage large-scale text and metadata, enabling our data scientists and ML engineers to develop cutting-edge language models. You will work at the intersection of data engineering and machine learning, ensuring that our datasets and infrastructure are reliable, scalable, and tailored to the needs of training and evaluating NLP models in a Ukrainian language context. This is a unique opportunity to shape the data foundation of a pioneering AI project in Ukraine, working alongside NLP experts and leveraging modern big data technologies.
What you will do
- Design, develop, and maintain ETL/ELT pipelines for gathering, transforming, and storing large volumes of text data and related information. Ensure pipelines are efficient and can handle data from diverse sources (e.g., web crawls, public datasets, internal databases) while maintaining data integrity.
- Implement web scraping and data collection services to automate the ingestion of text and linguistic data from the web and other external sources. This includes writing crawlers or using APIs to continuously collect data relevant to our language modeling efforts.
- Implement NLP/LLM-specific data processing: text cleaning and normalization, such as filtering toxic content, de-duplication, de-noising, and detection and removal of personal data (an illustrative sketch follows this list).
- Build SFT/RLHF datasets from existing data, including data augmentation and labeling with an LLM as a teacher.
- Set up and manage cloud-based data infrastructure for the project. Configure and maintain data storage solutions (data lakes, warehouses) and processing frameworks (e.g., distributed compute on AWS/GCP/Azure) that can scale with growing data needs.
- Automate data processing workflows and ensure their scalability and reliability. Use workflow orchestration tools like Apache Airflow to schedule and monitor data pipelines, enabling continuous and repeatable model training and evaluation cycles.
- Maintain and optimize analytical databases and data access layers for both ad-hoc analysis and model training needs. Work with relational databases (e.g., PostgreSQL) and other storage systems to ensure fast query performance and well-structured data schemas.
- Collaborate with Data Scientists and NLP Engineers to build data features and datasets for machine learning models. Provide data subsets, aggregations, or preprocessing as needed for tasks such as language model training, embedding generation, and evaluation.
- Implement data quality checks, monitoring, and alerting. Develop scripts or use tools to validate data completeness and correctness (e.g., ensuring no critical data gaps or anomalies in the text corpora), and promptly address any pipeline failures or data issues. Implement data version control.
- Manage data security, access, and compliance. Control permissions to datasets and ensure adherence to data privacy policies and security standards, especially when dealing with user data or proprietary text sources.
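As a rough illustration of the text cleaning and de-duplication work described above, here is a minimal Python sketch. It assumes the langdetect package is available, that records arrive as plain strings, and that regex-based masking of emails and phone numbers is an acceptable first pass at personal-data removal; it is not Kyivstar's actual pipeline.

```python
import hashlib
import re

from langdetect import detect  # language identification, as named in the requirements

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s()-]{8,}\d")


def clean_corpus(texts, lang="uk"):
    """Yield de-duplicated, language-filtered, PII-masked documents."""
    seen = set()
    for text in texts:
        normalized = " ".join(text.split())                  # collapse whitespace
        digest = hashlib.sha256(normalized.encode()).hexdigest()
        if not normalized or digest in seen:                 # drop empty and exact duplicates
            continue
        seen.add(digest)
        try:
            if detect(normalized) != lang:                   # keep only the target language
                continue
        except Exception:                                    # langdetect raises on ambiguous input
            continue
        masked = EMAIL_RE.sub("<EMAIL>", normalized)         # crude PII masking
        masked = PHONE_RE.sub("<PHONE>", masked)
        yield masked


# Usage (hypothetical): cleaned = list(clean_corpus(raw_documents))
```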
Qualifications and experience needed
- Education & Experience: 3+ years of experience as a Data Engineer or in a similar role, building data-intensive pipelines or platforms. A Bachelor's or Master's degree in Computer Science, Engineering, or a related field is preferred. Experience supporting machine learning or analytics teams with data pipelines is a strong advantage.
- NLP Domain Experience: Prior experience handling linguistic data or supporting NLP projects (e.g., text normalization, handling different encodings, tokenization strategies). Knowledge of Ukrainian text sources and data sets, or experience with multilingual data processing, can be an advantage given our project's focus. Understanding of FineWeb2 or a similar processing pipeline approach.
- Data Pipeline Expertise: Hands-on experience designing ETL/ELT processes, including extracting data from various sources, using transformation tools, and loading into storage systems. Proficiency with orchestration frameworks like Apache Airflow for scheduling workflows. Familiarity with building pipelines for unstructured data (text, logs) as well as structured data.
- Programming & Scripting: Strong programming skills in Python for data manipulation and pipeline development. Experience with NLP packages (spaCy, NLTK, langdetect, fasttext, etc.). Experience with SQL for querying and transforming data in relational databases. Knowledge of Bash or other scripting for automation tasks. Writing clean, maintainable code and using version control (Git) for collaborative development.
- Databases & Storage: Experience working with relational databases (e.g., PostgreSQL, MySQL), including schema design and query optimization. Familiarity with NoSQL or document stores (e.g., MongoDB) and big data technologies (HDFS, Hive, Spark) for large-scale data is a plus. Understanding of or experience with vector databases (e.g., Pinecone, FAISS) is beneficial, as our NLP applications may require embedding storage and fast similarity search.
- Cloud Infrastructure: Practical experience with cloud platforms (AWS, GCP, or Azure) for data storage and processing. Ability to set up services such as S3/Cloud Storage, data warehouses (e.g., BigQuery, Redshift), and use cloud-based ETL tools or serverless functions. Understanding of infrastructure-as-code (Terraform, CloudFormation) to manage resources is a plus.
- Data Quality & Monitoring: Knowledge of data quality assurance practices. Experience implementing monitoring for data pipelines (logs, alerts) and using CI/CD tools to automate pipeline deployment and testing. An analytical mindset to troubleshoot data discrepancies and optimize performance bottlenecks.
- Collaboration & Domain Knowledge: Ability to work closely with data scientists and understand the requirements of machine learning projects. Basic understanding of NLP concepts and the data needs for training language models, so you can anticipate and accommodate the specific forms of text data and preprocessing they require. Good communication skills to document data workflows and to coordinate with team members across different functions.
A plus would be
- Advanced Tools & Frameworks: Experience with distributed data processing frameworks (such as Apache Spark or Databricks) for large-scale data transformation, and with message streaming systems (Kafka, Pub/Sub) for real-time data pipelines. Familiarity with data serialization formats (JSON, Parquet) and handling of large text corpora.
- Web Scraping Expertise: Deep experience in web scraping, using tools like Scrapy, Selenium, or Beautiful Soup, and handling anti-scraping challenges (rotating proxies, rate limiting). Ability to parse and clean raw text data from HTML, PDFs, or scanned documents.
- CI/CD & DevOps: Knowledge of setting up CI/CD pipelines for data engineering (using GitHub Actions, Jenkins, or GitLab CI) to test and deploy changes to data workflows. Experience with containerization (Docker) to package data jobs and with Kubernetes for scaling them is a plus.
- Big Data & Analytics: Experience with analytics platforms and BI tools (e.g., Tableau, Looker) used to examine the data prepared by the pipelines. Understanding of how to create and manage data warehouses or data marts for analytical consumption.
- Problem-Solving: Demonstrated ability to work independently in solving complex data engineering problems, optimising existing pipelines, and implementing new ones under time constraints. A proactive attitude to explore new data tools or techniques that could improve our workflows.
What we offer
- Office or remote - it's up to you. You can work from anywhere, and we will arrange your workplace.
- Remote onboarding.
- Performance bonuses.
- We train employees, with opportunities to learn through the company's library, internal resources, and programs from partners.
- Health and life insurance.
- Wellbeing program and corporate psychologist.
- Reimbursement of expenses for Kyivstar mobile communication.
· 23 views · 2 applications · 2d
Data Engineer
Ukraine · 5 years of experience · B2 - Upper Intermediate
On behalf of our Client from France, Mobilunity is looking for a Senior Data Engineer for a 3-month engagement.
Our client is a table management software and CRM that enables restaurant owners to welcome their customers easily. The app helps manage booking requests and register new bookings. You can view all your bookings, day after day, wherever you are, and optimize your restaurant's occupation rate. Our client offers a commission-free booking solution that guarantees freedom above all. New technologies thus become restaurateurs' best allies for saving time and gaining customers while ensuring a direct relationship with them.
Their goal is to become the #1 growth platform for Restaurants. They believe that restaurants have become lifestyle brands, and with forward-thinking digital products, restaurateurs will create the same perfect experience online as they already do offline, resulting in a more valuable, loyalty-led business.
Our client is looking for a Senior Engineer to align key customer data across Salesforce, Chargebee, Zendesk, other tools, and their Back-office. The goal is a dedicated, historized "Customer 360" table at restaurant and contact levels that exposes discrepancies and gaps, supports updates/cleaning across systems where appropriate, and includes monitoring and Slack alerts.
Tech Stack: Databricks (Delta/Unity Catalog), Python, SQL, Slack.
Responsibilities:
- Design and build a consolidated Customer 360 table in Databricks that links entities across Salesforce, Chargebee, Zendesk, and Back-office (entity resolution, deduplication, survivorship rules) - an illustrative sketch follows this list
- Implement data cleaning and standardization rules; where safe and approved, update upstream systems via Python/API
- Historize customer attributes to track changes over time
- Create robust data quality checks (completeness, consistency across systems, referential integrity, unexpected changes) and surface issues via Slack alerts
- Establish operational monitoring: freshness SLAs, job success/failure notifications
- Document schemas, matching logic, cleaning rules, and alert thresholds; define ownership and escalation paths
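Purely as a sketch of the deduplication/survivorship and alerting steps listed above, something along these lines could run on Databricks with PySpark. The table names, columns, survivorship rule, and Slack webhook URL are hypothetical placeholders, not the client's actual schema or setup.

```python
import json
import urllib.request

from pyspark.sql import SparkSession, Window, functions as F

spark = SparkSession.builder.getOrCreate()

raw = spark.table("bronze.customer_accounts")   # hypothetical unified staging table

# Illustrative survivorship rule: prefer the most recently updated record per
# restaurant_id, with Salesforce winning ties over the other source systems.
source_rank = F.when(F.col("source") == "salesforce", 1).otherwise(2)
w = Window.partitionBy("restaurant_id").orderBy(source_rank, F.col("updated_at").desc())

golden = (
    raw.withColumn("rn", F.row_number().over(w))
       .filter("rn = 1")
       .drop("rn")
)
golden.write.mode("overwrite").saveAsTable("gold.customer_360")

# Simple data-quality check surfaced via a Slack incoming webhook.
missing_email = golden.filter(F.col("email").isNull()).count()
if missing_email > 0:
    payload = {"text": f"Customer 360: {missing_email} records missing email"}
    req = urllib.request.Request(
        "https://hooks.slack.com/services/XXX",   # placeholder webhook URL
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)
```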
Requirements:
- 5+ years in data engineering/analytics engineering with strong Python/SQL skills
- Hands-on experience with Databricks (Delta, SQL, PySpark optional) and building production data models
- Experience integrating with external SaaS APIs (e.g., Salesforce REST/Bulk, Zendesk, Chargebee) including auth, rate limiting, retries, and idempotency
- Solid grasp of entity resolution, deduplication, and survivorship strategies; strong SQL
- Experience implementing data quality checks and alerting (Slack/webhooks or similar)
- Securityβminded when handling PII (access control, minimization, logging)
- Proficient with Git and PR-based workflows (Databricks Repos, code review, versioning)
- Upper-Intermediate, close to Advanced English
Nice to have:
- Experience with Databricks (Delta/Unity Catalog)
- Background in MDM/Golden Record/Customer 360 initiatives
- Experience with CI/CD for data (tests, code review, environments) and Databricks Jobs for scheduling
Success Criteria (by end of engagement):
- Production Customer 360 table with documented matching logic and survivorship rules
- Data is cleaned and consistent across systems where business rules permit; change history persisted
- Automated data quality checks and Slack alerts in place; clear runbooks for triage
- Documentation and ownership model delivered; stakeholders can self-serve the aligned view
In return we offer:
- The friendliest community of like-minded IT-people
- Open knowledge-sharing environment - exclusive access to a rich pool of colleagues willing to share their endless insights into the broadest variety of modern technologies
- Perfect office location in the city center (900 m from Lukyanivska metro station, with a green and spacious neighborhood) or remote engagement: choose whichever is convenient for you, with the possibility to combine both
- No open-space setup - separate rooms for every team's comfort and multiple lounge and gaming zones
- English classes in 1-to-1 & group modes with elements of gamification
- Neverending fun: sports events, tournaments, music band, multiple affinity groups
Come on board, and let's grow together!
· 29 views · 5 applications · 2d
Middle Data Engineer
Full Remote · EU · Product · 3 years of experience · B1 - Intermediate
FAVBET Tech develops software that is used by millions of players around the world for the international company FAVBET Entertainment.
We develop innovations in the field of gambling and betting through a complex multi-component platform which is capable of withstanding enormous loads and providing a unique experience for players.
FAVBET Tech does not organize and conduct gambling on its platform. Its main focus is software development.
We are looking for a Middle/Senior Data Engineer to join our Data Integration Team.
Main areas of work:
- Betting/Gambling Platform Software Development - software development that is easy to use and personalized for each customer.
- Highload Development - development of highly loaded services and systems.
- CRM System Development - development of a number of services to ensure a high level of customer service, effective engagement of new customers, and retention of existing ones.
- Big Data - development of complex systems for processing and analysis of big data.
- Cloud Services - we use cloud technologies for scaling and business efficiency.
Responsibilities:
- Design, build, install, test, and maintain highly scalable data management systems.
- Develop ETL/ELT processes and frameworks for efficient data transformation and loading.
- Implement, optimize, and support reporting solutions for the Sportsbook domain.
- Ensure effective storage, retrieval, and management of large-scale data.
- Improve data query performance and overall system efficiency.
- Collaborate closely with data scientists and analysts to deliver data solutions and actionable insights.
Requirements:
- At least 2 years of experience in designing and implementing modern data integration solutions.
- Master's degree in Computer Science or a related field.
- Proficiency in Python and SQL, particularly for data engineering tasks.
- Hands-on experience with data processing, ETL (Extract, Transform, Load), ELT (Extract, Load, Transform) processes, and data pipeline development.
- Experience with DBT framework and Airflow orchestration.
- Practical experience with both SQL and NoSQL databases (e.g., PostgreSQL, MongoDB).
- Experience with Snowflake.
- Working knowledge of cloud services, particularly AWS (S3, Glue, Redshift, Lambda, RDS, Athena).
- Experience in managing data warehouses and data lakes.
- Familiarity with star and snowflake schema design.
- Understanding of the difference between OLAP and OLTP.
Would be a plus:
- Experience with other cloud data services (e.g., AWS Redshift, Google BigQuery).
- Experience with version control tools (e.g., GitHub, GitLab, Bitbucket).
- Experience with real-time data processing (e.g., Kafka, Flink).
- Familiarity with orchestration tools (e.g., Airflow, Luigi).
- Experience with monitoring and logging tools (e.g., ELK Stack, Prometheus, CloudWatch).
- Knowledge of data security and privacy practices.
We offer:
- 30 days off - we value rest and recreation;
- Medical insurance for employees, the possibility of training at the company's expense, and gym membership;
- Remote work or the opportunity to work from our own modern loft office with a spacious workplace and brand-new work equipment (near Pochaina metro station);
- Flexible work schedule - we expect a full-time commitment but do not track your working hours;
- Flat hierarchy without micromanagement - our doors are open, and all teammates are approachable.
During the war, the company actively supports the Ministry of Digital Transformation of Ukraine in its initiative to deploy an IT army and has already organized its own cyber warfare unit, which delivers crushing blows to the enemy's IT infrastructure 24/7, coordinates with other cyber volunteers, and plans offensive actions on its IT front line.
· 35 views · 0 applications · 7d
Lead Data Engineer IRC277440
Full Remote · Ukraine · 7 years of experience · B2 - Upper Intermediate
The GlobalLogic technology team is focused on next-generation health capabilities that align with the client's mission and vision to deliver Insight-Driven Care. This role operates within the Health Applications & Interoperability subgroup of our broader team, with a focus on patient engagement, care coordination, AI, healthcare analytics, and interoperability. These advanced technologies enhance our product portfolio with new services while improving clinical and patient experiences.
As part of the GlobalLogic team, you will grow, be challenged, and expand your skill set working alongside highly experienced and talented people.
If this sounds like an exciting opportunity for you, send over your CV!
Requirements
MUST HAVE
- AWS Platform: Working experience with AWS data technologies, including S3 and AWS SageMaker (SageMaker Unified is a plus)
- Programming Languages: Strong programming skills in Python
- Data Formats: Experience with JSON, XML and other relevant data formats
- CI/CD Tools: experience setting up and managing CI/CD pipelines using GitLab CI, Jenkins, or similar tools
- Scripting and Automation: Experience in scripting languages such as Python, PowerShell, etc.
- Monitoring and Logging: Familiarity with monitoring & logging tools like CloudWatch, ELK, Dynatrace, Prometheus, etc.
- Source Code Management: Expertise with git commands and associated VCS (Gitlab, Github, Gitea or similar)
- Documentation: Experience with markdown and, in particular, Antora for creating technical documentation
NICE TO HAVE
Previous Healthcare or Medical Device experience
Healthcare Interoperability Tools: Previous experience with integration engines such as Intersystems, Lyniate, Redox, Mirth Connect, etc.
Other data technologies, such as Snowflake, Trino/Starburst
Experience working with Healthcare Data, including HL7v2, FHIR and DICOM
FHIR and/or HL7 Certifications
Building software classified as Software as a Medical Device (SaMD)
Understanding of EHR technologies such as EPIC, Cerner, etc.
Experience implementing enterprise-grade cyber security & privacy by design in software products
Experience working in Digital Health software
Experience developing global applications
Strong understanding of SDLC - Waterfall & Agile methodologies
Experience leading software development teams onshore and offshore
Job responsibilities
- Develops, documents, and configures systems specifications that conform to defined architecture standards and address business requirements and processes in cloud development & engineering.
- Involved in planning of system and development deployment, as well as responsible for meeting compliance and security standards.
- API development using AWS services in a scalable, microservices-based architecture.
- Actively identifies system functionality or performance deficiencies, executes changes to existing systems, and tests functionality of the system to correct deficiencies and maintain more effective data handling, data integrity, conversion, input/output requirements, and storage.
- May document testing and maintenance of system updates, modifications, and configurations.
- May act as a liaison with key technology vendor technologists or other business functions.
- Function Specific: Strategically design technology solutions that meet the needs and goals of the company and its customers/users.
- Leverages platform process expertise to assess whether existing standard platform functionality will solve a business problem or a customisation solution would be required.
- Test the quality of a product and its ability to perform a task or solve a problem.
- Perform basic maintenance and performance optimisation procedures in each of the primary operating systems.
- Ability to document detailed technical system specifications based on business system requirements.
- Ensures system implementation compliance with global & local regulatory and security standards (e.g. HIPAA, SOCII, ISO27001, etc.).
· 31 views · 5 applications · 29d
Python Cloud Engineer
Full Remote · Ukraine · 3 years of experience · B2 - Upper Intermediate
Our partners are building cloud-native backend services, APIs, and background systems designed for scalability, reliability, and high performance. Projects span consumer devices, energy, healthcare, and beyond, combining regulated requirements with rapid time-to-market and often bringing together a variety of technologies in a single project. You will develop services that process real-time device data, integrate multiple systems, handle high-volume cloud workloads, and power applications across diverse use cases. Make a direct impact by contributing to complex systems that drive innovation across industries.
Necessary skills and qualifications
- At least 3 years of commercial experience with Python frameworks (FastAPI, Django REST, etc.)
- Experience with relational/non-relational databases
- Strong knowledge of the Object-Oriented Analysis and Design (OOAD) principles
- Hands-on experience with application performance optimization and low-level debugging
- Practical AWS/Azure engineering experience: creating and securing resources, not just consuming them from code
- Experience with containers and orchestration (Docker, Kubernetes)
- Good knowledge of the HTTP protocol
- Proactive position in solution development and process improvements
- Ability to cooperate with customers and teammates
- Upper-Intermediate English level
Will be a plus
- Experience with any other back-end technologies
- Knowledge of communication protocols: MQTT/XMPP/AMQP/RabbitMQ/WebSockets
- Ability to research new technological areas and understand them in depth through self-directed learning
- Skilled in IoT data collection, managing device fleets, and implementing OTA updates
- Familiarity with healthcare data standards (e.g., FHIR, HL7) and HIPAA/GDPR compliance
- Expertise in documenting technical solutions in different formats
· 59 views · 11 applications · 29d
Data Solutions Architect
Full Remote · Colombia, Poland, Ukraine · 5 years of experience · C1 - Advanced
We are looking for you!
We are seeking a Data Solutions Architect with deep expertise in data platform design, AdTech systems integration, and data pipeline development for the advertising and media industry. This role requires strong technical knowledge in both real-time and batch data processing, with hands-on experience in building scalable, high-performance data architectures across demand-side and sell-side platforms.
As a client-facing technical expert, you will play a key role in project delivery, presales process, technical workshops, and project kick-offs, ensuring that our clients receive best-in-class solutions tailored to their business needs.
Contract type: Gig contract.
Skills and experience you can bring to this role
Qualifications & experience:
- 5+ years of experience in designing and implementing data architectures and pipelines for media and advertising industries that align with business goals, ensure scalability, security, and performance;
- Hands-on expertise with cloud-native and enterprise data platforms, including Snowflake, Databricks, and cloud-native warehousing solutions like AWS Redshift, Azure Synapse, or Google BigQuery;
- Proficiency in Python, Scala, or Java for building data pipelines and ETL workflows;
- Hands-on experience with data engineering tools and frameworks such as Apache Kafka, Apache Spark, Airflow, dbt, or Flink. Batch and stream processing architecture;
- Experience working with, and a good understanding of, relational and non-relational databases (SQL and NoSQL: document-oriented, key-value, columnar stores, etc.);
- Experience in data modelling: Ability to create conceptual, logical, and physical data models;
- Experience designing solutions for one or more cloud providers (AWS, GCP, Azure) and their data engineering services;
- Experience in client-facing technical roles, including presales, workshops, and solutioning discussions;
- Strong ability to communicate complex technical concepts to both technical and non-technical stakeholders.
Nice to have:
- Experience working with AI and machine learning teams, integration of ML models into enterprise data pipelines: model fine-tuning, RAG, MLOps, LLMOps;
- Knowledge of privacy-first architectures and data compliance standards in advertising (e.g., GDPR, CCPA);
- Knowledge of data integration tools such as Apache Airflow, Talend, Informatica, and MuleSoft for connecting disparate systems;
- Exposure to real-time bidding (RTB) systems and audience segmentation strategies.
What impact you'll make
- Architect and implement end-to-end data solutions for advertising and media clients, integrating with DSPs, SSPs, DMPs, CDPs, and other AdTech systems;
- Design and optimize data platforms, ensuring efficient data ingestion, transformation, and storage for both batch and real-time processing;
- Build scalable, secure, and high-performance data pipelines that handle large-scale structured and unstructured data from multiple sources;
- Work closely with client stakeholders to define technical requirements, guide solution designs, and align data strategies with business goals;
- Lead technical discovery sessions, workshops, and presales engagements, acting as a trusted technical advisor to clients;
- Ensure data governance, security, and compliance best practices are implemented within the data architecture;
- Collaborate with data science and machine learning teams, designing data pipelines that support model training, feature engineering, and analytics workflows.
· 20 views · 0 applications · 28d
Senior Data Engineer Azure (IRC278989)
Full Remote · Ukraine · 4 years of experience · B2 - Upper Intermediate
Job Description
- Strong hands-on experience with Azure Databricks (DLT Pipelines, Lakeflow Connect, Delta Live Tables, Unity Catalog, Time Travel, Delta Share) for large-scale data processing and analytics
- Proficiency in data engineering with Apache Spark, using PySpark, Scala, or Java for data ingestion, transformation, and processing
- Proven expertise in the Azure data ecosystem: Databricks, ADLS Gen2, Azure SQL, Azure Blob Storage, Azure Key Vault, Azure Service Bus/Event Hub, Azure Functions, Azure Data Factory, and Azure CosmosDB
- Solid understanding of Lakehouse architecture, Modern Data Warehousing, and Delta Lake concepts
- Experience designing and maintaining config-driven ETL/ELT pipelines with support for Change Data Capture (CDC) and event/stream-based processing
- Proficiency with RDBMS (MS SQL, MySQL, PostgreSQL) and NoSQL databases
- Strong understanding of data modeling, schema design, and database performance optimization
- Practical experience working with various file formats, including JSON, Parquet, and ORC
- Familiarity with machine learning and AI integration within the data platform context
- Hands-on experience building and maintaining CI/CD pipelines (Azure DevOps, GitLab) and automating data workflow deployments
- Solid understanding of data governance, lineage, and cloud security (Unity Catalog, encryption, access control)
- Strong analytical and problem-solving skills with attention to detail
- Excellent teamwork and communication skills
- Upper-Intermediate English (spoken and written)
Job Responsibilities
- Design, implement, and optimize scalable and reliable data pipelines using Databricks, Spark, and Azure data services
- Develop and maintain config-driven ETL/ELT solutions for both batch and streaming data
- Ensure data governance, lineage, and compliance using Unity Catalog and Azure Key Vault
- Work with Delta tables, Delta Lake, and Lakehouse architecture to ensure efficient, reliable, and performant data processing
- Collaborate with developers, analysts, and data scientists to deliver trusted datasets for reporting, analytics, and machine learning use cases
- Integrate data pipelines with event-based and microservice architectures leveraging Service Bus, Event Hub, and Functions
- Design and maintain data models and schemas optimized for analytical and operational workloads
- Identify and resolve performance bottlenecks, ensuring cost efficiency and maintainability of data workflows
- Participate in architecture discussions, backlog refinement, estimation, and sprint planning
- Contribute to defining and maintaining best practices, coding standards, and quality guidelines for data engineering
- Perform code reviews, provide technical mentorship, and foster knowledge sharing within the team
- Continuously evaluate and enhance data engineering tools, frameworks, and processes in the Azure environment
Department/Project Description
GlobalLogic is searching for a motivated, results-driven, and innovative software engineer to join our project team at a dynamic startup specializing in pet insurance. Our client is a leading global holding company that is dedicated to developing an advanced pet insurance claims clearing solution designed to expedite and simplify the veterinary invoice reimbursement process for pet owners.
You will be working on a cutting-edge system built from scratch, leveraging Azure cloud services and adopting a low-code paradigm. The project adheres to industry best practices in quality assurance and project management, aiming to deliver exceptional results.
We are looking for an engineer who thrives in collaborative, supportive environments and is passionate about making a meaningful impact on people's lives. If you are enthusiastic about building innovative solutions and contributing to a cause that matters, this role could be an excellent fit for you.
· 74 views · 13 applications · 28d
Lead Data Engineer
Full Remote · Worldwide · 5 years of experience · B2 - Upper Intermediate
Digis is looking for an experienced, proactive, and self-driven Lead Data Engineer to join our fully remote team.
About the Project
You'll be part of a large-scale platform focused on performance management and revenue optimization, powered by predictive analytics and machine learning.
The product helps businesses boost employee engagement, customer satisfaction, and overall revenue across industries such as hospitality, automotive, car rentals, and theme parks.
Project Details
- Location: USA, India (work hours aligned with the Kyiv timezone)
- Team Composition: CTO (US), 6 Data Engineers, 2 DevOps, and 1 Backend Engineer from Digis.
In total, around 50 professionals are involved in the project.
- Engagement: Long-term collaboration
What We're Looking For
- 5+ years of experience in Data Engineering
- Proven background in a Team Lead or Tech Lead role
- Experience with Spark / PySpark and AWS
- Upper-Intermediate (B2+) or higher level of spoken English
Why Join
- Contribute to a stable, industry-leading product
- Take ownership and lead a talented team
- Work with cutting-edge data technologies
- Influence technical decisions and product evolution
- Enjoy long-term professional growth within a supportive and innovative environment
If this opportunity resonates with you, let's connect - we'll be glad to share more details.
· 54 views · 2 applications · 28d
Senior Data Architect
Full Remote · Poland, Turkey · 10 years of experience · B2 - Upper Intermediate
Project Description
Senior Data Architect responsible for enterprise data architecture, designing conceptual and logical data models that guide Manufacturing, Supply Chain, Merchandising, and consumer data transformation at a leading jewellery company.
In this strategic data architecture role, you will define enterprise-wide data models, create conceptual data architectures, and establish information blueprints that engineering teams implement. You will ensure our data architecture makes information accessible and trustworthy for both analytical insights and AI innovations, enabling Pandora's journey to becoming the world's leading jewellery company.
Responsibilities
- Leading enterprise information architecture design, creating semantic, conceptual, and logical models that engineering teams translate into Azure- and Databricks-based physical implementations.
- Architecting data product interfaces and semantic contracts that define how business entities flow through bronze-to-gold transformations, enabling interoperability between domain-owned data products.
- Developing semantic layer architectures that provide business-oriented abstractions, hiding technical complexity while enabling self-service analytics.
- Creating data architectural blueprints for core application transformation (SAP, Salesforce, o9, workforce management etc.) ensuring a smooth data transitional architecture.
- Reviewing and approving physical model proposals from engineering teams, ensuring alignment with enterprise information strategy.
- Defining modeling standards for data products: operational schemas in bronze, aggregated and dimensional entities in silver, and consumption-ready models in gold.
- Designing information value chains and data product specifications with clear contracts for downstream consumption.
- Mapping business capabilities to information domains, creating bounded contexts aligned with Domain-Driven Design principles.
- Providing architectural oversight through design reviews, decision records to ensure adherence to the data standards.
- Creating business glossaries and semantic models that bridge business language and technical implementation.
- Guiding data solution design sessions, translating complex business requirements into clear information structures.
- Mentoring teams on data architectural best practices while documenting anti-patterns to avoid common pitfalls.
Skills Required
- 8+ years as a practicing data architect with proven experience designing enterprise-scale information architectures that successfully went into production.
- Demonstrated experience architecting shared data models in modern lakehouse patterns, including defining silver-layer conformance standards and gold-layer consumption contracts.
- Demonstrated ability to create semantic layers and logical models that achieved actual business adoption.
- Practical experience with multiple modeling paradigms (dimensional, Data Vault, graph, document) knowing when and why to apply each.
- Strong understanding of Azure data platform capabilities (Synapse, Databricks, Unity Catalog) to guide architectural decisions in daily implementations.
- Track record of guiding teams through major transformations, particularly SAP migrations or platform modernizations.
- Excellence in stakeholder communication - explaining data architecture to non-technical audiences while providing technical guidance to engineering teams.
- Pragmatic approach balancing theoretical best practices with practical implementation realities.
- Collaborative leadership style - guiding teams to good decisions rather than mandating solutions.
- Experience establishing reusable patterns and reference architectures that accelerate delivery.
- Understanding of how logical models translate to physical implementations.
- Proven ability to create architectures that survive from MVP to enterprise scale.
· 55 views · 20 applications · 28d
Data Engineer with Databricks
Full Remote · Worldwide · 5 years of experience · B2 - Upper Intermediate
We are seeking an experienced Data Engineer with deep expertise in Databricks to design, build, and maintain scalable data pipelines and analytics solutions. This role requires 5 years of hands-on experience in data engineering with a strong focus on the Databricks platform.
Key Responsibilities:
- Data Pipeline Development & Management
- Design and implement robust, scalable ETL/ELT pipelines using Databricks and Apache Spark
- Process large volumes of structured and unstructured data
- Develop and maintain data workflows using Databricks workflows, Apache Airflow, or similar orchestration tools
- Optimize data processing jobs for performance, cost efficiency, and reliability
- Implement incremental data processing patterns and change data capture (CDC) mechanisms
- Databricks Platform Engineering
- Build and maintain Delta Lake tables and implement medallion architecture (bronze, silver, gold layers)
- Develop streaming data pipelines using Structured Streaming and Delta Live Tables (an illustrative sketch follows this list)
- Manage and optimize Databricks clusters for various workloads
- Implement Unity Catalog for data governance, security, and metadata management
- Configure and maintain Databricks workspace environments across development, staging, and production
- Data Architecture & Modeling
- Design and implement data models optimized for analytical workloads
- Create and maintain data warehouses and data lakes on cloud platforms (Azure, AWS, or GCP)
- Implement data partitioning, indexing, and caching strategies for optimal query performance
- Collaborate with data architects to establish best practices for data storage and retrieval patterns
- Performance Optimization & Monitoring
- Monitor and troubleshoot data pipeline performance issues
- Optimize Spark jobs through proper partitioning, caching, and broadcast strategies
- Implement data quality checks and automated testing frameworks
- Manage cost optimization through efficient resource utilization and cluster management
- Establish monitoring and alerting systems for data pipeline health and performance
- Collaboration & Best Practices
- Work closely with data scientists, analysts, and business stakeholders to understand data requirements
- Implement version control using Git and follow CI/CD best practices for code deployment
- Document data pipelines, data flows, and technical specifications
- Mentor junior engineers on Databricks and data engineering best practices
- Participate in code reviews and contribute to establishing team standards
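For illustration only, the bronze-layer ingestion mentioned under "Databricks Platform Engineering" might look roughly like the following Structured Streaming job. The paths and table name are placeholders, and the cloudFiles (Auto Loader) source assumes a Databricks runtime; this is a sketch, not a prescribed implementation.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Incrementally pick up raw JSON files as they land in cloud storage.
bronze_stream = (
    spark.readStream
         .format("cloudFiles")                       # Databricks Auto Loader
         .option("cloudFiles.format", "json")
         .option("cloudFiles.schemaLocation", "/mnt/schemas/events")
         .load("/mnt/raw/events")                    # placeholder landing path
         .withColumn("ingested_at", F.current_timestamp())
)

# Append into the bronze layer of the medallion architecture as a Delta table.
query = (
    bronze_stream.writeStream
                 .option("checkpointLocation", "/mnt/checkpoints/bronze_events")
                 .outputMode("append")
                 .toTable("bronze.events")           # hypothetical bronze table
)
```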
Required Qualifications:
- Experience & Skills
- 5+ years of experience in data engineering with hands-on Databricks experience
- Strong proficiency in Python and/or Scala for Spark application development
- Expert-level knowledge of Apache Spark, including Spark SQL, DataFrames, and RDDs
- Deep understanding of Delta Lake and Lakehouse architecture concepts
- Experience with SQL and database optimization techniques
- Solid understanding of distributed computing concepts and data processing frameworks
- Proficiency with cloud platforms (Azure, AWS, or GCP) and their data services
- Experience with data orchestration tools (Databricks Workflows, Apache Airflow, Azure Data Factory)
- Knowledge of data modeling concepts for both OLTP and OLAP systems
- Familiarity with data governance principles and tools like Unity Catalog
- Understanding of streaming data processing and real-time analytics
- Experience with version control systems (Git) and CI/CD pipelines
Preferred Qualifications:
- Databricks Certified Data Engineer certification (Associate or Professional)
- Experience with machine learning pipelines and MLOps on Databricks
- Knowledge of data visualization tools (Power BI, Tableau, Looker)
- Experience with infrastructure as code (Terraform, CloudFormation)
- Familiarity with containerization technologies (Docker, Kubernetes)
· 18 views · 0 applications · 27d
Senior Data Engineer (IRC278988)
Full Remote · Ukraine · 5 years of experience · B2 - Upper Intermediate
Job Description
- Strong hands-on experience with Azure Databricks (DLT Pipelines, Lakeflow Connect, Delta Live Tables, Unity Catalog, Time Travel, Delta Share) for large-scale data processing and analytics
- Proficiency in data engineering with Apache Spark, using PySpark, Scala, or Java for data ingestion, transformation, and processing
- Proven expertise in the Azure data ecosystem: Databricks, ADLS Gen2, Azure SQL, Azure Blob Storage, Azure Key Vault, Azure Service Bus/Event Hub, Azure Functions, Azure Data Factory, and Azure CosmosDB
- Solid understanding of Lakehouse architecture, Modern Data Warehousing, and Delta Lake concepts
- Experience designing and maintaining config-driven ETL/ELT pipelines with support for Change Data Capture (CDC) and event/stream-based processing
- Proficiency with RDBMS (MS SQL, MySQL, PostgreSQL) and NoSQL databases
- Strong understanding of data modeling, schema design, and database performance optimization
- Practical experience working with various file formats, including JSON, Parquet, and ORC
- Familiarity with machine learning and AI integration within the data platform context
- Hands-on experience building and maintaining CI/CD pipelines (Azure DevOps, GitLab) and automating data workflow deployments
- Solid understanding of data governance, lineage, and cloud security (Unity Catalog, encryption, access control)
- Strong analytical and problem-solving skills with attention to detail
- Excellent teamwork and communication skills
- Upper-Intermediate English (spoken and written)
Job Responsibilities
- Design, implement, and optimize scalable and reliable data pipelines using Databricks, Spark, and Azure data services
- Develop and maintain config-driven ETL/ELT solutions for both batch and streaming data
- Ensure data governance, lineage, and compliance using Unity Catalog and Azure Key Vault
- Work with Delta tables, Delta Lake, and Lakehouse architecture to ensure efficient, reliable, and performant data processing
- Collaborate with developers, analysts, and data scientists to deliver trusted datasets for reporting, analytics, and machine learning use cases
- Integrate data pipelines with event-based and microservice architectures leveraging Service Bus, Event Hub, and Functions
- Design and maintain data models and schemas optimized for analytical and operational workloads
- Identify and resolve performance bottlenecks, ensuring cost efficiency and maintainability of data workflows
- Participate in architecture discussions, backlog refinement, estimation, and sprint planning
- Contribute to defining and maintaining best practices, coding standards, and quality guidelines for data engineering
- Perform code reviews, provide technical mentorship, and foster knowledge sharing within the team
- Continuously evaluate and enhance data engineering tools, frameworks, and processes in the Azure environment
Department/Project Description
GlobalLogic is searching for a motivated, results-driven, and innovative software engineer to join our project team at a dynamic startup specializing in pet insurance. Our client is a leading global holding company that is dedicated to developing an advanced pet insurance claims clearing solution designed to expedite and simplify the veterinary invoice reimbursement process for pet owners.
You will be working on a cutting-edge system built from scratch, leveraging Azure cloud services and adopting a low-code paradigm. The project adheres to industry best practices in quality assurance and project management, aiming to deliver exceptional results.
We are looking for an engineer who thrives in collaborative, supportive environments and is passionate about making a meaningful impact on people's lives. If you are enthusiastic about building innovative solutions and contributing to a cause that matters, this role could be an excellent fit for you.
· 88 views · 15 applications · 6d
Senior Data Engineer
Part-time · Full Remote · Countries of Europe or Ukraine · Product · 5 years of experience · B2 - Upper Intermediate
About the Platform
We're building a unified data ecosystem that connects raw data, analytical models, and intelligent decision layers.
The platform combines the principles of data lakes, lakehouses, and modern data warehouses - structured around the Medallion architecture (Bronze / Silver / Gold).
Every dataset is versioned, governed, and traceable through a unified catalog and lineage framework.
This environment supports analytics, KPI computation, and AI-driven reasoning - designed for performance, transparency, and future scalability (in partnership with GCP, OpenAI, Cohere).
What You'll Work On
1. Data Architecture & Foundations
- Design, implement, and evolve medallion-style data pipelines - from raw ingestion to curated, business-ready models.
- Build hybrid data lakes and lakehouses using Iceberg, Delta, or Parquet formats with ACID control and schema evolution.
- Architect data warehouses that unify batch and streaming sources into a consistent, governed analytics layer.
- Ensure optimal partitioning, clustering, and storage strategies for large-scale analytical workloads.
2. Data Ingestion & Transformation
- Create ingestion frameworks for APIs, IoT, ERP, and streaming systems (Kafka, Pub/Sub).
- Develop reproducible ETL/ELT pipelines using Airflow, dbt, Spark, or Dataflow.
- Manage CDC and incremental data loads, ensuring freshness and resilience (a minimal sketch follows this list).
- Apply quality validation, schema checks, and contract-based transformations at every stage.
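As a minimal sketch of the CDC/incremental load pattern referenced above: upserting a batch of change records into a Silver-layer Delta table. The table names and the op column layout are hypothetical, and it assumes the delta-spark package on a Spark/Databricks runtime.

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

changes = spark.table("bronze.orders_cdc")           # hypothetical CDC feed from Bronze
silver = DeltaTable.forName(spark, "silver.orders")  # curated Silver table

# Merge the change set: apply deletes, updates, and inserts keyed on order_id.
(
    silver.alias("t")
          .merge(changes.alias("s"), "t.order_id = s.order_id")
          .whenMatchedDelete(condition="s.op = 'DELETE'")
          .whenMatchedUpdateAll(condition="s.op = 'UPDATE'")
          .whenNotMatchedInsertAll(condition="s.op = 'INSERT'")
          .execute()
)
```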
3. Governance, Cataloging & Lineage
- Implement a unified data catalog with lineage visibility, metadata capture, and schema versioning.
- Integrate dbt metadata, OpenLineage, and Great Expectations to enforce data quality.
- Define clear governance rules: data contracts, access policies, and change auditability.
- Ensure every dataset is explainable and fully traceable back to its source.
4. Data Modeling & Lakehouse Operations
- Design dimensional models and business data marts to power dashboards and KPI analytics.
- Develop curated Gold-layer tables that serve as trusted sources of truth for analytics and AI workloads.
- Optimize materialized views and performance tuning for analytical efficiency.
- Manage cross-domain joins and unified semantics across products, customers, or operational processes.
5. Observability, Reliability & Performance
- Monitor data pipeline health, freshness, and cost using modern observability tools (Prometheus, Grafana, Cloud Monitoring).
- Build proactive alerting, anomaly detection, and drift monitoring for datasets.
- Implement CI/CD workflows for data infrastructure using Terraform, Helm, and ArgoCD.
- Continuously improve query performance and storage efficiency across warehouses and lakehouses.
6. Unified Data & Semantic Layers
- Help define a unified semantic model that connects operational, analytical, and AI-ready data.
- Work with AI and analytics teams to structure datasets for semantic search, simulation, and reasoning systems.
- Collaborate on vectorized data representation and process-relationship modeling (graph or vector DBs).
What We're Looking For
- 5+ years of hands-on experience building large-scale data platforms, warehouses, or lakehouses.
- Strong proficiency in SQL, Python, and distributed processing frameworks (PySpark, Spark, Dataflow).
- Deep understanding of Medallion architecture, data modeling, and modern ETL orchestration (Airflow, dbt).
- Experience implementing data catalogs, lineage tracking, and validation frameworks.
- Knowledge of data governance, schema evolution, and contract-based transformations.
- Familiarity with streaming architectures, CDC patterns, and real-time analytics.
- Practical understanding of FinOps, data performance tuning, and cost management in analytical environments.
- Strong foundation in metadata-driven orchestration, observability, and automated testing.
- Bonus: experience with ClickHouse, Trino, Iceberg, or hybrid on-prem/cloud data deployments.
You'll Excel If You
- Think of data systems as living, evolving architectures - not just pipelines.
- Care deeply about traceability, scalability, and explainability.
- Love designing platforms that unify data across analytics, AI, and process intelligence.
- Are pragmatic, hands-on, and focused on building systems that last.
· 14 views · 1 application · 27d
Data Architect (Azure Platform)
Full Remote · Ukraine, Poland, Romania, Slovakia · 8 years of experience · C1 - Advanced
Description
As the Data Architect, you will be the senior technical visionary for the Data Platform. You will be responsible for the high-level design of the entire solution, ensuring it is scalable, secure, and aligned with the company's long-term strategic goals. Your decisions will form the technical foundation upon which the entire platform is built, from initial batch processing to future real-time streaming capabilities.
Full remote.
Required Skills (Must-Haves)
- Cloud Architecture: Extensive experience designing and implementing large-scale data platforms on Microsoft Azure.
- Expert Technical Knowledge: Deep, expert-level understanding of the Azure data stack, including ADF, Databricks, ADLS, Synapse, and Purview.
- Data Concepts: Mastery of data warehousing, data modeling (star schemas), data lakes, and both batch and streaming architectural patterns.
- Strategic Thinking: Ability to align technical solutions with long-term business strategy.
Nice-to-Have Skills:
- Hands-on Coding Ability: Proficiency in Python/PySpark, allowing for the creation of architectural proofs-of-concept.
- DevOps & IaC Acumen: Deep understanding of CI/CD for data platforms, experience with Infrastructure as Code (Bicep/Terraform), and experience with Azure DevOps for big data services.
- Azure Cost Management: Experience with FinOps and optimizing the cost of Azure data services.
Job responsibilities
- End-to-End Architecture Design: Design and document the complete, end-to-end data architecture, encompassing data ingestion, processing, storage, and analytics serving layers.
- Technology Selection & Strategy: Make strategic decisions on the use of Azure services (ADF, Databricks, Synapse, Event Hubs) to meet both immediate MVP needs and future scalability requirements.
- Define Standards & Best Practices: Establish data modeling standards, development best practices, and governance policies for the engineering team to follow.
- Technical Leadership: Provide expert technical guidance and mentorship to the data engineers and BI developers, helping them solve the most complex technical challenges.
- Stakeholder Communication: Clearly articulate the architectural vision, benefits, and trade-offs to technical teams, project managers, and senior business leaders.
· 103 views · 18 applications · 26d
Data Engineer
Full Remote · Ukraine · Product · 2 years of experience · B1 - Intermediate
FAVBET Tech develops software that is used by millions of players around the world for the international company FAVBET Entertainment.
Main areas of work:
- Game Development - driving the end-to-end engineering process for innovative, engaging, and mathematically precise games tailored to global markets.
- Mechanics & Player Experience - overseeing the creation of core gameplay logic and features that maximize engagement and retention, while also leading the development of back office admin panels for game configuration, monitoring, and operational efficiency.
- Data-Driven Game Design - implementing analytics and big data solutions to measure player behavior, guide feature development, and improve monetization strategies.
- Cloud Services - we use cloud technologies for scaling and business efficiency.
Responsibilities:
- Design, build, install, test, and maintain highly scalable data management systems.
- Develop ETL/ELT processes and frameworks for efficient data transformation and loading.
- Implement, optimize, and support reporting solutions for the Sportsbook domain.
- Ensure effective storage, retrieval, and management of large-scale data.
- Improve data query performance and overall system efficiency.
- Collaborate closely with data scientists and analysts to deliver data solutions and actionable insights.
Requirements:
- At least 2 years of experience in designing and implementing modern data integration solutions.
- Master's degree in Computer Science or a related field.
- Proficiency in Python and SQL, particularly for data engineering tasks.
- Hands-on experience with data processing, ETL (Extract, Transform, Load), ELT (Extract, Load, Transform) processes, and data pipeline development.
- Experience with DBT framework and Airflow orchestration.
- Practical experience with both SQL and NoSQL databases (e.g., PostgreSQL, MongoDB).
- Experience with Snowflake.
- Working knowledge of cloud services, particularly AWS (S3, Glue, Redshift, Lambda, RDS, Athena).
- Experience in managing data warehouses and data lakes.
- Familiarity with star and snowflake schema design.
- Understanding of the difference between OLAP and OLTP.
Would be a plus:
- Experience with other cloud data services (e.g., AWS Redshift, Google BigQuery).
- Experience with version control tools (e.g., GitHub, GitLab, Bitbucket).
- Experience with real-time data processing (e.g., Kafka, Flink).
- Familiarity with orchestration tools (e.g., Airflow, Luigi).
- Experience with monitoring and logging tools (e.g., ELK Stack, Prometheus, CloudWatch).
- Knowledge of data security and privacy practices.
We can offer:
- 30 days of paid vacation and sick days - we value rest and recreation. We also observe national holidays.
- Medical insurance for employees, the possibility of training at the company's expense, and gym membership.
- Remote work; after Ukraine wins the war - our own modern loft office with a spacious workplace and brand-new work equipment (near Pochaina metro station).
- Flexible work schedule - we expect a full-time commitment but do not track your working hours.
- Flat hierarchy without micromanagement - our doors are open, and all teammates are approachable.
· 64 views · 7 applications · 26d
Middle Data Engineer
Full Remote · Ukraine, Poland, Romania · 3 years of experience · B2 - Upper Intermediate
Description
Our Client is a Fortune 500 company and is one of the biggest global manufacturing companies operating in the fields of industrial systems, worker safety, health care, and consumer goods. The company is dedicated to creating the technology and products that advance every business, improve every home and enhance every life.
As a Data Engineer for our Data Mesh platform, you will design, develop, and maintain data pipelines & models, ensuring high-quality, domain-oriented data products. You will collaborate with cross-functional teams and optimize data processes for performance and cost efficiency. Your expertise in big data technologies, cloud platforms, and programming languages will be crucial in driving the success of our Data Mesh initiatives.
Requirements
Minimum Requirements:
- Proficiency in Python for data processing and automation.
- Strong SQL skills for querying and manipulating data.
- Minimum of 3 years of experience in SQL and Python programming languages, specifically for data engineering tasks.
- Good English (min. B2 level).
- Experience with cloud platforms, preferably Azure (Azure Data Factory, Azure Databricks, Azure SQL Database, etc.).
- Experience with Spark and Databricks or similar big data processing and analytics platforms
- Experience working with large data environments, including data processing, data integration, and data warehousing.
- Experience with data quality assessment and improvement techniques, including data profiling, data cleansing, and data validation.
- Familiarity with data lakes and their associated technologies, such as Azure Data Lake Storage, AWS S3, or Delta Lake, for scalable and cost-effective data storage and management.
- Experience with NoSQL databases, such as MongoDB or Cosmos, for handling unstructured and semi-structured data.
Additional Skillsets (Nice to Have):
- Familiarity with Agile and Scrum methodologies, including working with Azure DevOps and Jira for project management.
- Knowledge of DevOps methodologies and practices, including continuous integration and continuous deployment (CI/CD).
- Experience with Azure Data Factory or similar data integration tools for orchestrating and automating data pipelines.
- Ability to build and maintain APIs for data integration and consumption.
- Experience with data backends for software platforms, including database design, optimization, and performance tuning.
Job responsibilities
- Design, develop, and maintain scalable data pipelines and ETL processes.
- Collaborate with cross-functional teams to understand data requirements and deliver high-quality data solutions.
- Implement data quality checks and ensure data integrity across various data sources.
- Optimize and tune data pipelines for performance and scalability.
- Develop and maintain data models and schemas to support data mesh architecture.
- Work with cloud platforms, particularly Azure, to deploy and manage data infrastructure.
- Participate in Agile development processes, including sprint planning, stand-ups, and retrospectives.
- Monitor and troubleshoot data pipeline issues, ensuring timely resolution.
- Document data engineering processes, best practices, and standards.