Senior Data Engineer (up to $6,000)
We are looking for a Data Engineer to design, develop, and optimize our data infrastructure on Google Cloud Platform (GCP) and Databricks. You will architect scalable pipelines using BigQuery, Google Cloud Storage, Apache Airflow, dbt, Dataflow, and Pub/Sub, ensuring high availability and performance across our ETL/ELT processes, and you will leverage Great Expectations to enforce data quality standards. The role also involves building our Data Mart environment and implementing CI/CD best practices. A successful candidate has extensive knowledge of cloud-native data solutions, strong proficiency with ETL/ELT frameworks (including dbt), and a passion for building robust, cost-effective pipelines.
Responsibilities:
- Data Architecture & Strategy - Define and implement the overall data architecture on GCP, including data warehousing in BigQuery/Databricks, data lake patterns in Google Cloud Storage, and Data Mart solutions (a minimal warehouse-loading sketch follows this list). Integrate Terraform for Infrastructure as Code to provision and manage cloud resources efficiently. Establish both batch and real-time data processing frameworks to ensure reliability, scalability, and cost efficiency.
- Pipeline Development & Orchestration - Design, build, and optimize ETL/ELT pipelines using Apache Airflow for workflow orchestration. Implement dbt transformations to maintain version-controlled data models in BigQuery, ensuring consistency and reliability across the data pipeline (an Airflow/dbt sketch follows this list). Use Google Dataflow (based on Apache Beam) and Pub/Sub for large-scale streaming/batch data processing and ingestion. Automate job scheduling and data transformations to deliver timely insights for analytics, machine learning, and reporting.
- Event-Driven & Microservices Architecture - Implement event-driven and asynchronous data workflows between microservices (a Pub/Sub sketch follows this list). Employ Docker and Kubernetes (K8s) for containerization and orchestration, enabling flexible and efficient microservices-based data workflows. Implement CI/CD pipelines for streamlined development, testing, and deployment of data engineering components.
- Data Quality, Governance & Security - Enforce data quality standards using Great Expectations or similar frameworks, defining and validating expectations for critical datasets (a Great Expectations sketch follows this list). Define and uphold metadata management, data lineage, and auditing standards to ensure trustworthy datasets. Implement security best practices, including encryption at rest and in transit, Identity and Access Management (IAM), and compliance with GDPR or CCPA where applicable.
- Data Science & Analytics Enablement - Collaborate with Data Science, Analytics, and Product teams to ensure the data infrastructure supports advanced analytics, including machine learning initiatives. Maintain Data Mart environments that cater to specific business domains, optimizing access and performance for key stakeholders.
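The sketches below illustrate, at a small scale, the kind of work described above; every project, bucket, dataset, topic, and path name in them is a hypothetical placeholder. First, the warehouse/lake pattern: a minimal Python sketch, assuming Parquet files sitting in a GCS data lake and an existing BigQuery dataset, that loads them into a warehouse table with the `google-cloud-bigquery` client.

```python
# Minimal sketch: load Parquet files from a GCS data lake into BigQuery.
# Bucket, project, dataset, and table names are hypothetical placeholders.
from google.cloud import bigquery

client = bigquery.Client()  # uses Application Default Credentials

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.PARQUET,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

# Load every Parquet file under the "events/" prefix into the warehouse table.
load_job = client.load_table_from_uri(
    "gs://raw-events/events/*.parquet",  # data-lake location (assumed)
    "my-project.analytics.events",       # warehouse table (assumed)
    job_config=job_config,
)
load_job.result()  # block until the load job finishes

table = client.get_table("my-project.analytics.events")
print(f"Table now holds {table.num_rows} rows")
```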
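Next, orchestration: a minimal Airflow DAG sketch (Airflow 2.4+ assumed for the `schedule` argument) that runs version-controlled dbt transformations against BigQuery and then validates them with dbt's built-in tests. The DAG id and the dbt project path are placeholders, and a BigQuery dbt profile is assumed to be configured.

```python
# Minimal Airflow DAG sketch: run dbt models, then dbt tests, once a day.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_elt",              # assumed name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Run the version-controlled dbt models (project path is assumed).
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/dbt/analytics",
    )

    # Validate the transformed models with dbt's built-in tests.
    dbt_test = BashOperator(
        task_id="dbt_test",
        bash_command="dbt test --project-dir /opt/dbt/analytics",
    )

    dbt_run >> dbt_test  # tests only run after a successful build
```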
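For the event-driven workflows between microservices, a minimal Pub/Sub publish/subscribe sketch, assuming a topic and subscription have already been provisioned (e.g., via Terraform):

```python
# Minimal sketch of an event-driven handoff between two services via Pub/Sub.
# Project, topic, and subscription names are hypothetical placeholders.
import json
from concurrent.futures import TimeoutError as PullTimeout

from google.cloud import pubsub_v1

PROJECT = "my-project"             # assumed
TOPIC = "order-events"             # assumed
SUBSCRIPTION = "order-events-sub"  # assumed

# Producer side: one service publishes a domain event.
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(PROJECT, TOPIC)
payload = json.dumps({"order_id": 123, "status": "created"}).encode("utf-8")
publisher.publish(topic_path, payload).result()  # wait for the server ack

# Consumer side: another service reacts asynchronously.
subscriber = pubsub_v1.SubscriberClient()
sub_path = subscriber.subscription_path(PROJECT, SUBSCRIPTION)

def handle(message: pubsub_v1.subscriber.message.Message) -> None:
    event = json.loads(message.data)
    print("received event:", event)
    message.ack()  # acknowledge so Pub/Sub does not redeliver

streaming_pull = subscriber.subscribe(sub_path, callback=handle)
try:
    streaming_pull.result(timeout=30)  # listen briefly, then stop
except PullTimeout:
    streaming_pull.cancel()
```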
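Finally, data quality: a minimal Great Expectations sketch using the classic pandas-based API (great_expectations 0.x assumed); the column names and thresholds are illustrative.

```python
# Minimal Great Expectations sketch: define and validate expectations
# for a critical dataset. Columns and thresholds are hypothetical.
import great_expectations as ge
import pandas as pd

df = pd.DataFrame({
    "user_id": [1, 2, 3],
    "event_ts": ["2024-01-01", "2024-01-02", "2024-01-03"],
    "revenue": [9.99, 0.0, 4.50],
})

dataset = ge.from_pandas(df)

# Expectations every batch of this dataset must satisfy.
dataset.expect_column_values_to_not_be_null("user_id")
dataset.expect_column_values_to_be_unique("user_id")
dataset.expect_column_values_to_be_between("revenue", min_value=0)

result = dataset.validate()
print("all checks passed:", result.success)
```

In production these expectations would typically live in a suite and run as a validation step inside the orchestration DAG sketched above.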
Nice to have:
- You are familiar with Node.js
- You know what bots are and are excited about creating one
- You understand the concept of Redux (Vuex)
- You've heard of Amazon Web Services (e.g., Lambda, DynamoDB)
- You know what an Agile/Scrum environment looks like
Requirements:
1. Experience
- 3+ years of professional experience in data engineering, with at least 1 year working with mobile data
2. Technical Expertise with GCP Stack
- Proven track record of building and maintaining BigQuery environments and Google Cloud Storage-based data lakes.
- Deep knowledge of Apache Airflow for scheduling/orchestration and ETL/ELT design.
- Experience implementing dbt for data transformations, RabbitMQ for event-driven workflows, and Pub/Sub + Dataflow for streaming/batch data pipelines.
- Familiarity with designing and implementing Data Mart solutions, as well as using Terraform for IaC.
3. Programming & Containerization
- Strong coding capabilities in Python, Java, or Scala, plus scripting for automation.
- Experience with Docker and Kubernetes (K8s) for containerizing data-related services.
- Hands-on experience with CI/CD pipelines and DevOps tools (e.g., Terraform, Ansible, Jenkins, GitLab CI) to manage infrastructure and deployments.
4. Data Quality & Governance
- Proficiency in Great Expectations (or similar) to define and enforce data quality standards.
- Expertise in designing systems for data lineage, metadata management, and compliance (GDPR, CCPA).
- Strong understanding of OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing) systems.
5. Communication
- Excellent communication skills for both technical and non-technical audiences.
- High level of organization, self-motivation, and problem-solving aptitude.
What we offer:
- Competitive compensation depending on experience and skills
- A friendly team of like-minded people
- Opportunities for learning and development
- Compensation for sick leaves
- 21 working days of paid vacation plus all Polish national holidays
- Corporate events and activities
- Private medical care
- Office work or remote working (based on your location)
Required languages:
| Language | Level |
|----------|-------|
| English | B2 - Upper Intermediate |
| Ukrainian | Native |