Middle Data Engineer
Our client is a Fortune 500 company and a leading business-to-business organization: more than 3.2 million customers rely on its products in categories such as safety, material handling, and metalworking, along with services like inventory management and technical support.
We are seeking a skilled Data Engineer with 3-5 years of experience to join our growing data team. In this role, you will be instrumental in building and maintaining robust, scalable data platforms that process large volumes of diverse data.
You won't just be writing SQL queries; you will be embracing the full Modern Data Stack. You will use Python and Airflow to orchestrate complex workflows, dbt to manage sophisticated transformations within our Lakehouse environments (Snowflake/Databricks), and Terraform to ensure our infrastructure is scalable and reproducible as code.
This is an excellent opportunity for an engineer who wants to move beyond traditional ETL and work in a true DataOps environment, potentially bridging the gap between complex legacy sources (like SAP) and modern analytics.
Key Responsibilities:
Pipeline Development & Orchestration
- Design, develop, and maintain reliable ETL/ELT pipelines using Python and SQL to ingest data from various sources into our data lake/warehouse.
- Orchestrate complex data workflows and dependencies using Apache Airflow, ensuring timely data delivery and robust failure handling.
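To give candidates a concrete flavour of the orchestration work, here is a minimal sketch of a daily ingestion DAG using the Airflow 2.4+ TaskFlow API. The DAG name, S3 path, and source table are hypothetical, and the extract/load bodies are placeholders rather than the team's actual pipelines.

```python
# Minimal, hypothetical sketch of a daily ingestion DAG (Airflow 2.4+ TaskFlow API).
# The DAG name, bucket, and table are invented for illustration.
from datetime import datetime, timedelta

from airflow.decorators import dag, task


@dag(
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
    tags=["ingestion"],
)
def ingest_orders_daily():
    @task
    def extract(ds=None) -> str:
        # Land one day of source data in the lake as a staged file.
        staged_path = f"s3://example-data-lake/raw_orders/{ds}.parquet"
        # ... source extraction logic would go here ...
        return staged_path

    @task
    def load(staged_path: str) -> None:
        # Copy the staged file into the warehouse (e.g. a COPY INTO statement).
        print(f"Loading {staged_path} into the warehouse")

    load(extract())


ingest_orders_daily()
```

Retries, failure handling, and dependencies between tasks are declared in the DAG itself, which is what "robust failure handling" looks like in practice.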
Data Transformation & Modeling
- Champion the use of dbt (data build tool) for developing, testing, and documenting data transformation logic within the warehouse.
- Develop clean, highly optimized SQL models for reporting and analytics (familiarity with data modeling concepts such as Star Schema or Data Vault is a plus).
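Most dbt transformation logic in this role is SQL, but dbt also supports Python models on Snowflake (Snowpark) and Databricks (PySpark). The sketch below is a minimal, hypothetical example of such a model; the upstream model and column names are invented.

```python
# models/marts/fct_completed_orders.py -- hypothetical dbt Python model.
# On Snowflake this runs via Snowpark; on Databricks via PySpark.
# The upstream model ("stg_orders") and columns ("status") are invented.
def model(dbt, session):
    dbt.config(materialized="table")

    orders = dbt.ref("stg_orders")  # returns a Snowpark/PySpark DataFrame

    # Keep only completed orders; downstream BI models would join this fact
    # table to dimension tables in a star-schema layout.
    completed = orders.filter(orders["status"] == "completed")

    return completed
```

In day-to-day work the same transformation would more often be a SQL model with dbt tests and documentation alongside it.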
Platform & Infrastructure Management
- Work hands-on with both Snowflake and Databricks, optimizing compute resources, managing access controls, and ensuring high performance for end-users.
- Utilize Terraform to provision and manage cloud infrastructure (e.g., S3 buckets, IAM roles, Snowflake warehouses) in an Infrastructure-as-Code paradigm.
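Our infrastructure is defined in plain Terraform (HCL). Purely as a Python-flavoured illustration of the same Infrastructure-as-Code idea, the sketch below uses HashiCorp's CDK for Terraform (cdktf), which synthesizes Terraform configuration from Python; bucket names are invented and import paths for the prebuilt AWS provider vary by version.

```python
# Hypothetical Infrastructure-as-Code sketch using CDK for Terraform (cdktf).
# Requires the `cdktf` and `cdktf-cdktf-provider-aws` packages; the real stack
# is written in plain Terraform (HCL), and the bucket name here is invented.
from constructs import Construct
from cdktf import App, TerraformStack
from cdktf_cdktf_provider_aws.provider import AwsProvider
from cdktf_cdktf_provider_aws.s3_bucket import S3Bucket


class DataLakeStack(TerraformStack):
    def __init__(self, scope: Construct, id: str):
        super().__init__(scope, id)
        AwsProvider(self, "aws", region="us-east-1")
        # A raw-zone bucket; IAM roles and Snowflake warehouses would be
        # declared the same way, as versioned, reviewable code.
        S3Bucket(self, "raw_zone", bucket="example-raw-zone")


app = App()
DataLakeStack(app, "data-lake")
app.synth()
```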
Data Quality & Reliability
- Implement data quality checks and monitoring within pipelines to ensure the accuracy and integrity of our data.
- Troubleshoot pipeline failures, identify performance bottlenecks, and implement long-term fixes.
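As an illustration of the kind of quality gate we build into pipelines, here is a minimal sketch of a check that fails a task when a table is empty or its key column contains NULLs. The table and column names are hypothetical, and `cursor` stands for any DB-API cursor (e.g. from the Snowflake or Databricks SQL connectors).

```python
# Minimal sketch of an in-pipeline data quality gate.
# Table and column names are hypothetical; `cursor` is any DB-API cursor.

class DataQualityError(Exception):
    """Raised so the orchestrator marks the task as failed and alerts the team."""


def check_table(cursor, table: str, key_column: str, min_rows: int = 1) -> None:
    # One portable query returns both the row count and the number of NULL keys.
    cursor.execute(
        f"SELECT COUNT(*), "
        f"SUM(CASE WHEN {key_column} IS NULL THEN 1 ELSE 0 END) "
        f"FROM {table}"
    )
    total_rows, null_keys = cursor.fetchone()
    if total_rows < min_rows:
        raise DataQualityError(
            f"{table}: expected at least {min_rows} rows, found {total_rows}"
        )
    if null_keys:
        raise DataQualityError(f"{table}: {null_keys} rows have a NULL {key_column}")
```

Run inside an Airflow task, a raised `DataQualityError` surfaces as a failed task, so retries and alerting kick in before bad data reaches reports.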
Requirements (Must-Haves):
- Experience: 2+ years of professional experience in data engineering or backend software engineering with a data focus.
- Programming: Strong proficiency in Python for data manipulation and scripting, and expert-level SQL skills for complex querying and performance tuning.
- Modern Data Warehouse: Hands-on production experience with modern cloud data platforms, specifically Snowflake and Databricks. You should understand their architecture, compute models, and best practices.
- Transformation: Proven experience using dbt in a production environment for transformation layers.
- Orchestration: Experience building and managing complex DAGs in Apache Airflow.
- Cloud Platform: Hands-on experience with AWS.
- Infrastructure as Code: Working knowledge of Terraform for deploying and managing cloud resources.
Preferred Qualifications (Nice-to-Haves):
- SAP Exposure: Experience extracting data from SAP ECC or SAP S/4HANA systems. Understanding standard SAP tables and data structures is a significant plus.
We offer*:
- Flexible working format: remote, office-based, or a mix of both
- A competitive salary and good compensation package
- Personalized career growth
- Professional development tools (mentorship program, tech talks and training sessions, centers of excellence, and more)
- Active tech communities with regular knowledge sharing
- Education reimbursement
- Memorable anniversary presents
- Corporate events and team-building activities
- Other location-specific benefits
*not applicable for freelancers
Required languages:
| Language | Level |
| --- | --- |
| English | B2 - Upper Intermediate |
| Ukrainian | B2 - Upper Intermediate |