Mid-Level Data Engineer – AI-Native Data Pipelines and Automation
We are seeking a Mid-Level Data Engineer to join our AI-Native Delivery Pod. In this role, you will move beyond traditional ETL development to become an "Architect of Data Flows," leveraging AI agents to design, build, and test complex data pipelines.
Your focus will be on data-intensive, No-UI projects where the primary value lies in the integrity, speed, and structure of the data. You will use Claude Code, Cursor, and agentic workflows to automate routine SQL and Python development, allowing you to concentrate on logic validation, data quality assurance, and performance optimization within cloud environments. As part of an AI-Native pod, you will operate at a velocity significantly higher than traditional teams by treating AI as your primary executor.
Required Qualifications
- Degree: Bachelor’s degree in Computer Science, Applied Mathematics, Data Engineering, or a related technical field.
- Professional Experience: 3+ years of experience in Data Engineering, specifically building and maintaining production-grade data pipelines.
- Tech Stack Mastery: Strong proficiency in Python (Pandas, PySpark, Boto3) and advanced SQL (window functions, optimization, procedural SQL).
- Data Tooling: Hands-on experience with orchestration tools (e.g., Airflow, Prefect, or Dagster) and data transformation frameworks like dbt.
- Cloud Ecosystem: Practical experience with cloud data warehouses and services, specifically within the Azure ecosystem (Azure Data Factory, Synapse, Databricks, or ADLS).
- AI-Native Proficiency: Proven experience using Generative AI tools (Claude Code, GitHub Copilot) to accelerate coding, migrations, and documentation.
- Data Quality & Testing: Solid understanding of data testing principles and experience implementing automated Data Quality Gates.
- Methodology: Familiarity with Spec-Driven Development, where code is generated based on formalized requirements and pre-defined test cases.
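To illustrate the Spec-Driven Development approach named above, here is a minimal sketch (the function, record shapes, and spec cases are hypothetical, not part of any specific project): the test cases are formalized first, and any implementation, whether hand-written or AI-generated, is accepted only if it satisfies all of them.

```python
# Spec-Driven Development sketch: the "spec" is a set of pre-defined
# test cases; a candidate implementation passes the gate only if it
# satisfies every case.

# Hypothetical spec: deduplicate records by id, keeping the highest version.
SPEC_CASES = [
    # (input rows, expected output)
    ([{"id": 1, "v": 1}, {"id": 1, "v": 2}], [{"id": 1, "v": 2}]),
    ([{"id": 1, "v": 1}, {"id": 2, "v": 1}], [{"id": 1, "v": 1}, {"id": 2, "v": 1}]),
    ([], []),
]

def dedupe_latest(rows):
    """Candidate implementation: keep the highest-version row per id."""
    latest = {}
    for row in rows:
        if row["id"] not in latest or row["v"] > latest[row["id"]]["v"]:
            latest[row["id"]] = row
    return sorted(latest.values(), key=lambda r: r["id"])

def passes_spec(fn, cases=SPEC_CASES):
    """Accept the candidate only if every formalized spec case holds."""
    return all(fn(inp) == expected for inp, expected in cases)
```

In this workflow the spec cases are written before any code exists, so an AI agent's output can be validated mechanically rather than by eye.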
Preferred Qualifications
- Agentic Workflows: Experience building or using agentic systems to automate data processing or metadata management.
- Big Data Frameworks: Experience with distributed processing at scale (e.g., Spark, Flink, or Kafka).
- DataOps: Experience with containerization (Docker, Kubernetes) and building CI/CD pipelines specifically for data workloads.
- Systematic Debugging: Ability to apply a structured, hypothesis-driven approach to debugging hard-to-reproduce errors in large datasets.
Job Responsibilities
- Pipeline Orchestration: Design and implement scalable ETL/ELT processes, directing AI agents to generate boilerplate code and optimize complex queries.
- Agentic Execution: Use Claude Code CLI to implement data transformations, automate schema migrations, and generate pipeline documentation.
- Deep Review: Conduct rigorous analysis of AI-generated code, focusing on data lineage, cost-efficiency of cloud resources, and security.
- Data Quality Assurance: Define and implement automated "Quality Gates" to ensure data integrity and correctness at every stage of the pipeline.
- Context Hygiene: Maintain the project’s operational memory (the CLAUDE.md file and /docs folder) to ensure AI agents remain effective and aligned with project goals.
- Skill Library Contribution: Help build and refine the pod’s "Skills" library for Claude Code, automating recurring data engineering patterns.
- Stakeholder Collaboration: Work closely with the AI Solution Architect to implement data models and with the AI Reliability Engineer to integrate data into the broader project ecosystem.
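As a hedged illustration of the "Quality Gates" responsibility above (column names and rules are hypothetical, not tied to any particular client schema), a gate is simply an automated predicate over a data batch; a pipeline stage proceeds only if every gate passes:

```python
# Minimal Quality Gate sketch: each gate is a named predicate over a
# batch of rows; the next pipeline stage runs only if no gate fails.

def gate_no_null_keys(rows):
    """Integrity: every row must carry a primary key."""
    return all(r.get("order_id") is not None for r in rows)

def gate_non_negative_amounts(rows):
    """Correctness: monetary amounts may not be negative."""
    return all(r["amount"] >= 0 for r in rows)

def run_quality_gates(rows, gates):
    """Return the names of failed gates; an empty list means the batch may proceed."""
    return [g.__name__ for g in gates if not g(rows)]

batch = [
    {"order_id": 101, "amount": 25.0},
    {"order_id": 102, "amount": 0.0},
]
failures = run_quality_gates(batch, [gate_no_null_keys, gate_non_negative_amounts])
# An empty failures list lets the batch advance to the next stage.
```

In production the same pattern is usually expressed through a framework (e.g. dbt tests or orchestrator-level checks) rather than hand-rolled functions, but the contract is identical: named rules, automated evaluation, hard stop on failure.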
Department/Project Description
You will join a specialized AI-Native Delivery Pod focused on complex, high-stakes Data Engineering projects. Our mission is to build robust data infrastructures for enterprise clients using a Fixed Price model, delivering results with unprecedented speed through the synergy of human expertise and agentic execution. Projects typically involve building modern Data Warehouses (DWH), Analytics Lakes, and AI-ready data foundations within the Azure cloud. We operate in an environment of high autonomy where every engineer is an operator of cutting-edge AI technology.
Skill Category
AI/ML
Key Skills – Must Have
- Python
- Claude Code
- Data Engineering
Required Languages
| Language | Level |
| --- | --- |
| English | B2 – Upper Intermediate |
| Ukrainian | C1 – Advanced |