Data Engineer - Quality Assurance

Key Responsibilities:

  • Automate data quality and reconciliation checks across varied storage layers, including Snowflake, SQL, and RDF/SPARQL databases
  • Test and verify data lineage, governance, and visualization components using Snowflake, data catalogs (e.g., DataHub), ThoughtSpot, and other visualization tools
  • Integrate test suites into the core infrastructure, which is orchestrated by Apache Airflow and uses Iceberg table formats, and monitor data pipeline health, alerting, and observability metrics with Prometheus and Grafana Cloud
  • Establish AI evaluation loops (evals) and guardrails: build rigorous verification protocols, including structural tests, checks, and watchdog agents, to validate AI-generated artifacts, catch false positives, and ensure all automated outputs are secure, reliable, and free from hallucinations
  • Integrate automated testing workflows into CI/CD pipelines using GitHub Actions, ensuring continuous stability and quality gates across all deployment environments
  • Validate ETL and dbt transformations across Data Lakehouses, rigorously testing data progression through a Medallion Architecture
  • Test and automate complex API workflows, validating data payloads across OpenAPI integrations, third-party APIs, GraphQL, and AWS APIs (specifically S3)

Must Haves:

  • Data engineering & data testing: dbt, data lakehouse concepts, Medallion architecture
  • Databases & storage testing: SQL, Snowflake, AWS S3, Iceberg
  • Integrating quality checks into data pipelines: Apache Airflow
  • API testing & automation: REST/OpenAPI, GraphQL
  • Integrating test automation into CI/CD: GitHub Actions (or similar tools such as Argo CD or GoCD)
  • Cloud / infrastructure and observability basics: Kubernetes (K8s), Prometheus, Grafana

Nice to Have:

  • Graph databases: RDF / SPARQL
  • Data governance & analytics tools: DataHub, ThoughtSpot
  • AI/ML testing & MLOps: AI evals, guardrails, RAG, vector databases, AI drift monitoring
  • Advanced / emerging data tech: StarRocks, DuckDB
  • Regulated environments: GxP, 21 CFR Part 11, HIPAA
  • Clinical / domain-specific data standards: CDISC, ODM, FHIR
  • AI-native tooling: Cursor, Claude Code, Copilot, QA Wolf

Required languages

English B2 - Upper Intermediate
Data Engineer, Data Quality, Quality Assurance, Python, Snowflake, Apache Airflow, Prometheus+Grafana, SQL, AWS S3, Kubernetes
Published 16 June