Senior Data Engineer (up to $8,000)
We are building the future of healthcare analytics. Join us to design, build, and scale robust data pipelines that power nationwide analytics and support our machine learning systems. Our goal: pipelines that are reliable, observable, and continuously improving in production.
This is a fully remote role, open to candidates based in Europe or India, with periodic team gatherings in Mountain View, California.
What You’ll Do
- Design, build, and maintain scalable ETL pipelines using Python (Pandas, PySpark) and SQL, orchestrated with Airflow (MWAA).
- Develop and maintain the SAIVA Data Lake/Lakehouse on AWS, ensuring quality, governance, scalability, and accessibility.
- Run and optimize distributed data processing jobs with Spark on AWS EMR and/or EKS.
- Implement batch and streaming ingestion frameworks (APIs, databases, files, event streams).
- Enforce validation and quality checks to ensure reliable analytics and ML readiness.
- Monitor and troubleshoot pipelines with CloudWatch, integrating observability tools like Grafana, Prometheus, or Datadog.
- Automate infrastructure provisioning with Terraform, following AWS best practices.
- Manage SQL Server, PostgreSQL, and Snowflake integrations into the Lakehouse.
- Participate in an on-call rotation to support pipeline health and resolve incidents quickly.
- Write production-grade code and contribute to design/code reviews and engineering best practices.
What We’re Looking For
- 5+ years in data engineering, ETL pipeline development, or data platform roles (flexible for exceptional candidates).
- Experience designing and operating data lake or lakehouse architectures on AWS (S3, Glue, Lake Formation, Delta Lake, Iceberg).
- Strong SQL skills with PostgreSQL, SQL Server, and at least one cloud data warehouse (Snowflake or Amazon Redshift).
- Proficiency in Python (Pandas, PySpark); Scala or Java a plus.
- Hands-on with Spark on AWS EMR and/or EKS for distributed processing.
- Strong background in Airflow (MWAA) for workflow orchestration.
- Expertise with AWS services: S3, Glue, Lambda, Athena, Step Functions, ECS, CloudWatch.
- Proficiency with Terraform for IaC; familiarity with Docker, ECS, and CI/CD pipelines.
- Experience building monitoring, validation, and alerting into pipelines with CloudWatch, Grafana, Prometheus, or Datadog.
- Strong communication skills and ability to collaborate with data scientists, analysts, and product teams.
- A track record of delivering production-ready, scalable AWS pipelines, not just prototypes.
Required Languages
- English: C2 (Proficient)
Average salary range of similar jobs in analytics: $4,000-$6,000.