Performance Engineer (Data Platform / Databricks)
We are looking for a specialist to design and implement an end-to-end performance testing framework for a healthcare system running on Databricks and Microsoft Azure. You will build a repeatable, automated approach to measure and improve performance across data ingestion, ETL/ELT pipelines, Spark workloads, serving layers, APIs, security/identity flows, integration components, and presentation/UI, while meeting healthcare-grade security and compliance expectations.
This role sits at the intersection of performance engineering, cloud architecture, and test automation, with strong attention to regulated-domain requirements (privacy, auditability, access controls).
Key Responsibilities
- Design and build a performance testing strategy and framework for a Databricks + Azure healthcare platform.
- Define performance KPIs/SLOs (e.g., pipeline latency, throughput, job duration, cluster utilization, cost per run, data freshness).
- Create workload models that reflect production usage (batch, streaming, peak loads, concurrency, backfills).
- Create a test taxonomy: smoke perf, baseline benchmarks, load, stress, soak/endurance, spike tests, and capacity planning.
- Implement automated performance test suites for:
  - Databricks jobs/workflows (Workflows, Jobs API)
  - Spark/Delta Lake operations (reads/writes, merges, compaction, Z-Ordering where relevant)
  - Data ingestion (ADF, Event Hubs, ADLS Gen2, Auto Loader, etc. as applicable)
- Build test data generation and data anonymization/synthetic data approaches suitable for healthcare contexts.
- Instrument, collect, and analyze metrics from:
  - Spark UI / event logs
  - Databricks metrics and system tables
  - Azure Monitor / Log Analytics
  - Application logs and telemetry (if applicable)
- Produce actionable performance reports and dashboards (trend, regression detection, run-to-run comparability).
- Create performance tests for key user journeys (page load, search, dashboards) using appropriate tooling.
- Measure client-side and network timings and correlate them with API/backend performance.
- Integrate performance tests into CI/CD (Azure DevOps or GitHub Actions), including gating rules and baselines.
- Document framework usage, standards, and provide enablement to engineering teams.
Required Qualifications
- Proven experience building performance testing frameworks (not just executing tests), ideally for data platforms.
- Strong hands-on expertise with Databricks and Apache Spark performance tuning and troubleshooting.
- Strong knowledge of Azure services used in data platforms (commonly ADLS Gen2, ADF, Key Vault, Azure Monitor/Log Analytics; others as relevant).
- Strong programming/scripting ability in Python and/or Java/TypeScript.
- Familiarity with load/performance tools and approaches (e.g., custom harnesses, Locust/JMeter/k6 where appropriate, or Spark-specific benchmarking).
- Ability to design repeatable benchmarking (baseline creation, environment parity, noise reduction, statistical comparison).
- Understanding of data security and compliance needs typical for healthcare (e.g., HIPAA-like controls, access management, auditability; adapt to your jurisdiction).
- High-level proficiency in English.
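The statistical-comparison requirement above can be illustrated with a small sketch: compare a candidate run's timings against a stored baseline using medians (to dampen run-to-run noise) and flag a regression past a tolerance. Function and field names here are hypothetical, not from any existing framework.

```python
from statistics import median


def regression_verdict(baseline_runs, candidate_runs, threshold_pct=10.0):
    """Compare candidate timings (seconds) against a baseline.

    Medians reduce sensitivity to outlier runs; a regression is flagged
    when the candidate median exceeds the baseline median by more than
    `threshold_pct` percent.
    """
    base = median(baseline_runs)
    cand = median(candidate_runs)
    delta_pct = (cand - base) / base * 100.0
    return {
        "baseline_median_s": base,
        "candidate_median_s": cand,
        "delta_pct": round(delta_pct, 2),
        "regression": delta_pct > threshold_pct,
    }
```

A verdict like this is also what a CI/CD gating rule would consume: fail the pipeline when `regression` is true.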
Nice-to-Have / Preferred
- Experience with Delta Lake optimization (OPTIMIZE, ZORDER, liquid clustering where applicable), streaming performance, and structured streaming.
- Experience with Terraform/IaC for reproducible test environments.
- Knowledge of Unity Catalog, data governance, and fine-grained access controls.
- Experience with OpenTelemetry tracing and correlation across UI → API → data workloads.
- FinOps mindset: performance improvements tied to cost efficiency on Databricks/Azure.
- Prior work on regulated domains (healthcare, pharma, insurance).
Working Model
- Contract
- Remote
- Collaboration with Data Engineering, Platform Engineering, Security/Compliance, and Product teams.
Required languages

| Language | Level |
| --- | --- |
| English | B2 - Upper Intermediate |