Senior Data Engineer
We are looking for a talented Senior Data Engineer to join our client’s team. The main objective of the project is to implement clinical solutions for drug development and testing. The process covers all stages up to drug manufacturing, and the solution is designed in a modular fashion.
As a Senior Data Engineer, you will lead the end-to-end design and implementation of the Data Surveillance Platform on Databricks, ensuring alignment with operational, analytical, and regulatory requirements. You will define the data lakehouse architecture, develop ingestion and transformation pipelines using Python and Databricks workflows, and collaborate closely with data engineers, analysts, and stakeholders.
This is you
- 5+ years of experience as a Data Engineer
- Strong expertise in Azure cloud services
- Understanding of modern Data Platforms and Data Lakes
- Strong knowledge of SQL and database management
- Proficiency in Python and PySpark
- English — Upper-intermediate or higher
Nice-to-have skills:
- Experience with building AI/ML on top of Databricks
- Experience in the Healthcare domain
This is your role
- Design and develop highly scalable data ingestion and processing frameworks to transform a variety of datasets, capture metadata and lineage, and implement data quality checks
- Lead the end-to-end design and implementation of the Data Surveillance Platform on Databricks, ensuring the architecture aligns with operational, analytical, and regulatory requirements
- Define the data lakehouse structure, covering the bronze, custom client, silver, and gold layers, and establish standards for data modeling, schema evolution, and data governance
- Design and build ingestion and transformation pipelines using Python (PySpark) and Databricks workflows, implementing automated data harmonization, quality validation, and mapping logic
- Develop and enforce audit trail mechanisms, ensuring all data processing is versioned and traceable
- Assemble large and complex data sets to meet functional and non-functional business requirements
- Identify opportunities for internal process improvements, such as automating manual processes, optimizing data delivery, and improving infrastructure scalability
- Optimize the performance of queries and data processing
- Monitor the health and performance of queries
- Collaborate with data engineers, data analysts, and other stakeholders to understand their requirements and provide data solutions
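To make the quality-validation and quarantine responsibility above concrete, here is a minimal sketch in plain Python of the kind of record-level check a bronze-to-silver pipeline step might apply before promoting data. All field names, thresholds, and function names are hypothetical illustrations, not part of the client's actual platform.

```python
# Illustrative sketch of record-level data quality validation.
# Field names and the range check below are hypothetical examples.

REQUIRED_FIELDS = ("subject_id", "visit_date", "result")

def validate_record(record: dict) -> list[str]:
    """Return a list of quality issues found in one record (empty list = clean)."""
    issues = []
    # Completeness check: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            issues.append(f"missing:{field}")
    # Validity check: numeric results must fall in an assumed plausible range.
    result = record.get("result")
    if isinstance(result, (int, float)) and not (0 <= result <= 1000):
        issues.append("out_of_range:result")
    return issues

def split_by_quality(records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Route clean records onward; quarantine bad ones annotated with their issues."""
    clean, quarantined = [], []
    for rec in records:
        issues = validate_record(rec)
        if issues:
            quarantined.append({**rec, "_issues": issues})
        else:
            clean.append(rec)
    return clean, quarantined
```

In a Databricks pipeline the same pattern would typically be expressed as PySpark column expressions writing failed rows to a quarantine table, which also supports the audit-trail requirement since quarantined rows keep the reason they were rejected.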
Required skills and experience
| Skill | Experience |
| --- | --- |
| Azure | 4 years |
| PySpark | 4 years |
| Data Lake | 3.5 years |
| SQL (basic level) | 3.5 years |
Required languages
| Language | Level |
| --- | --- |
| English | B2 - Upper Intermediate |