Junior Data Engineer IRC262232
Description
The Digital Health organization is a technology team focused on next-generation Digital Health capabilities that deliver on the Medicine mission and vision of Insight Driven Care. This role will operate within the Digital Health Applications & Interoperability subgroup of the broader Digital Health team, which focuses on patient engagement, care coordination, AI, healthcare analytics, and interoperability, among other advanced technologies that enhance our product portfolio with new services while improving clinical and patient experiences.
The project is a cloud-based PaaS ecosystem built with a privacy-by-design approach. It provides a centralized platform to store, classify, and control access to federated datasets in a scalable, secure, and efficient manner.
The ecosystem will allow Customer Operating Units (medical device departments) to store federated datasets of varying sizes and formats and to control access to those datasets through data stewards. Source datasets can be exposed to qualified use cases and workflows through different project types.
The Healthcare Data Platform ecosystem will provide ML/AI project capabilities for streamlined development processes and an ML/AI workbench to enhance data exploration, wrangling, and model training.
In the queue: 15+ OUs. At the moment, the data platform is working with the Neuro, Cardio, and Diabetes OUs, but more OUs may come with requirements in the future.
GL role: to enhance current capabilities, including taking over the work that the AWS ProServe team is doing, and to implement the new requirements that will keep coming from different OUs in the future.
Requirements
Python, Data Engineering, Data Lake or Lakehouse, Apache Iceberg (nice to have), Parquet
Good communication skills; proactive and able to take initiative
MUST HAVE
- AWS Platform: Working experience with AWS data technologies, including S3, AWS RDS, Lake Formation
- Programming Languages: Strong programming skills in Python
- Data Formats: Experience with JSON, XML and other relevant data formats
- CI/CD Tools: Ability to deploy through established CI/CD pipelines built with GitLab CI, Jenkins, Terraform, or similar tools
- Scripting and Automation: Experience with scripting languages such as Python and PowerShell
- Monitoring and Logging: Familiarity with monitoring & logging tools like CloudWatch, Splunk, ELK, Dynatrace, Prometheus
- Source Code Management: Expertise with GitLab
- Documentation: Experience with Markdown and, in particular, Antora for creating technical documentation
NICE TO HAVE
- Previous Healthcare or Medical Device experience
- Experience implementing enterprise-grade cybersecurity and privacy by design in software products
- Experience working in Digital Health software
- Experience developing global applications
- Strong understanding of SDLC; experience with Agile methodologies
- Software estimation
- Experience leading software development teams onshore and offshore
- Experience with FHIR
Job responsibilities
- Implement data pipelines using AWS services such as AWS Glue, Lambda, and Kinesis
- Implement integrations between the data platform and systems such as Atlan and Trino/Starburst
- Complete logging and monitoring tasks through AWS and Splunk toolsets
- Develop and maintain ETL processes to ingest, clean, transform and store healthcare data from various sources
- Optimize data storage solutions using Amazon S3, AWS RDS, Lake Formation, and other AWS technologies
- Document, configure, and maintain system specifications that conform to defined architecture standards and address business requirements and processes in cloud development and engineering.
- Participate in planning system development and deployment, and take responsibility for meeting compliance and security standards.
- Actively identify system functionality or performance deficiencies, execute changes to existing systems, and test functionality of the system to correct deficiencies and maintain more effective data handling, data integrity, conversion, input/output requirements, and storage.
- Document testing and maintenance of system updates, modifications, and configurations.
- Leverage platform process expertise to assess whether existing standard platform functionality will solve a business problem or whether a custom solution is required.
- Test the quality of a product and its ability to perform a task or solve a problem.
- Perform basic maintenance and performance optimization procedures in each of the primary operating systems.
- Ensure system implementations comply with global and local regulatory and security standards (e.g., HIPAA, SOC 2, ISO 27001)