Data Engineer

Lead the development and scaling of our scientific knowledge graph—ingesting, structuring, and enriching massive datasets from research literature and global data sources into meaningful, AI-ready insights. 

 

Requirements: 

- Strong experience with knowledge graph design and implementation (Neo4j, RDFLib, GraphQL, etc.). 

- Advanced Python for data engineering, ETL, and entity processing (Spark/Dask/Polars). 

- Proven track record with large dataset ingestion (tens of millions of records). 

- Familiarity with life-science or biomedical data (ontologies, research metadata, entity linking). 

- Experience with orchestration tools (Airflow/Dagster/dbt) and research data APIs (OpenAlex, ORCID, PubMed).

- Strong sense of ownership, precision, and a delivery mindset.

Nice to Have:

- Domain knowledge in life sciences, biomedical research, or related data models. 

- Experience integrating vector/semantic embeddings (Pinecone, FAISS, Weaviate).
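For illustration, a minimal sketch of the kind of pipeline this role involves: fetching publication metadata from the public OpenAlex works API and merging it into Neo4j as a small Author/Work graph. The connection details and the node/relationship model here are assumptions made for the sketch, not the team's actual schema, and the neo4j 5.x Python driver is assumed.

```python
import requests
from neo4j import GraphDatabase

# Assumed connection details -- replace with real credentials.
NEO4J_URI = "bolt://localhost:7687"
NEO4J_AUTH = ("neo4j", "password")


def fetch_works(query: str, per_page: int = 25) -> list[dict]:
    """Fetch publication metadata from the public OpenAlex works endpoint."""
    resp = requests.get(
        "https://api.openalex.org/works",
        params={"search": query, "per-page": per_page},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["results"]


def load_work(tx, work: dict) -> None:
    """MERGE one work and its authors into an assumed (Author)-[:AUTHORED]->(Work) model."""
    tx.run(
        "MERGE (w:Work {openalex_id: $id}) SET w.title = $title",
        id=work["id"],
        title=work.get("display_name"),
    )
    for authorship in work.get("authorships", []):
        author = authorship.get("author", {})
        if not author.get("id"):
            continue
        tx.run(
            """
            MERGE (a:Author {openalex_id: $aid})
            SET a.name = $name
            WITH a
            MATCH (w:Work {openalex_id: $wid})
            MERGE (a)-[:AUTHORED]->(w)
            """,
            aid=author["id"],
            name=author.get("display_name"),
            wid=work["id"],
        )


if __name__ == "__main__":
    works = fetch_works("CRISPR gene editing")
    with GraphDatabase.driver(NEO4J_URI, auth=NEO4J_AUTH) as driver:
        with driver.session() as session:
            for work in works:
                session.execute_write(load_work, work)
```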

 

We offer:

• Attractive financial package

• Challenging projects

• Professional & career growth

• Great atmosphere in a small, friendly team

Required skills and experience:

Neo4j: 4 years
GraphQL: 4 years
Python: 5 years
ETL: 4 years
Spark: 5 years
Airflow: 4 years

Required languages:

English: B2 (Upper Intermediate)