Data Engineer
Lead the development and scaling of our scientific knowledge graph: ingesting, structuring, and enriching large datasets from research literature and global data sources, and turning them into meaningful, AI-ready insights.
Requirements:
- Strong experience with knowledge graph design and implementation (Neo4j, RDFLib, GraphQL, etc.).
- Advanced Python for data engineering, ETL, and entity processing (Spark/Dask/Polars).
- Proven track record with large dataset ingestion (tens of millions of records).
- Familiarity with life-science or biomedical data (ontologies, research metadata, entity linking).
- Experience with Airflow/Dagster/dbt and data APIs (OpenAlex, ORCID, PubMed).
- Strong sense of ownership, precision, and a delivery mindset.
Nice to have:
- Domain knowledge in life sciences, biomedical research, or related data models.
- Experience integrating vector/semantic embeddings (Pinecone, FAISS, Weaviate).
We offer:
• Attractive financial package
• Challenging projects
• Professional & career growth
• Great atmosphere in a small, friendly team
Required skills and experience
| Neo4j | 4 years |
| GraphQL | 4 years |
| Python | 5 years |
| ETL | 4 years |
| Spark | 5 years |
| Airflow | 4 years |
Required languages
| English | B2 - Upper Intermediate |