Lead Data Engineer

Viders Group

A Lead Data Engineer role focused on building the data backbone behind large-scale scientific knowledge systems, including ingestion, transformation, semantic modeling, and high-efficiency access layers.

This position centers on the creation of an AI-ready knowledge graph that connects research papers, ontology data, experts, and institutions through scalable pipelines and vector-based retrieval.

Role Objective

Ownership of data architecture and pipeline scalability, with direct responsibility for designing a fast, reliable, and extensible system capable of processing scientific data at scale. The role involves shaping technical direction, implementing distributed ingestion workloads, and enabling AI-driven insights through modern graph and vector data technologies.

Key Responsibilities

Design and maintain scalable, end-to-end data pipelines for ingestion and processing
Build and optimize a scientific knowledge graph and related taxonomies
Implement semantic search and vector indexing workflows
Create and evolve a unified data access layer for internal consumption
Use AI tools to accelerate ideation, automation, and delivery
Apply a fast-iteration delivery approach with strong documentation habits
Work cross-functionally with AI, engineering, and product stakeholders

Technical Requirements

Strong Python skills for ETL, data manipulation, and graph workflows
Experience with distributed computation (Spark, Dask, Polars, or equivalents)
Knowledge graph design and graph data technologies (Neo4j, RDFLib, property graphs)
Familiarity with vector database technologies (FAISS, Pinecone, Qdrant, Weaviate)
Experience with ETL orchestration (Airflow, Dagster, dbt, or custom)
Ability to work with formats including Parquet, JSONL, CSV, and RDF (e.g. Turtle)
Experience working with public research data APIs (OpenAlex, ORCID, PubMed) is a plus

Expected Approach & Mindset

System-level thinking with a preference for simplicity and speed
Independent work discipline with clear communication practices
Comfort working with incomplete, inconsistent scientific datasets
Execution-driven mindset: shipping over perfection
Ability to collaborate across disciplines without heavy process overhead

Work Philosophy

This role suits engineers who enjoy technical ownership, autonomy, and building modern, AI-integrated data systems from the ground up. The environment favors curiosity, pragmatic decision-making, high transparency, and rapid iteration cycles instead of rigid process structures.

Why This Role Matters

The resulting platform directly contributes to accelerating scientific discovery by transforming raw research data into searchable, interconnected knowledge accessible through modern AI systems. The impact is measurable, broad, and meaningful.

Required languages

English: C1 (Advanced)

The job ad is no longer active


At least 6 years of experience
Full remote
Countries where we consider candidates: Worldwide

Data Engineer

Employment: Full-time
Domain: Machine Learning / Big Data
Product
