Senior Data Engineer
Altss is redefining investor intelligence. We aggregate OSINT-verified insights, from public filings to niche industry sources, so asset-management teams can act on signals in real time. Our data backbone spans AWS, Azure, and GCP, with Prefect-orchestrated pipelines feeding a Postgres core and a columnar lakehouse. As our senior data engineer, you will own and scale that backbone as we push toward 100M+ contacts and billions of relationships.
Who You Are
- A builder at heart who enjoys turning ambiguity into elegant, reliable systems.
- A data craftsperson who sweats schema design, partitioning strategy, and pipeline efficiency.
- An ownership-driven engineer comfortable running infrastructure when it affects data quality or uptime.
- A pragmatic communicator who documents clearly and keeps distributed teams aligned.
- A mentor who raises the bar through thoughtful code reviews and shared standards.
What You’ll Do
- Design and operate high-throughput ETL/ELT and streaming pipelines that ingest, deduplicate, and enrich OSINT sources.
- Model entities (people, firms, funds) and implement deterministic and probabilistic matching to build a clean relationship graph; a minimal sketch of that two-stage matching follows this list.
- Define storage strategies across Postgres, S3/GCS object stores, and analytical layers built on columnar formats such as Parquet and table formats such as Iceberg.
- Implement data-quality gates, lineage tracking, and automated regression tests.
- Establish monitoring, alerting, and disaster recovery for a 24/7 product, leveraging Terraform, Docker, and Kubernetes where useful.
- Collaborate with parsing, product, and front-end teams to ship customer-facing features on schedule.
- Optimise cost and performance as workloads shift among AWS, Azure, and GCP.
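
As a rough illustration of the two-stage matching mentioned above, here is a minimal Python sketch; the record shape, normalization, and 0.92 threshold are assumptions for illustration, not Altss internals:

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional

@dataclass(frozen=True)
class FirmRecord:
    source_id: str
    name: str
    reg_number: Optional[str]  # e.g., a company-registry number, when a source provides one

def normalize(name: str) -> str:
    """Cheap normalization so trivially different spellings collide."""
    return " ".join(name.lower().replace(",", " ").replace(".", " ").split())

def is_match(a: FirmRecord, b: FirmRecord, threshold: float = 0.92) -> bool:
    # Deterministic pass: a shared registry number is treated as ground truth.
    if a.reg_number and a.reg_number == b.reg_number:
        return True
    # Probabilistic fallback: fuzzy name similarity above a tuned threshold.
    score = SequenceMatcher(None, normalize(a.name), normalize(b.name)).ratio()
    return score >= threshold

# Two records for the same firm under slightly different names:
x = FirmRecord("filing-1", "Acme Capital Management, LLC", None)
y = FirmRecord("registry-7", "ACME Capital Management LLC", None)
print(is_match(x, y))  # True
```

In production this comparison would sit behind blocking keys so you never score all pairs, but the deterministic-then-probabilistic shape stays the same.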
Must-Have Experience
- 5+ years of Python building production data pipelines or micro-services.
- Hands-on orchestration with Prefect, Airflow, Dagster, or similar, running large DAGs (10K+ tasks/day); see the fan-out sketch after this list.
- Deep knowledge of Postgres tuning and partitioning, plus experience with at least one columnar warehouse (e.g., Redshift, BigQuery).
- Practical experience with Kafka, Pulsar, or Kinesis in multi-region setups.
- Proficiency with Terraform or CDK and day-to-day Linux, Docker, and Kubernetes.
- Track record of shipping data products that handle billions of rows or terabytes of data.
- Strong written and spoken English, with working hours that overlap European and Eastern US time zones.
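
To make the DAG-scale point above concrete, here is a minimal Prefect 2.x sketch of the mapped fan-out pattern that lets a small flow expand into thousands of task runs per day; the source list and task bodies are hypothetical placeholders:

```python
from prefect import flow, task

@task(retries=2, retry_delay_seconds=30)
def ingest(source_url: str) -> dict:
    # Placeholder fetch; a real task would pull and parse one OSINT source.
    return {"url": source_url, "records": 0}

@task
def load(batch: dict) -> None:
    # Placeholder load into Postgres or the lakehouse.
    print(f"loaded {batch['records']} records from {batch['url']}")

@flow(log_prints=True)
def nightly_ingest(sources: list[str]) -> None:
    # .map fans one task run out per source, so a flow over thousands of
    # sources becomes a DAG of thousands of task runs without extra code.
    batches = ingest.map(sources)
    load.map(batches)

if __name__ == "__main__":
    nightly_ingest(["https://example.com/filings", "https://example.com/registry"])
```

Retries and delays are declared per task, which is typically where flaky upstream sources get handled rather than in the flow itself.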
Nice-to-Have
- Experience mining public registries, regulatory filings, or other hard-to-reach OSINT sources.
- Familiarity with graph databases such as Neo4j or ArangoDB, or large-scale entity-resolution techniques.
- Background in finance, compliance, or other data-sensitive domains (SOC 2, GDPR).
- Exposure to cost-control strategies in multi-cloud environments.
Why Altss
- Immediate impact: your code powers critical decisions at leading asset-management firms from day one.
- Remote-first, async culture with minimal bureaucracy and rapid decision cycles.
- Autonomy: freedom to choose the tools and processes that keep the data flowing smoothly.