Senior Data Engineer
Altss is redefining investor intelligence. We aggregate OSINT-verified insights, from public filings to niche industry sources, so asset-management teams can act on signals in real time. Our data backbone spans AWS, Azure, and GCP, with Prefect-orchestrated pipelines feeding a Postgres core and a columnar lakehouse. As our senior data engineer, you will own and scale that backbone as we push toward 100M+ contacts and billions of relationships.
Who You Are
- A builder at heart who enjoys turning ambiguity into elegant, reliable systems.
- A data craftsperson who sweats schema design, partitioning strategy, and pipeline efficiency.
- An ownership-driven engineer comfortable running infrastructure when it affects data quality or uptime.
- A pragmatic communicator who documents clearly and keeps distributed teams aligned.
- A mentor who raises the bar through thoughtful code reviews and shared standards.
What You’ll Do
- Design and operate high-throughput ETL/ELT and streaming pipelines that ingest, deduplicate, and enrich OSINT sources.
- Model entities (people, firms, funds) and implement deterministic and probabilistic matching to build a clean relationship graph; a minimal sketch of that two-stage matching follows this list.
- Define storage strategies across Postgres, S3/GCS object stores, and analytical layers built on columnar formats such as Parquet and table formats such as Iceberg.
- Implement data-quality gates, lineage tracking, and automated regression tests.
- Establish monitoring, alerting, and disaster recovery for a 24/7 product, leveraging Terraform, Docker, and Kubernetes where useful.
- Collaborate with parsing, product, and front-end teams to ship customer-facing features on schedule.
- Optimise cost and performance as workloads shift among AWS, Azure, and GCP.
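
As a rough illustration of the two-stage matching mentioned above, here is a minimal Python sketch; the record shape, normalization, and 0.92 threshold are assumptions for illustration, not Altss internals:

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional

@dataclass(frozen=True)
class FirmRecord:
    source_id: str
    name: str
    reg_number: Optional[str]  # e.g., a company-registry number, when a source provides one

def normalize(name: str) -> str:
    """Cheap normalization so trivially different spellings collide."""
    return " ".join(name.lower().replace(",", " ").replace(".", " ").split())

def is_match(a: FirmRecord, b: FirmRecord, threshold: float = 0.92) -> bool:
    # Deterministic pass: a shared registry number is treated as ground truth.
    if a.reg_number and a.reg_number == b.reg_number:
        return True
    # Probabilistic fallback: fuzzy name similarity above a tuned threshold.
    score = SequenceMatcher(None, normalize(a.name), normalize(b.name)).ratio()
    return score >= threshold

# Two records for the same firm under slightly different names:
x = FirmRecord("filing-1", "Acme Capital Management, LLC", None)
y = FirmRecord("registry-7", "ACME Capital Management LLC", None)
print(is_match(x, y))  # True
```

In production this comparison would sit behind blocking keys so you never score all pairs, but the deterministic-then-probabilistic shape stays the same.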
Must-Have Experience
- 5+ years of Python building production data pipelines or micro-services.
- Hands-on orchestration with Prefect, Airflow, Dagster, or similar, running large DAGs (10K+ tasks/day); see the fan-out sketch after this list.
- Deep knowledge of Postgres tuning and partitioning, plus experience with at least one columnar warehouse (e.g., Redshift, BigQuery).
- Practical experience with Kafka, Pulsar, or Kinesis in multi-region setups.
- Proficiency with Terraform or CDK and day-to-day Linux, Docker, and Kubernetes.
- Track record of shipping data products that handle billions of rows or terabytes of data.
- Strong written and spoken English, with working hours that overlap European and Eastern US time zones.
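
To make the DAG-scale point above concrete, here is a minimal Prefect 2.x sketch of the mapped fan-out pattern that lets a small flow expand into thousands of task runs per day; the source list and task bodies are hypothetical placeholders:

```python
from prefect import flow, task

@task(retries=2, retry_delay_seconds=30)
def ingest(source_url: str) -> dict:
    # Placeholder fetch; a real task would pull and parse one OSINT source.
    return {"url": source_url, "records": 0}

@task
def load(batch: dict) -> None:
    # Placeholder load into Postgres or the lakehouse.
    print(f"loaded {batch['records']} records from {batch['url']}")

@flow(log_prints=True)
def nightly_ingest(sources: list[str]) -> None:
    # .map fans one task run out per source, so a flow over thousands of
    # sources becomes a DAG of thousands of task runs without extra code.
    batches = ingest.map(sources)
    load.map(batches)

if __name__ == "__main__":
    nightly_ingest(["https://example.com/filings", "https://example.com/registry"])
```

Retries and delays are declared per task, which is typically where flaky upstream sources get handled rather than in the flow itself.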
Nice-to-Have
- Experience mining public registries, regulatory filings, or other hard-to-reach OSINT sources.
- Familiarity with graph databases such as Neo4j or ArangoDB, or large-scale entity-resolution techniques.
- Background in finance, compliance, or other data-sensitive domains (SOC 2, GDPR).
- Exposure to cost-control strategies in multi-cloud environments.
Why Altss
- Immediate impact: your code powers critical decisions at leading asset-management firms from day one.
- Remote-first, async culture with minimal bureaucracy and rapid decision cycles.
- Autonomy: freedom to choose the tools and processes that keep the data flowing smoothly.