Senior Infrastructure Engineer

We’re building a next-generation intelligence platform that extracts, enriches, and organizes complex institutional and investor data from across the public web — turning it into structured insight using cutting-edge LLMs and scalable pipelines.

You’ll be the founding engineer responsible for owning the entire data infrastructure: from orchestration to enrichment, from pipeline reliability to model deployment. You’ll work directly with the founder and a dedicated LLM engineer to build the brain of the platform.

This is a hands-on, high-impact role for a senior engineer who moves fast and builds for scale — someone who can set up Prefect DAGs, productionize LangChain pipelines, handle massive parser output, and deploy open-source LLMs when needed.


What You’ll Do

  • Design and run Prefect 2.0 workflows for parsing, enrichment, and QA feedback loops
  • Own LangChain-based enrichment pipelines: vectorization, chunking, RAG, summarization, deduplication
  • Deploy, scale, and monitor open-source LLMs (e.g., Mistral, Zephyr) using tools like vLLM or TGI
  • Build and maintain high-reliability ingestion pipelines that handle unstructured inputs from 5M+ pages
  • Collaborate with the LLM engineer to structure training data and support fine-tuning workflows
  • Set up confidence scoring, retry logic, logging, and low-confidence QA routing
  • Manage performance and cost of GPU resources (Lambda Labs, HuggingFace, or AWS)
  • Build data integrity checks, schema validators, and batch loaders for Postgres and vector DBs
  • Help scale the system to support 1M+ investor profiles and daily data refresh cycles
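To give a flavor of the work, here is a minimal sketch of the confidence scoring and low-confidence QA routing mentioned above. It is illustrative only: the threshold, field names, and `Router` class are hypothetical assumptions, not part of the actual platform.

```python
# Illustrative sketch: route records by extraction confidence.
# QA_THRESHOLD and the record fields are assumptions for this example.
from dataclasses import dataclass, field

QA_THRESHOLD = 0.8  # assumed cutoff; records below this go to human QA


@dataclass
class Router:
    accepted: list = field(default_factory=list)
    qa_queue: list = field(default_factory=list)

    def route(self, record: dict) -> str:
        # High-confidence records pass straight through to the store;
        # everything else is queued for manual QA review.
        if record.get("confidence", 0.0) >= QA_THRESHOLD:
            self.accepted.append(record)
            return "accepted"
        self.qa_queue.append(record)
        return "qa"


if __name__ == "__main__":
    router = Router()
    print(router.route({"name": "Fund A", "confidence": 0.95}))  # accepted
    print(router.route({"name": "Fund B", "confidence": 0.42}))  # qa
```

In production this routing would sit inside an orchestrated workflow (e.g. a Prefect task with retries), with the QA queue feeding the feedback loop described above.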

What We’re Looking For

  • 5–10+ years in data, backend, or ML infrastructure roles
  • Deep experience with Prefect 2.0 (or Airflow): orchestration, retries, and alerting
  • Strong command of LangChain, prompt pipelines, vector stores (Qdrant, Weaviate, FAISS)
  • Hands-on experience parsing large-scale web data (Playwright, Scrapy, Puppeteer, proxies, etc.)
  • Able to manage model deployment stacks (vLLM, TGI, Docker, inference serving)
  • Familiar with fine-tuning open-source models using Axolotl, HuggingFace, or PEFT
  • Fluent in Python, async workflows, and infrastructure that scales
  • Understands entity resolution, semantic deduplication, and QA scoring logic
  • Bonus: experience with graph-based entity networks or security-grade crawling systems

Why Join Us

  • You’ll own the full stack of one of the most technically ambitious data intelligence products in alt finance
  • You’ll work alongside a senior LLM engineer, 2 parser engineers, and QA — and ship every week
  • No bureaucracy, no fluff — real product, real adoption, real velocity
  • Remote, async-first team with deep venture + AI background
  • You’ll compete with and outperform companies like PitchBook, Harmonic, and Fintrx, with 1/10th the headcount
