Senior AI Data Engineer (Python) to $8000

About Pulse Intelligence

Pulse Intelligence is building the definitive data platform for the global mining industry. We aggregate, process, and enrich data from hundreds of sources (regulatory filings, stock exchanges, company websites, news, and financial APIs) to give mining investors and analysts a real-time, comprehensive view of every mining asset, company, and commodity on the planet.
Our platform combines large-scale web scraping with LLM-powered data extraction to turn unstructured documents (NI 43-101 technical reports, RNS announcements, SEDAR filings) into structured, queryable intelligence. We're a small team shipping fast, and every engineer has an outsized impact on the product.
About the Role

We're looking for a Senior AI Data Engineer to take ownership of our entire data pipeline, from raw document ingestion through AI-powered extraction to clean, structured records in our database. You'll be the technical lead on data acquisition and enrichment: architecting scrapers for new sources, designing LLM extraction strategies, making decisions on data modeling, and driving the quality and coverage of our mining asset database.
This is a high-autonomy role for someone who can see the big picture and execute on the details. You'll decide which data sources to prioritise, how to structure extraction pipelines, and when to invest in automation vs. manual curation. You'll ship scrapers one day, redesign an entity extraction pipeline the next, and mentor the team on best practices throughout.
What You'll Do

  • Own data acquisition and scraping - identify, prioritise, and build scrapers for new data sources (exchanges, regulatory filings, company websites, financial APIs) and scale them to run reliably in production
  • Design LLM extraction pipelines - architect and iterate on prompt-driven pipelines that extract structured mining data (assets, production, reserves, companies) from unstructured documents
  • Build the document processing pipeline - take raw PDFs, HTML, and filings from ingestion through to clean, structured data using OCR, parsing, deduplication, and text normalisation
  • Drive data quality and coverage - design verification, deduplication, and enrichment workflows, and own the data model that keeps our mining asset database accurate and well-structured
  • Keep pipelines running - monitor scheduled jobs, design for failure recovery, and ensure the system scales without manual intervention
What You Need

  • 5+ years of Python in data engineering or backend development
  • Web scraping at scale - you've built and maintained production scrapers (Scrapy, Playwright, Selenium, or similar)
  • Prompt engineering - you've used LLM APIs (OpenAI, Anthropic, or similar) to extract structured data from unstructured text, and you iterate on prompts systematically
  • Strong SQL and data modeling - you've designed schemas and optimised queries in PostgreSQL or similar
  • Self-directed - you identify what needs doing and drive it to completion with minimal oversight
Nice to Haves

  • Mining or resources industry knowledge (NI 43-101, JORC, resource classifications)
  • AWS (S3, EKS) or similar cloud infrastructure
  • LLM self-verification, chain-of-thought, or agentic pipelines
  • Experience with workflow orchestration tools (Airflow, Dagster, or similar)
  • Experience mentoring engineers or leading a small data team
Benefits

  • Work on a product that maps the entire global mining industry
  • Small team - your work directly shapes the product
  • Remote-friendly with flexible hours
  • Equity in a growing platform
Hiring Process

  • Introductory call - 30 minutes
  • Take-home challenge - 6 hours
  • Technical & cultural fit interview - 1 hour
  • System design interview - 1 hour
  • Final chat with CEO - offer within 48 hours

Required skills

  • Python - 5 years
  • SQL, AWS, Django, Celery, RabbitMQ, LLMs, Apache Airflow, Dagster

Required languages

  • English - C2 (Proficient)
Published 20 February