Senior Data Engineer (Python)


We build one of the world’s largest property intelligence datasets by processing satellite and street-level imagery across the entire United States.

We are looking for a Senior Python Data Engineer to design and operate large-scale data pipelines that generate property attributes from imagery using internal AI models.

This role focuses on building distributed processing systems running on Kubernetes clusters and on-prem GPU infrastructure, capable of processing massive geospatial datasets at national scale.

You will work closely with AI engineers and SaaS platform teams to build robust pipelines that transform imagery into structured property intelligence used by the insurance and financial industries.

 

Responsibilities

• Design and build large-scale Python processing pipelines for satellite and street-level imagery

• Develop distributed workloads running on Kubernetes clusters

• Build and maintain orchestration workflows using Airflow

• Design and manage data storage and processing workflows using PostgreSQL

• Process geospatial datasets using GeoPandas, GDAL, Shapely, or similar tools

• Integrate large-scale AI inference pipelines running on GPU infrastructure

• Optimize processing performance for massive imagery datasets

• Improve observability, monitoring, and reliability of production pipelines

• Collaborate with AI engineers to support large-scale computer vision inference workflows

• Integrate pipelines with AWS-based services

 

Requirements

• 7+ years of Python engineering experience

• Strong experience designing distributed data pipelines or large-scale data processing systems

• Experience running workloads on Kubernetes

• Hands-on experience with Airflow or similar orchestration frameworks

• Experience processing large datasets (imagery, geospatial, or other high-volume data)

• Strong understanding of ETL architectures and distributed data systems

 

Nice to Have

• Experience with geospatial processing (GeoPandas, GDAL, Rasterio, Shapely)

• Experience working with satellite imagery, aerial imagery, or remote sensing datasets

• Experience with GPU-based inference pipelines

• Experience with distributed processing frameworks such as Spark, Ray, or Dask

• Experience working with on-prem compute clusters

• Experience with C#

 

Why Join

You will be working on a platform that processes imagery data at national scale to generate property intelligence for the insurance and financial industries. The system analyzes hundreds of millions of properties using AI and geospatial processing, creating one of the most comprehensive property datasets in the world.

Required skills and experience

• Python: 7 years
• Distributed data pipelines / big data systems: 4 years
• Kubernetes: 4 years
• Large-scale data processing: 4 years
• PostgreSQL: 2 years
• AWS: 2 years
• Observability and monitoring: 3 years
• AI/ML inference pipelines: 1 year
• C#: 6 months

Required languages

English C1 - Advanced
Published 4 May