Senior Data Engineer (PySpark / Data Infrastructure)
We're hiring a Senior Data Engineer to help lead the next phase of our data platform’s growth.
At Forecasa, we provide enriched real estate transaction data and analytics to private lenders and investors. Our platform processes large volumes of public data, standardizes and enriches it, and delivers actionable insights that drive lending decisions.
We recently completed a migration from a legacy SQL-based ETL stack (PostgreSQL/dbt) to PySpark. We're now looking for a senior engineer to take ownership of the new pipeline, maintain and optimize it, and develop new data-driven features for our customers and internal analytics.
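To give candidates a feel for the codebase, here is a minimal sketch of the kind of standardize-and-enrich step the pipeline runs. The bucket paths and column names below are illustrative only, not our actual schema:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("transaction-enrichment").getOrCreate()

# Read raw public-record transactions from S3 (path is hypothetical).
raw = spark.read.parquet("s3a://example-raw/transactions/")

# Standardize and enrich: normalize the county name, derive a
# transaction-year partition column, and drop rows with no sale price.
enriched = (
    raw
    .withColumn("county", F.upper(F.trim(F.col("county"))))
    .withColumn("tx_year", F.year(F.col("recorded_date")))
    .filter(F.col("sale_price").isNotNull())
)

# Write back partitioned by year so downstream jobs can prune partitions.
enriched.write.mode("overwrite").partitionBy("tx_year").parquet(
    "s3a://example-enriched/transactions/"
)
```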
What You’ll Do
- Own and maintain our PySpark-based data pipeline, ensuring stability, performance, and scalability.
- Design and build new data ingestion, transformation, and validation workflows.
- Optimize and monitor data jobs using Airflow, Kubernetes, and S3 (see the orchestration sketch after this list).
- Collaborate with data analysts, product owners, and leadership to define data needs and deliver clean, high-quality data.
- Support and mentor junior engineers working on scrapers, validation tools, and quality monitoring dashboards.
- Contribute to the evolution of our data infrastructure and architectural decisions.
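A large part of the role is keeping jobs like the one above running reliably on a schedule. As a rough sketch of how that orchestration is shaped in Airflow (the DAG id, schedule, and job paths are hypothetical), a daily pipeline might look like:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

# Daily ingest -> enrich -> validate chain; all names and paths are illustrative.
with DAG(
    dag_id="transactions_daily",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=10)},
) as dag:
    ingest = BashOperator(
        task_id="ingest",
        bash_command="spark-submit jobs/ingest_transactions.py {{ ds }}",
    )
    enrich = BashOperator(
        task_id="enrich",
        bash_command="spark-submit jobs/enrich_transactions.py {{ ds }}",
    )
    validate = BashOperator(
        task_id="validate",
        bash_command="spark-submit jobs/validate_transactions.py {{ ds }}",
    )

    ingest >> enrich >> validate
```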
Our Tech Stack
Python • PySpark • PostgreSQL • dbt • Airflow • S3 • Kubernetes • GitLab • Grafana
What We’re Looking For
- 5+ years of experience in data engineering or backend systems with large-scale data processing.
- Strong experience with PySpark, including building scalable data pipelines and working with large datasets.
- Solid command of SQL, data modeling, and performance tuning (especially in PostgreSQL).
- Experience with orchestration tools such as Airflow and with containers (Docker/Kubernetes).
- Familiarity with cloud storage (preferably S3) and modern CI/CD workflows.
- Ability to work independently and communicate clearly in a remote, async-first environment.
Bonus Points
- Background in real estate or financial data.
- Experience with data quality frameworks or observability tools (e.g., Great Expectations, Grafana, Prometheus).
- Experience optimizing PySpark jobs for performance and cost-efficiency (one common technique is sketched below).
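For that last point, one concrete flavor of the work: when a large fact table is joined against a small reference table, broadcasting the small side avoids shuffling the large one, which is often the first performance and cost win. The paths and join key below are illustrative:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("join-tuning").getOrCreate()

transactions = spark.read.parquet("s3a://example-enriched/transactions/")
counties = spark.read.parquet("s3a://example-ref/county_codes/")  # small lookup

# Broadcasting the small table ships it to every executor, so the large
# transactions table is joined locally instead of being shuffled.
joined = transactions.join(F.broadcast(counties), on="county_fips", how="left")
```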