Forecasa

Joined in 2021

Data Scientist / Quantitative Risk Analyst

Full Remote · Worldwide · 4 years of experience · English - C2 · Fintech

Product

About Forecasa

Forecasa is a profitable, founder‑led SaaS company that turns raw real‑estate transaction data into decision‑grade intelligence for hedge funds, private lenders, and MBS desks. We move fast, value autonomy with accountability, and maintain a culture where clear documentation beats hierarchy.

What you’ll do

Engineer risk‑focused features (borrower, lender, property, geography) in Python/PySpark.
Develop and validate PD / LGD models using WoE, IV, logistic regression, GBM, XGBoost, or similar.
Prototype lender‑health metrics (capital‑diversification, portfolio turnover, market concentration, etc.) for client dashboards.
Create robust, reproducible data pipelines (git‑versioned, unit‑tested, CI in GitLab).
Produce concise notebooks & dashboards that can feed automated PDF reports.

Must‑have qualifications

4 – 6+ years in data science, risk analytics, or credit‑modeling.
Strong Python (pandas, NumPy, scikit‑learn) and SQL; solid PySpark experience on distributed data is a big plus.
Hands‑on experience building or validating credit‑risk or fraud models (PD, scorecards, Basel/IFRS 9, etc.).
Fluency in statistics (inferential tests, multicollinearity, model monitoring).
Git workflow, code review discipline, and comfort with Agile/Kanban boards.
Clear written & spoken English; able to summarize findings for non‑technical stakeholders.

Nice‑to‑haves

Familiarity with U.S. mortgage or private‑lending data.
Experience with Postgres, MinIO/S3, or dbt.
Knowledge of BI/visualization tools (Plotly, PowerBI, Looker, etc.).
Prior work in a fully remote, internationally‑distributed team.

How we work

Stack: Python • PySpark • PostgreSQL/Snowflake • GitLab CI • AWS & on‑prem Spark
Communication: Slack, Zoom, Notion. Meetings kept lean; deliverables drive the schedule.
Culture: Low‑ego, high‑ownership. We favor clarity, rapid feedback loops, and well‑documented processes.

Data Engineer – Web Scraping/Data Quality (AI-Augmented)

Forecasa

$$$

Full Remote · Worldwide · 1 year of experience · English - C2 · Fintech

Product

We're looking for a sharp, curious, and driven Data Engineer to join Forecasa, a U.S.-based data startup delivering high-quality real estate data and analytics to lenders and investors.

You'll be part of our Data Acquisition & Quality team, helping scale and improve the systems that collect, validate, and monitor the data powering our platform. We've built serious AI-augmented development workflows internally - Claude Code, autonomous agents, GitLab-based orchestration - and we're looking for engineers who've already formed their own opinions about how to work effectively with these tools.

What You'll Do

Develop and maintain Python-based web scrapers, using AI coding assistants to accelerate development while applying your judgment on reliability and edge cases
Use tools like Selenium, BeautifulSoup, Pandas, and PySpark to extract and normalize data efficiently
Package scrapers as Docker containers and deploy them to Kubernetes
Create and manage Airflow DAGs to orchestrate scraping pipelines
Build data validation pipelines to catch anomalies, missing values, and inconsistencies
Review and refine AI-generated code for production reliability
Set up Grafana dashboards to monitor pipeline health and data quality metrics

Our Tech Stack

Python · PySpark · Selenium · Airflow · Pandas · Postgres · S3 · Docker · Kubernetes · GitLab · Grafana · Claude Code

What We're Looking For

Solid Python experience, especially building web scrapers
Familiarity with Selenium, BeautifulSoup, or Scrapy
Some experience with Docker, Airflow, or other orchestration tools
Active use of AI coding tools (Claude Code, Cursor, Copilot, etc.) with opinions about what works and what doesn't - we want to hear about your preferred workflows
Strong code review instincts - you can spot issues in code whether you wrote it or an AI did
A resourceful, problem-solving mindset - not afraid to dig into a messy site or debug a flaky scraper

Bonus Points For

Experience with Grafana or Prometheus for monitoring
Exposure to cloud platforms (AWS preferred) and managing scrapers at scale
Familiarity with CI/CD and Git workflows (we use GitLab)

About Us

Forecasa delivers enriched real estate transaction data to private lenders and institutional investors. We're a small, fast-moving team with a strong engineering culture. We've invested heavily in AI-augmented development - autonomous coding agents, GitLab orchestration - and we're looking for people who are already bought in on this direction, not people we need to convince.

Location

Remote – we welcome candidates from anywhere in the world.

To Apply

Tell us about your current AI coding workflow - what tools you use, what you've learned, what you'd do differently. Generic applications without this will be deprioritized.
