Junior Data Engineer – (Python/Web Scraping/Data Quality)

We’re looking for a sharp, curious, and driven Junior Data Engineer to join our team at Forecasa, a U.S.-based data startup focused on delivering high-quality real estate data and analytics to lenders and investors.

In this role, you’ll be part of our Data Acquisition & Quality team, helping us scale and improve the systems that collect, validate, and monitor the data that powers our platform.

What You’ll Do

  • Develop and maintain Python-based web scrapers to collect structured and unstructured data from various sources.
  • Use tools like Selenium, BeautifulSoup, Pandas, and PySpark to extract and normalize data efficiently.
  • Package scrapers as Docker containers and deploy them to Kubernetes.
  • Create and manage Airflow DAGs to orchestrate and schedule scraping pipelines.
  • Build data validation pipelines to catch anomalies, missing values, and data inconsistencies.
  • Set up Grafana dashboards to monitor pipeline health and data quality metrics.
  • Collaborate with senior engineers to continuously improve scraper reliability, performance, and coverage.

Our Tech Stack

Python • PySpark • Selenium • Airflow • Pandas • Postgres • S3 • Docker • Kubernetes • GitLab • Grafana

What We're Looking For

  • Solid experience in Python, especially in building web scrapers.
  • Familiarity with libraries like Selenium, BeautifulSoup, or Scrapy.
  • Some experience with Docker, Airflow, or other workflow orchestration tools.
  • Basic understanding of data validation, data cleaning, and monitoring best practices.
  • A resourceful, problem-solving mindset — you’re not afraid to dig into a messy site or debug a flaky scraper.

Bonus Points For

  • Experience working with Grafana or Prometheus for monitoring.
  • Exposure to cloud platforms (AWS preferred) and experience managing scrapers at scale.
  • Familiarity with CI/CD and Git workflows (we use GitLab).

About Us

Forecasa is a U.S.-based startup delivering enriched real estate transaction data to private lenders and investors. We’re a small, fast-moving team with a strong engineering culture and a mission to bring clarity and transparency to a fragmented market.

Location

Remote – we welcome candidates from anywhere in the world.

NOTE: Please send all emails and other communications through the Djinni website. Thank you.
