Python Backend Engineer

About the Project

Our client is a mission-driven, technology-focused non-profit organization founded by leading tech entrepreneurs.

The organization aims to tackle one of the biggest global challenges — online misinformation — by building innovative AI-driven trust infrastructure.

It operates with the speed and technical rigor of a high-growth startup, combining social impact with deep engineering work.

The team is developing a trust signal platform that helps AI systems evaluate which data sources are reliable.

The first product focuses on Wikipedia, analyzing its entire content to generate trust signals (such as edit stability, author credibility, and conflict history) and exposing them via an API that large AI platforms can query before using the information.
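
For illustration only: the posting names the signal families (edit stability, author credibility, conflict history) but not the API's schema, so every field and value in the sketch below is an assumption rather than the client's actual format.

    # Hypothetical shape of a per-article trust-signal payload that such
    # an API might return; all names and scores here are invented.
    trust_signals = {
        "source": "wikipedia",
        "page": "Climate_change",
        "edit_stability": 0.92,        # assumed 0-1 score
        "author_credibility": 0.87,    # assumed 0-1 score
        "conflict_history": {"reverts_last_90d": 4},
    }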

Future plans include expanding to other major data sources like Reddit, scientific papers, and online news.

Position: Backend Engineer - Data Infrastructure & Web Scraping

We are looking for a Backend Engineer for our client: a specialist with deep expertise in web scraping and data infrastructure who will design and build the foundational systems powering the trust platform.

You'll be responsible for designing and implementing scalable pipelines that collect, process, and serve data from complex platforms at enterprise scale.

Your Challenge — First 3-6 Months

  • Build a proof of concept capable of collecting and processing data from sources such as Wikipedia and Reddit (a minimal sketch follows this list).
  • Design the first data collection, storage, and low-latency processing pipelines.
  • Contribute to defining the architectural model of the B2B API for strategic partners.
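
A minimal sketch of the proof-of-concept bullet above, in Python: it pulls recent revision history for one page from the public MediaWiki Action API (a real, documented endpoint) and derives a toy revert count. The revert heuristic and the User-Agent string are illustrative assumptions, not the client's method.

    import requests

    API = "https://en.wikipedia.org/w/api.php"

    def fetch_revisions(title: str, limit: int = 50) -> list[dict]:
        """Fetch recent revisions (timestamp, user, comment) for one page."""
        params = {
            "action": "query",
            "prop": "revisions",
            "titles": title,
            "rvprop": "timestamp|user|comment",
            "rvlimit": limit,
            "format": "json",
        }
        # Wikimedia API etiquette: identify your client via User-Agent.
        headers = {"User-Agent": "trust-signals-poc/0.1 (contact@example.org)"}
        data = requests.get(API, params=params, headers=headers, timeout=10).json()
        page = next(iter(data["query"]["pages"].values()))
        return page.get("revisions", [])

    # Toy "edit stability" input: count recent edits that look like reverts.
    revisions = fetch_revisions("Climate change")
    reverts = sum("revert" in r.get("comment", "").lower() for r in revisions)
    print(f"{len(revisions)} recent revisions, {reverts} look like reverts")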

Your Challenge — First Year and Beyond

  • Scale the framework to include new data sources (digital books, newspapers, scientific articles, and paywalled content).
  • Structure a metadata and statistics layer that allows AI platforms to decide in real time which sources to trust.
  • Build the first robust commercial API for global AI and technology companies.

Day-to-Day Responsibilities

  • Develop large-scale scraping routines with a focus on performance, accuracy, and source accessibility.
  • Build and optimize data processing pipelines in Python, designing efficient SQL data models (see the schema sketch after this list).
  • Work within AWS (cloud-native), with flexibility for other clouds.
  • Participate in architectural discussions and product decisions with the founding team.
  • Apply creativity and critical thinking to solve unprecedented problems without ready-made playbooks.
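
A sketch of the SQL data-model bullet above, using Python's built-in sqlite3 so it runs as-is; in production this would live in a managed database, and every table and column name here is an assumption for illustration.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE page (
        page_id INTEGER PRIMARY KEY,
        source  TEXT NOT NULL,              -- e.g. 'wikipedia'
        title   TEXT NOT NULL,
        UNIQUE (source, title)
    );
    CREATE TABLE revision (
        rev_id    INTEGER PRIMARY KEY,
        page_id   INTEGER NOT NULL REFERENCES page(page_id),
        author    TEXT,
        edited_at TEXT NOT NULL             -- ISO-8601 timestamp
    );
    -- Derived trust signals, one row per page per computation run.
    CREATE TABLE trust_signal (
        page_id        INTEGER NOT NULL REFERENCES page(page_id),
        computed_at    TEXT NOT NULL,
        edit_stability REAL,
        revert_count   INTEGER
    );
    -- Index for the hot path: "latest revisions for a page".
    CREATE INDEX idx_revision_page ON revision(page_id, edited_at);
    """)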

Required Qualifications

Web Scraping Expertise

  • Proven experience scraping large, complex platforms (Wikipedia, Reddit, etc.).
  • Ability to handle multi-level link structures, edge cases, and partial-access restrictions.
  • Experience bypassing anti-scraping measures and processing various data formats.

Backend Development

  • Strong Python skills for backend scripting and automation.
  • Experience parsing, cleaning, and structuring data from APIs and scraped sources.
  • Accuracy and completeness when preparing large datasets.

Data & Infrastructure

  • Proficiency in SQL for querying, transforming, and modeling data.
  • Experience designing scalable data pipelines for collection, storage, and retrieval.
  • Experience handling large datasets with low latency and high precision.

API Development

  • Experience building B2B APIs that expose processed datasets with layered metadata (a minimal endpoint sketch follows this list).
  • Understanding of performance optimization for enterprise-level APIs.
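
A minimal endpoint sketch for the first bullet above, assuming FastAPI purely for illustration (the posting names no framework); the route, schema, and in-memory stand-in datastore are all hypothetical.

    from fastapi import FastAPI, HTTPException
    from pydantic import BaseModel

    app = FastAPI()

    class TrustSignals(BaseModel):
        # Field names are illustrative assumptions, not the real schema.
        page: str
        edit_stability: float
        revert_count: int

    # In-memory stand-in for a real datastore lookup.
    DB = {
        "Climate_change": TrustSignals(
            page="Climate_change", edit_stability=0.92, revert_count=4
        ),
    }

    # Run with: uvicorn main:app (assuming this file is main.py)
    @app.get("/v1/trust-signals/{page}", response_model=TrustSignals)
    def get_signals(page: str) -> TrustSignals:
        signals = DB.get(page)
        if signals is None:
            raise HTTPException(status_code=404, detail="page not indexed")
        return signals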

Cloud Infrastructure

  • Experience working in AWS (adaptable to other cloud platforms).
  • Understanding of cloud-native architecture and security best practices.

Nice to Have

  • Familiarity with LLMs and Prompt Engineering.
  • Experience with RPA or data collection automation.
  • Knowledge of JavaScript for integrations.
  • Understanding of distributed architecture and resilient system design.
  • Background in NLP or statistical modeling for trust scoring.

Why You’ll Love It Here

  • Build technology that directly improves the integrity of global information.
  • Work alongside some of the world’s most successful tech founders and receive direct mentorship.
  • Fully remote, high-impact work environment with startup velocity and autonomy.
  • Your engineering work will influence how AI systems worldwide determine truth.

Required Skills and Experience

  • Python: 5 years
  • API development: 5 years
  • AWS: 3 years
  • Web scraping: 3 years
  • ETL/ELT pipelines: 3 years
  • SQL: 5 years

Required Languages

  • English: B2 - Upper Intermediate