Middle/Senior Data Engineer
Project: The client is developing an AI-powered investment intelligence platform to help venture capital firms and angel investors streamline startup discovery and deal-flow sourcing. The product aggregates and structures data from multiple sources, including accelerator programs, SEC Form D filings, and company profiles, allowing users to identify relevant investment opportunities more efficiently.
The platform includes features such as natural language alerts, automated sourcing workflows, and AI-driven insights. The solution is built with Next.js, Vercel, Clerk, and Stripe. It is already live with paying customers and continues to expand its automation and agentic UI capabilities.
Cooperation: Long-term.
Stage: Existing product, early-stage / actively growing.
Position: New role.
Tech Stack: Python, Supabase, GCP Cloud Run, OpenAI API, LLM Pipelines, Web Scraping, ETL.
Timezone Requirements: Possible 50/50 overlap Kyiv/New York (EST).
Location Requirements: Remote.
English: Advanced; strong spoken English required.
Requirements:
- 4+ years of experience in data engineering
- Strong Python skills
- Experience with web scraping at scale (Playwright, Scrapy or similar)
- Hands-on experience with LLM API integration (OpenAI, Anthropic or similar)
- Experience building ETL / data pipelines
- GCP experience (Cloud Run or managed services)
- Supabase or PostgreSQL: data storage and schema design
- Ability to work independently and own projects end-to-end
- Advanced spoken English
Responsibilities:
- Build new datasets: structured scraping of accelerators, Form D filings, and company profiles
- Parse and extract content using LLMs; define schemas in Supabase
- Create eval frameworks to benchmark LLM performance across pipeline calls
- Design and improve scalable pipeline architecture on GCP
- Work independently within 2-week sprints with daily check-ins
Benefits from 8allocate:
- Team & Culture: Team events, offsites, and a culture that keeps people connected.
- Learning & Development: Budget for courses, certifications, and conferences.
- Wellbeing: Flexible support in line with company policy, with options to support your physical and mental wellbeing (sport, mental health, or medical insurance).
- Rest & Recovery: Paid vacation and sick leave.
Required skills experience
| Python | 3 years |
| ETL/ELT pipelines | 3 years |
| Web Scraping | 3 years |
| GCP (Google Cloud Platform) | 2 years |
Required languages
| English | B2 - Upper Intermediate |