Senior Data Engineer (Identity Domain)

We are looking for a Senior Data Engineer to join the Identity team at Equals5. This is not a standard ETL role. We are building a dynamic data ecosystem where AI is deeply integrated—both as a productivity multiplier and as a core component of our data processing logic for identity data enrichment and scoring.

You will own the infrastructure that handles over 10,000 executions per minute, ensuring stability, scalability, and data integrity. You will work with a modern stack on Google Cloud Platform, utilizing Cloud Functions and Kubernetes.

We are looking for an engineer who improves infrastructure, automates everything, and is eager to implement LLM-based logic directly into high-load data flows.
 

Responsibilities

  • AI-Driven Data Scoring: Design and implement pipelines that utilize LLMs to analyze and score identity data in real-time. You will integrate AI models directly into the decision-making loop, balancing accuracy with latency and cost.
  • Own the Data Architecture: Architect scalable data solutions using GCP and Python. You will manage data storage and retrieval using BigQuery and Apache Iceberg to support fast querying across terabytes of data.
  • Heavy Data Processing: Utilize Apache Spark for data transformations and batch processing when lightweight cloud functions are not enough.
  • Manage High-Load Orchestration: Maintain and optimize our orchestration instances, including complex dataflows, custom Python nodes, and performance tuning for 10,000+ executions per minute.
  • Release Lifecycle (CI/CD): Take ownership of the deployment process, ensuring that updates to pipelines and infrastructure are released safely with proper testing and rollback strategies.
  • Database Optimization: Manage PostgreSQL performance under heavy load, optimizing complex queries and indexing strategies.
  • Active AI Usage: Use Claude Code and other AI engineering tools to accelerate your own development, refactoring, and testing processes.
  • Incident Resolution: Proactively monitor the system. When alerts fire, you investigate the root cause—whether it’s a database lock or an LLM hallucination—and fix it permanently.
     

Requirements

  • 4-5+ years of experience in Data Engineering or Backend Engineering with a strong data focus.
  • Production AI Integration: Experience integrating LLMs (OpenAI, Anthropic, Gemini) into production applications via API. You understand latency, token limits, and how to structure data for AI scoring.
  • Expertise in GCP: Understanding of Google Cloud Platform (Cloud Functions, IAM, Networking).
  • Strong Python: You write clean, efficient, and testable code. You are comfortable building custom logic where standard tools fall short.
  • Big Data Stack: Experience with BigQuery, Apache Spark, and modern table formats like Apache Iceberg.
  • Kubernetes (K8s): Experience deploying and scaling services in containerized environments.
  • Workflow Automation: Deep technical understanding of workflow orchestration tools. n8n is a big part of our domain, so hands-on familiarity with it is highly valuable.
  • PostgreSQL Mastery: Proven ability to handle heavy write/read loads and optimize schemas.
  • English: B2+ (Upper-Intermediate) or higher.
     

Culture & Mindset

  • Self-improvement: You are a fast learner. You don't fear AI replacing you; you master it to replace your manual tasks.
  • Ownership: You treat the Identity domain as your own business. If a scoring model drifts or a pipeline slows down, you notice it and fix it without being asked.
  • Internal Locus of Control: You take responsibility for outcomes. If an external API fails, you build a fallback mechanism instead of just blaming the provider.
  • Get It Done: You prioritize shipping value. You know when to use a simple script and when to build a complex architecture.
  • Openness: You share knowledge freely. If you find a better way to prompt the AI for scoring, you share it with the team.
     

What We Offer

  • Fully remote with flexible hours (aligned with EU timezones for syncs).
  • AI-Native Environment: We provide licenses for Claude Code and encourage using the bleeding edge of AI tech for both daily coding and product features.
  • High-Impact Role: You will directly influence how we identify and score users, impacting the core business logic.
  • Cross-functional visibility: Work closely with Product and Tech Leads to shape the future of Identity.
  • No Bureaucracy: Fast decisions, no legacy processes, focus on results.

 

Required skills and experience

Python: 4 years
GCP (Google Cloud Platform): 4 years
K8s (Kubernetes): 2 years
PostgreSQL: 3 years
BigQuery: 3 years

Required languages

English: B2 (Upper-Intermediate)
Ukrainian: Native
Published 20 February