Senior Data Engineer (US-Based Product, Real-Time Data Platform)
About the Product
We are building a US-based, data-driven product with a strong focus on scalability, performance, and cost efficiency.
Our mission is to design a modern data platform that transforms raw behavioral and monetization data into reliable, actionable business insights — in near real-time.
For us, data engineering is not just about moving data.
It’s about:
- Designing resilient architecture
- Optimizing for performance and cost
- Building reliable automation
- Ensuring architectural integrity at scale
Role Overview
We are looking for a Senior Data Engineer who will take ownership of the data platform architecture and drive technical excellence across ingestion, modeling, and performance optimization.
This role requires deep expertise in SQL, Python, AWS infrastructure, and modern data stack principles. You will not only build pipelines — you will define standards, lead architectural decisions, and proactively improve system efficiency.
You will play a critical role in ensuring that data flows seamlessly from event streams to business-ready datasets while maintaining high performance, reliability, and cost control.
What Makes This Role Senior-Level
As a Senior Data Engineer, you will:
- Own architectural decisions for the data platform
- Identify scalability bottlenecks before they become incidents
- Optimize data infrastructure for performance and cost
- Lead technical code reviews and set engineering standards
- Mentor mid-level engineers
- Act as a technical partner to Product and Analytics stakeholders
- Strategically balance real-time and batch processing
Technical Requirements
Must-Have
Expert-Level SQL
- Complex analytical queries and window functions
- Query optimization and execution plan analysis
- Identifying and eliminating performance bottlenecks
- Reducing query complexity and compute costs
- Designing partitioning and clustering strategies (see the query sketch after this list)
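As a rough illustration of the query-level work these bullets cover, the sketch below runs a partition-pruned window-function query through the BigQuery Python client. It assumes BigQuery (listed under nice-to-have) as the warehouse; the project, dataset, table, and column names are hypothetical placeholders, not a description of any actual schema.

```python
# Minimal sketch: a windowed, partition-pruned query executed via the
# BigQuery Python client. All identifiers below are illustrative only.
import datetime
from google.cloud import bigquery

client = bigquery.Client()  # relies on default application credentials

SQL = """
SELECT
  user_id,
  event_ts,
  -- window function: rank each user's events within the scanned range
  ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY event_ts DESC) AS event_rank
FROM `my_project.analytics.events`
WHERE event_date BETWEEN @start_date AND @end_date  -- prunes partitions, cuts scanned bytes
"""

job_config = bigquery.QueryJobConfig(
    query_parameters=[
        bigquery.ScalarQueryParameter("start_date", "DATE", datetime.date(2024, 1, 1)),
        bigquery.ScalarQueryParameter("end_date", "DATE", datetime.date(2024, 1, 7)),
    ]
)

job = client.query(SQL, job_config=job_config)
job.result()
# scanned bytes are one input to query cost monitoring
print(job.total_bytes_processed)
```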
Python
- Advanced data manipulation
- Building scalable ETL/ELT frameworks
- Writing production-grade data services
- Automation and monitoring scripts
AWS Core Infrastructure
- Amazon Kinesis Data Firehose (near-real-time data streaming)
- Amazon S3 (data lake architecture and storage optimization)
- Designing reliable ingestion layers (see the ingestion sketch after this list)
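A minimal sketch of the ingestion layer described above: batching JSON events into a Kinesis Data Firehose delivery stream (which buffers and delivers to S3) via boto3. The stream name, region, and event shape are assumptions for the example; production code would add retries for failed records, schema validation, and monitoring.

```python
# Minimal ingestion sketch: push a batch of JSON events to a Firehose
# delivery stream. Stream name and event fields are hypothetical.
import json
import boto3

firehose = boto3.client("firehose", region_name="us-east-1")

def put_events(events, stream_name="behavioral-events"):
    """Send a batch of events; Firehose buffers and delivers them to S3."""
    records = [{"Data": (json.dumps(e) + "\n").encode("utf-8")} for e in events]
    # Firehose accepts up to 500 records per PutRecordBatch call
    response = firehose.put_record_batch(
        DeliveryStreamName=stream_name,
        Records=records,
    )
    # FailedPutCount > 0 means some records should be retried
    return response["FailedPutCount"]

if __name__ == "__main__":
    failed = put_events([{"event": "page_view", "user_id": "u-123"}])
    print(f"failed records: {failed}")
```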
Version Control
- Git (GitHub / GitLab)
- Branching strategies
- Leading technical code reviews
- Enforcing best practices in code quality
Nice-to-Have
Modern Data Stack
- dbt (modular SQL modeling, documentation, testing)
- Experience structuring layered data models (staging → intermediate → marts)
Data Warehousing
- Google BigQuery
- Slot management
- Cost-efficient querying
- Storage and compute optimization
Advanced Optimization Techniques
- Partitioning
- Clustering
- Bucketing
- Storage layout optimization
Integrations & Infrastructure
- Salesforce data integration
- Docker / ECS
- CI/CD for data workflows
AI / ML Exposure
- Supporting feature pipelines
- Understanding data requirements for ML systems
Key Responsibilities
Data Platform Architecture
- Design and maintain a scalable real-time and batch data platform
- Architect ingestion pipelines using AWS Kinesis and Python
- Ensure high availability and reliability of data flows
Real-Time Processing
- Enable near-real-time (seconds–minutes latency) data processing
- Build systems for operational alerting and anomaly detection
- Ensure early detection of monetization and traffic issues (see the alerting sketch below)
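A deliberately simplified sketch of the kind of threshold-based check this implies is shown below. The metric, baseline window, and drop ratio are illustrative assumptions, not the actual detection logic or alerting stack.

```python
# Minimal alerting sketch: flag a sharp drop in a monetization metric
# versus its recent per-minute baseline. Numbers are illustrative only.
from statistics import mean

def impressions_dropped(recent: list[int], current: int, drop_ratio: float = 0.5) -> bool:
    """Return True if the current minute falls below drop_ratio * recent baseline."""
    baseline = mean(recent) if recent else 0
    return baseline > 0 and current < baseline * drop_ratio

if __name__ == "__main__":
    last_minutes = [1200, 1150, 1300, 1250, 1180]  # recent impression counts
    if impressions_dropped(last_minutes, current=400):
        print("ALERT: ad impressions dropped more than 50% vs baseline")
```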
Data Modeling
- Transform raw event data into business-ready datasets using dbt
- Design scalable, maintainable schemas aligned with product evolution
Performance & Cost Engineering
- Optimize SQL queries and storage structures
- Design cost-efficient partitioning strategies
- Monitor and reduce warehouse and infrastructure costs
- Balance real-time and batch processing appropriately
Engineering Excellence
- Lead and participate in code reviews
- Enforce high standards of performance, security, and maintainability
- Improve observability and monitoring across pipelines
Cross-Functional Collaboration
- Work closely with Data Analysts and Product Managers
- Translate business requirements into scalable technical solutions
- Clearly communicate trade-offs between speed, cost, and complexity
Types of Data We Process
- User behavior events (page views, clicks, searches, conversions)
- Ad & monetization events (impressions, clicks, CTR, attribution)
- System and integration logs (latency, errors, rate limits)
Why Real-Time Is Critical
- Detect broken ads or impression drops before revenue is lost
- Identify traffic anomalies or abuse early
- Enable same-day operational intervention
- Prevent negative user and advertiser experience
Near-real-time (seconds to minutes latency) is required for operational awareness.
Batch processing remains important for historical analysis and reporting — but not for incident detection.
Working Schedule
- Monday – Friday
- 16:00 – 00:00 Kyiv time
- Full alignment with US-based stakeholders
What We Value
- Strong ownership mindset
- Strategic thinking about architecture
- Focus on scalability, reliability, and cost efficiency
- Proactive problem-solving
- Clear communication with both technical and non-technical teams
- Ability to think beyond “just making it work”
Required Languages
| Language | Level |
| --- | --- |
| English | B2 (Upper-Intermediate) |