Senior Data Engineer


A leading organization in the media streaming industry focuses on delivering innovative solutions that enhance user engagement and content accessibility. The company caters to a diverse audience, including individual consumers and content creators, providing them with advanced tools for content production and distribution. With a strong emphasis on technology, the organization leverages cutting-edge AI and machine learning capabilities to optimize user experiences and operational efficiency. Their commitment to continuous improvement and adaptation positions them as a significant player in the evolving media landscape.

We are looking for an experienced Data Engineer to participate in the design and architecture of the client's data platform and GenAI initiatives. The primary focus of this role is building scalable distributed data processing systems that will serve as the foundation for the company's intelligent services.



Job Description

Required:

  • Bachelor’s/Master’s degree in Computer Science, Computer Engineering, Machine Learning, or a related field.
  • 4+ years of experience in software development, focusing on Python programming and data engineering.
  • Strong expertise in Data Lake architecture and modern data warehousing.
  • Hands-on experience with distributed computing systems (Apache Spark is a must).
  • Strong Python proficiency for building complex pipelines and internal tooling.
  • Good understanding of distributed systems principles, real-time data processing, and batch processing.
  • Experience in scaling data infrastructure (AWS/GCP cloud solutions).
  • Proven experience in designing and scaling vector databases (e.g., Pinecone, Milvus, Weaviate, pgvector), and in developing recommendation systems and retrieval-augmented generation (RAG) pipelines.
  • Ability to independently drive technical discussions, make informed architectural assumptions, and justify technology choices.
  • Ability to decompose complex "from scratch" tasks without detailed specifications.

Nice to have:

  • Intermediate knowledge of Airflow for orchestrating data processing workflows.
  • Generative AI: Experience with vector databases, RAG architectures, or integrating LLMs into data pipelines.
  • Experience designing and building middleware platforms, REST APIs, and distributed systems at scale.
  • Web Development: Basic API development skills to provide data access to other services.

Job Responsibilities

  • Data Platform: Designing and maintaining high-load Spark pipelines for processing terabytes of data.
  • GenAI Support: Building data infrastructure for training and operating generative models (ingestion, cleaning, vectorization).
  • Architectural Contribution: Active participation in technical brainstorming, developing data quality standards, and processing protocols.
  • Optimization: Identifying and resolving bottlenecks in current distributed processing systems.

Required languages

English B1 - Intermediate
Published 7 April