AI Data Engineer

Meet YozmaTech

YozmaTech isn’t just another tech company – we’re a global team of go-getters, innovators, and A-players helping startups and product companies scale smarter and faster.
We build dedicated development teams across 10+ countries, creating strong, long-term partnerships based on trust, transparency, and real impact.
Here, every idea counts. We value people who are proactive, open-minded, and ready to grow. If you’re passionate about building meaningful products and want to join a team that feels like family – you’ll feel right at home with us.

Our client is looking for an AI Data Engineer to help build, maintain, and improve internal and client-facing LLM-powered systems. This role sits at the intersection of data engineering, retrieval infrastructure, and production AI operations, with a strong focus on reliability, retrieval quality, scalability, and operational excellence.

The client is a technology company that helps businesses move faster with AI by designing and delivering practical, production-grade systems across workflows, operations, and knowledge-heavy use cases. The team works closely with clients to turn AI from experimentation into reliable, scalable execution.

You will work on the pipelines, indexing systems, evaluation frameworks, and production infrastructure that power AI assistants using hosted LLM APIs and internal knowledge sources. This is a hands-on role for someone who can think across ingestion, search, observability, and system performance.

Key Requirements:

🔹 Strong programming skills, especially in Python;
🔹 Experience building ETL and data pipelines in production environments;
🔹 Strong SQL skills and experience with relational databases, preferably PostgreSQL;
🔹 Experience with search and retrieval systems, including OpenSearch, Elasticsearch, or similar platforms;
🔹 Familiarity with vector databases, embedding workflows, and large-scale document indexing;
🔹 Experience with cloud platforms such as AWS and related infrastructure services;
🔹 Familiarity with Git, CI/CD pipelines, and modern engineering workflows;
🔹 Strong problem-solving skills and comfort working across data, infrastructure, and AI application layers;
🔹 English – Upper-Intermediate or higher.

Will be a plus:

🔹 Experience working on RAG systems, internal knowledge assistants, or search-heavy AI applications;
🔹 Familiarity with observability stacks, distributed systems, and workflow orchestration tools;
🔹 Experience with access control, permission-aware systems, and auditability in enterprise environments;
🔹 Exposure to evaluation frameworks for LLM systems and model benchmarking.

What you will do:

🔹 Work with vector databases;
🔹 Maintain and improve ingestion and enrichment pipelines for internal and client content, including parsing, extraction, normalization, metadata enrichment, deduplication, and quality monitoring;
🔹 Improve indexing and retrieval quality through chunking and segmentation refinements, embedding and index update workflows, metadata filtering, and caching;
🔹 Support hybrid retrieval architectures combining vector search, keyword or BM25 search, and metadata-aware filtering;
🔹 Implement and maintain access-aware retrieval by propagating and enforcing document permissions at indexing and query time, including audit logs and validation tests;
🔹 Improve source attribution so responses consistently point to the correct documents, sections, and references in a reliable format;
🔹 Extend and harden tool execution, workflow orchestration, and automations, including retries, timeouts, idempotency, concurrency controls, and run history;
🔹 Develop and maintain evaluation and regression testing frameworks, including golden datasets, automated scoring, and structured comparisons across LLM providers and models;
🔹 Operate AI systems in production, including logs, metrics, tracing, alerting, incident response, performance tuning, cost monitoring, and runbook documentation;
🔹 Build scalable infrastructure to process, embed, index, and search very large document collections efficiently.

Interview stages:

🔹 HR interview;
🔹 30-min screening call;
🔹 Technical interview;
🔹 Test assignment/homework;
🔹 Reference check;
🔹 Offer.

Why Join Us?

At YozmaTech, we’re self-starters who grow together. Every day, we tackle real challenges for real products – and have fun doing it. We work globally, think entrepreneurially, and support each other like family. We invest in your growth and care about your voice. With us, you’ll always know what you’re working on and why it matters.
From day one, you’ll get:
🔹 Direct access to clients and meaningful products;
🔹 Flexibility to work remotely or from our offices;
🔹 A-team colleagues and a zero-bureaucracy culture;
🔹 Opportunities to grow, lead, and make your mark.

After you apply

We’ll keep it respectful, clear, and personal from start to offer.
You’ll always know what project you’re joining – and how you can grow with us.

Everyone’s welcome

Diversity makes us better. We create a space where you can thrive as you are.

Ready to build something meaningful?

Let’s talk. Your next big adventure might just start here.

Required skills and experience

Data Engineering: 5 years
Python: 3 years

Required languages

English: B2 - Upper Intermediate

Published 27 April

5+ years of experience required
Full Remote
Worldwide
Employment: Full-time
Domain: Other
Outstaff
Test task required
