Python Data Engineer (Databricks, AWS, LLM)

This position is very urgent! I would be happy to have a call with you today and share all the details.

Location: Remote
Employment Type: Full-time
Contract Duration: 6 months (with possibility of extension)

About Us

We are a SaaS company that collects and analyzes large-scale web data to generate actionable consumer insights for global brands. Our platform powers dashboards and data deliveries built on a classified catalog of products, reviews, and social content (posts, videos, and more).

We work with high-volume, complex datasets and modern data technologies. Collaboration, continuous learning, and a strong problem-solving mindset are core to our culture.

Role Overview

We are looking for a Senior Data Engineer to design, build, and optimize scalable, production-grade data pipelines using AWS and Databricks.

As a key member of the Data Team, you will be responsible for ensuring data reliability, integrity, and performance. You will also contribute to ML, MLflow, and LLM-based workflows, delivering high-quality, client-ready data solutions on time.

You will collaborate closely with R&D, Product, and Delivery teams to validate features, resolve issues, and ensure smooth, reliable delivery of insights to clients.

Responsibilities

  • Design, build, and maintain scalable data pipelines using PySpark, Python, and AWS services
  • Deliver production-ready data outputs with high accuracy, reliability, and timeliness
  • Optimize data processing performance and troubleshoot complex pipeline issues
  • Develop and enhance automated testing and data quality frameworks
  • Integrate ML, MLflow, and LLM-based workflows into existing pipelines
  • Collaborate with Product Managers and Delivery Analysts to define release readiness and client-facing quality standards
  • Promote best practices in data engineering, QA, documentation, and maintainable code

Requirements

  • 5+ years of experience as a Data Engineer in production environments
  • Strong expertise in PySpark, Python, and SQL
  • Hands-on experience with AWS (data services and cloud infrastructure)
  • Experience working with Databricks / DBX framework
  • Solid background in automated testing and QA for data pipelines
  • Proven problem-solving and debugging skills, including pipeline optimization
  • Excellent English communication skills and ability to work cross-functionally in distributed teams

Nice to Have / Advantages

  • Experience with big data architectures and data lakes
  • Familiarity with CI/CD pipelines and DevOps practices
  • Experience with MLflow and exposure to LLM-based solutions
  • Knowledge of data governance, monitoring, and observability frameworks

Why Join Us

  • Work on high-impact data challenges that directly influence client outcomes
  • Be part of a collaborative team building data and AI-powered solutions
  • Fully remote work environment with flexible setup
  • Competitive compensation and opportunity for contract extension

Required languages

English B2 - Upper Intermediate
Skills: AWS, PySpark, Databricks, MLflow
Published 16 December 2025