Senior Data Engineer

We are looking for a Senior Data Engineer to help build a next-generation data and AI platform. The platform combines real-time data ingestion, multi-database storage (relational, analytical, graph, and vector), and AI-driven analytics. You will design scalable pipelines, ensure data quality, manage governance, and optimize data flows for analytics and ML/AI workloads.

Key Responsibilities:

    1. Data Ingestion & Integration

  • Build and maintain scalable data ingestion pipelines (batch and streaming) from enterprise systems (ERP, CRM, WMS, IoT).
  • Apply transformations, masking, and validations for regulatory compliance (e.g., HIPAA, GDPR).

    2. ETL/ELT & Data Processing

  • Develop ETL/ELT workflows using tools like Airflow or Spark.
  • Work with ML/AI teams to structure data for analytics, simulations, and LLM-powered use cases.

    3. Multi-Database Storage

  • Design and optimize data storage across:
    • Relational (e.g., PostgreSQL)
    • Analytical (e.g., Snowflake, BigQuery)
    • Graph (e.g., Neo4j)
    • Vector (e.g., Pinecone, Milvus)
  • Align storage design with specific workloads for maximum efficiency.

    4. Governance & Data Quality

  • Implement data quality checks, data lineage, metadata management, and master data management (MDM) practices.
  • Apply secure data handling and role-based access control.

    5. Performance & Scalability

  • Monitor and optimize pipelines for latency, throughput, and reliability.
  • Apply best practices in distributed and streaming data processing (e.g., Kafka, Spark, Flink).

    6. Collaboration & Documentation

  • Work with DevOps, Data Science, and AI/ML teams to align pipelines with product needs.
  • Maintain clear documentation of data flows and governance policies.

Qualifications:

  • 5+ years of experience in data engineering, ETL/ELT pipeline development, data lakes, or streaming systems.
  • Strong experience with Python (including FastAPI) and SQL.
  • Experience with cloud platforms, primarily GCP, with some AWS exposure.
  • Hands-on experience with tools such as Kafka, Airflow, and Spark.
  • Familiarity with relational, analytical, NoSQL, graph, and vector databases.
  • Understanding of metadata, lineage, MDM, and data compliance (GDPR, HIPAA).
  • Strong grasp of data security, encryption, and access control.
  • Excellent communication and cross-functional collaboration skills.

Why This Role:

Impact: Contribute to building a core data foundation for a powerful AI platform.

Growth: Gain experience across streaming, multi-DB architectures, and AI-focused data use cases.

Collaboration: Work alongside data scientists, engineers, and AI researchers to create automated, intelligent solutions.

We offer:

  • Competitive compensation based on experience and skills.
  • Flexible working hours and remote work environment.
  • Opportunities for professional growth and development.
  • Collaborative and innovative team culture.
  • Participation in exciting and challenging projects.

This position offers the opportunity to shape a robust data ecosystem for a cutting-edge AI platform, ensuring the quality and reliability of data that underpins analytics, process intelligence, and decision automation.
