Junior/Middle Data Engineer (IRC274101)

Job Description

  • Strong experience in data pipeline development and ETL/ELT processes.
  • Proficiency with Apache Airflow for workflow orchestration.
  • Hands-on experience with object storage solutions, preferably MinIO.
  • Expertise in SQL and database management, specifically PostgreSQL.
  • Experience with graph databases like Neo4j.
  • Familiarity with vector databases such as Qdrant.
  • Ability to work with large, diverse datasets and ensure data integrity.
  • Solid expertise in SQL and relational databases.
  • Experience in database design and optimization.
  • Experience with NoSQL databases (MongoDB, Cosmos DB, etc.) for handling unstructured and semi-structured data.
  • Contribution to release management following CI/CD best practices.

Job Responsibilities

  • Design, develop, and maintain robust and scalable data pipelines for ingesting, transforming, and loading diverse datasets.
  • Implement ETL/ELT processes to cleanse, validate, and enrich raw data into query-optimized formats.
  • Orchestrate data workflows using Apache Airflow, including scheduling jobs and managing dependencies.
  • Manage and optimize data storage solutions in MinIO (object storage) and PostgreSQL (relational data).
  • Ensure data integrity, quality, and compliance throughout the data lifecycle.
  • Collaborate with cross-functional teams to understand data requirements and deliver data solutions that enable advanced analytics and AI/ML initiatives.
  • Troubleshoot and resolve data-related issues, ensuring high availability and performance of data systems.

Department/Project Description

Our client is developing a robust and versatile data ingestion pipeline and associated schema designed to efficiently and accurately collect, process, analyze, and manage diverse data types from various sources in real time or near real time. This pipeline will automate and enhance data workflows, ensure data quality, and support advanced analytical capabilities including NLP, face recognition, and OCR.

As a Middle Data Engineer on the project, you will play a crucial role in managing deployment, infrastructure, automation, and monitoring. You will be instrumental in setting up and maintaining CI/CD pipelines, managing cloud resources, ensuring system stability and performance, and implementing robust logging and alerting mechanisms for the client platform. If you seek a challenge and want to impact the way the world distributes products from manufacturers to store shelves, we invite you to join our team.

Published 27 August