Data Engineer
Job Description
Total of 7+ years of development/design experience, with a minimum of 5 years of experience in Big Data technologies on-prem or on cloud.
Experience architecting, building, implementing, and managing Big Data platforms on cloud, covering ingestion (batch and real-time), processing (batch and real-time), polyglot storage, data analytics, and data access
Good understanding of Data Governance, Data Security, Data Compliance, Data Quality, Metadata Management, Master Data Management, and Data Catalogs
Proven understanding and demonstrable implementation experience with big data platform technologies on cloud (AWS and Azure), including surrounding services such as IAM, SSO, cluster monitoring, log analytics, etc.
Experience working with Enterprise Data Warehouse technologies, multi-dimensional data modeling, data architectures, or other work related to the construction of enterprise data assets
Strong experience implementing ETL/ELT processes and building data pipelines, including workflow management, job scheduling, and monitoring
Experience building stream-processing systems using solutions such as Apache Spark, Databricks, Kafka, etc.
Experience with Spark/Databricks technology is a must
Experience with Big Data querying tools
Solid skills in Python
Strong experience with data modeling and schema design
Strong SQL programming background
Excellent interpersonal and teamwork skills
Experience driving solution/enterprise-level architecture and collaborating with other tech leads
Strong problem-solving, troubleshooting, and analysis skills
Experience working in a geographically distributed team
Experience leading and mentoring other team members
Good knowledge of Agile Scrum
Good communication skills
Job Responsibilities
Work directly with client teams to understand their requirements and rapidly prototype data and analytics solutions based on business needs
Design, implement, and manage large-scale data platforms/applications, including ingestion, processing, storage, data access, data governance capabilities, and related infrastructure
Support the design and development of solutions for deploying data analytics notebooks, tools, dashboards, and reports to various stakeholders
Communicate with Product/DevOps/Development/QA teams
Architect data pipelines and ETL/ELT processes to connect with various data sources
Design and maintain enterprise data warehouse models
Take part in the performance optimization processes
Guide research activities (PoCs) when necessary
Manage cloud-based data and analytics platforms
Establish best practices for CI/CD in a Big Data context
Required skills and experience
| Skill | Experience |
| --- | --- |
| PySpark | 5 years |
| Databricks | 5 years |
Required languages
| Language | Level |
| --- | --- |
| English | B2 - Upper Intermediate |