Senior/Regular Data Engineer (Python, Spark, Hadoop)
Responsibilities
• Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and Azure 'big data' technologies;
• Implement data flows connecting operational systems, BI systems, and the big data platform;
• Build real-time, reliable, scalable, high-performance, distributed, fault-tolerant systems;
• Clean and transform data into a usable state for analytics; build and maintain the data dictionary;
• Create data tools that help analytics and data science team members in their ML work;
• Design and develop code, scripts, and data pipelines that leverage structured and unstructured data (an illustrative sketch follows this list);
• Implement measures to address data privacy, security, and compliance.
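As a purely illustrative sketch (not part of the job ad itself), the following minimal PySpark job shows the kind of batch extract-transform-load pipeline the responsibilities above describe. The storage paths, container names, and column names are hypothetical placeholders, not details from the role.

```python
# Illustrative only: a minimal PySpark batch ETL sketch.
# All paths, container names, and columns below are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("example-etl").getOrCreate()

# Extract: read raw events from a (hypothetical) data-lake location.
raw = spark.read.json("abfss://raw@examplelake.dfs.core.windows.net/events/")

# Transform: basic cleaning into an analytics-ready shape.
clean = (
    raw.dropDuplicates(["event_id"])
       .filter(F.col("event_id").isNotNull())
       .withColumn("event_date", F.to_date("event_timestamp"))
       .select("event_id", "user_id", "event_type", "event_date")
)

# Load: write a partitioned dataset for downstream BI and ML consumers.
(clean.write
      .mode("overwrite")
      .partitionBy("event_date")
      .parquet("abfss://curated@examplelake.dfs.core.windows.net/events/"))

spark.stop()
```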
Skills
Must have
HiveQL, Scala, Java, Apache HBase, Python, Kafka Streams, Big Data, Apache Kafka, Hadoop
• Experience designing data and analytics architectures in the Microsoft Azure cloud;
• Experience with Big Data technologies such as Spark, Hadoop, Hive, HBase, and Kafka (a streaming sketch follows this list);
• Fluency in several programming languages such as Python, Scala, and Java, with the ability to pick up new languages and technologies quickly;
• Experience with data warehousing, data ingestion, and data profiling;
• Demonstrated teamwork, strong communication skills, and a collaborative approach to complex engineering projects.
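As another hedged, illustrative sketch of the kind of Kafka-plus-Spark integration this stack implies, the snippet below reads a stream of JSON events from a hypothetical Kafka topic with Spark Structured Streaming and writes it to storage with checkpointing. Broker addresses, the topic name, schema, and paths are all assumptions for illustration.

```python
# Illustrative only: Spark Structured Streaming reading from Kafka.
# Requires the spark-sql-kafka connector package on the classpath.
# Broker, topic, schema, and paths are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("example-stream").getOrCreate()

schema = StructType([
    StructField("event_id", StringType()),
    StructField("user_id", StringType()),
    StructField("event_timestamp", TimestampType()),
])

# Read JSON events from a (hypothetical) Kafka topic and parse them.
events = (
    spark.readStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("subscribe", "events")
         .load()
         .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
         .select("e.*")
)

# Write the parsed stream to storage; checkpointing gives fault tolerance.
query = (
    events.writeStream
          .format("parquet")
          .option("path", "/data/curated/events")
          .option("checkpointLocation", "/data/checkpoints/events")
          .start()
)
query.awaitTermination()
```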
Nice to have
BS in Computer Science or a related STEM field.