Lead Big Data Engineer
- 5+ years of hands-on experience in the big data field: Hadoop, Spark, Spark Streaming, MapReduce.
- 3+ years of experience with relevant cloud data services on AWS, Azure, or GCP, such as EMR, Glue, S3, Lambda, Fargate, DynamoDB; Azure Data Factory (ADF), Azure Functions, Azure Blob Storage; Dataproc, Dataflow, BigQuery.
- Knowledge of Databricks will be a significant benefit.
- Excellent knowledge of and hands-on experience with SQL in the context of big data: Spark SQL, HiveQL.
- Ability to understand and optimize Spark execution plans via the Spark UI.
- Excellent knowledge of Python, Scala (including functional libraries such as Cats), or Java SE.
- Batch processing and ETL principles in data warehouses.
- Data completeness signals and orchestration
- Approaches for historical reprocessing and data correction
- Handling bad and late-arriving data in inputs and outputs.
- Schema migrations and dataset evolution.
- Strong spoken English (B2+ level).
- Ability to quickly learn the set of tools and technologies used internally: Radar, platform services, telemetry providers, Spark-as-a-Service, the build system, and more.
Will be a plus
- Understanding of functional programming ideas and principles.
- Experience in building and using web services.
- Experience with any of Teradata, Vertica, Oracle, or Tableau.
- Experience with Spark Streaming and Kafka.
- Experience with or knowledge of Apache Iceberg, Trino (Presto), Druid, Cassandra, or blob storage such as AWS S3.
- Understanding of or experience with Azkaban or Airflow.