Big Data Developer

Responsibilities

• Develop highly scalable real-time and batch data applications leveraging machine learning (using Amazon SageMaker), regression, and rule-based models.

• Develop real-time ETL pipelines that ingest massive amounts of data from multiple sources to build a data lake (see the sketch after this list)

• Build an end-to-end CI/CD pipeline that automates testing and deployment

• Tune and optimize data applications to meet SLAs

• Promote coding best practices, code-complete standards, OO programming, and design patterns

• Champion the overall strategy for data governance, security, and quality to ensure requirements are met.

 
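To give a rough sense of the kind of pipeline this role involves, here is a minimal Spark Structured Streaming sketch in Scala that reads events from Kafka and lands them as Parquet in an S3-backed data lake. The topic name, bucket, and checkpoint paths are illustrative placeholders, not details from this posting.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Minimal sketch of a streaming ETL job: Kafka -> transform -> Parquet on S3.
// Topic name, S3 paths, and broker address are hypothetical placeholders.
object ClickstreamIngest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("clickstream-ingest")
      .getOrCreate()

    // Read the raw event stream from Kafka.
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092") // placeholder broker
      .option("subscribe", "clickstream")                // hypothetical topic
      .load()

    // Kafka delivers key/value as binary; keep the payload plus ingestion time.
    val events = raw.select(
      col("value").cast("string").as("payload"),
      col("timestamp").as("ingested_at")
    )

    // Write micro-batches as Parquet into the data lake, partitioned by date.
    val query = events
      .withColumn("dt", to_date(col("ingested_at")))
      .writeStream
      .format("parquet")
      .option("path", "s3a://example-datalake/clickstream/")                     // placeholder bucket
      .option("checkpointLocation", "s3a://example-datalake/_checkpoints/clicks/") // placeholder path
      .partitionBy("dt")
      .outputMode("append")
      .start()

    query.awaitTermination()
  }
}
```

In practice the payload would be parsed from Avro or JSON into a typed schema before writing, but the read-transform-sink shape above is the core of the pipeline described here.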

Required Skills

• 3+ years of experience coding advanced real-time ETL processes

• Proficiency in Java/Scala with 3+ years of experience

• Strong understanding of the Big Data landscape (Cloudera Hadoop ecosystem, Spark 2.x, Structured Streaming) and related technologies, with 2+ years of experience

• Strong DevOps scripting skills and experience with relevant tools (Jenkins, Chef, Docker/Vagrant, Kubernetes)

• Relevant AWS tools: S3, SageMaker, RDS, EC2, EMR, DynamoDB, ElastiCache, Glue, Lambda, Athena

• Other tools and formats: Kafka, Spark, Hue, ZooKeeper, Avro, Parquet, Graphite, Grafana, Zabbix, ELK, Jenkins, Rundeck, Spring Boot, Kubernetes, Docker, Aerospike

• Deep understanding of Computer Science fundamentals: object-oriented design using best practices and design patterns, data structures, systems and application design, and multithreaded programming

• Experience developing large-scale distributed systems

• Passion for data engineering automation and efficiency; a tech-savvy, independent self-starter who loves challenges

• BSc in computer science or equivalent
