Big Data Developer
Responsibilities
• Develop highly scalable real-time and batch data applications leveraging machine learning (using Amazon SageMaker), regression models, and rule-based models.
• Develop real-time ETL pipelines that ingest massive amounts of data from multiple sources into a data lake (see the sketch after this list)
• Build an end-to-end CI/CD pipeline that automates testing and deployment
• Tune and optimize data applications so they remain balanced and meet their SLAs
• Promote coding best practices, code-complete standards, OO programming, and design patterns
• Champion the overall strategy for data governance, security, and quality that ensures requirements are met.
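For context, a real-time ETL pipeline of the kind described in the second bullet might look like the minimal Spark Structured Streaming sketch below; the Kafka broker, topic name, event schema, and S3 paths are hypothetical placeholders, not details from this posting.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

object EventIngestJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("realtime-etl-sketch")
      .getOrCreate()

    // Hypothetical event schema; a real job would derive this from the source contract.
    val eventSchema = new StructType()
      .add("eventId", StringType)
      .add("userId", StringType)
      .add("eventTime", TimestampType)
      .add("payload", StringType)

    // Read a stream of JSON events from Kafka (broker and topic are placeholders).
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "events")
      .load()
      .select(from_json(col("value").cast("string"), eventSchema).as("e"))
      .select("e.*")

    // Write the parsed events to an S3 data lake as Parquet, partitioned by day.
    events
      .withColumn("dt", to_date(col("eventTime")))
      .writeStream
      .format("parquet")
      .option("path", "s3a://example-datalake/events/")
      .option("checkpointLocation", "s3a://example-datalake/checkpoints/events/")
      .partitionBy("dt")
      .start()
      .awaitTermination()
  }
}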
Required Skills
• 3+ years of experience coding advanced real-time ETL processes
• 3+ years of experience and proficiency in Java/Scala
• 2+ years of experience and a strong understanding of the Big Data landscape (Cloudera Hadoop ecosystem, Spark 2.x, Structured Streaming) and related technologies
• Strong DevOps scripting skills and relevant tools (Jenkins, Chef, Docker/Vagrant, Kubernetes)
• Relevant AWS services: S3, SageMaker, RDS, EC2, EMR, DynamoDB, ElastiCache, Glue, Lambda, Athena
• Other tools and formats: Kafka, Spark, Hue, ZooKeeper, Avro, Parquet, Graphite, Grafana, Zabbix, ELK, Jenkins, Rundeck, Spring Boot, Kubernetes, Docker, Aerospike
• Deep understanding of Computer Science fundamentals: object-oriented design using best practices and design patterns, data structures, systems and application programming, and multithreading
• Experience developing large-scale distributed systems
• Passion for data engineering, automation, and efficiency; a self-starter who is independent, tech-savvy, and loves challenges
• BSc in Computer Science or equivalent