Lead Data Engineer
Our customer is the leading school transportation provider in North America and owns more than half of all yellow school buses in the United States. Every day, the company completes 5 million student journeys, moving more passengers than all U.S. airlines combined, and delivers reliable, quality services to 1,100 school districts.
N-iX has built a successful cooperation with the client, delivering a range of complex initiatives. As a result, N-iX has been selected as a strategic long-term partner to drive digital transformation at the enterprise level, fully remodeling the technology landscape for 55,000 employees and millions of people across North America.
Responsibilities:
- Design complex ETL processes for various data sources in the data warehouse
- Build new and maintain existing data pipelines using Python to improve efficiency and latency
- Improve data quality through anomaly detection by building and working with internal tools to measure data and automatically detect changes
- Identify, design, and implement internal process improvements, including re-designing infrastructure for greater scalability, optimizing data delivery, and automating manual processes
- Perform data modeling and improve our existing data models for analytics
- Collaborate with SMEs, architects, analysts, and others to build solutions that integrate data from many of our enterprise data sources
- Partner with stakeholders, including data, design, product, and executive teams, and assist them with data-related technical issues
Requirements:
- Proficiency in Python (7+ years)
- 3-5 years of commercial experience in building and maintaining a Data Lake
- Experience leading a Data Lake team of 3-5 Engineers (2 years)
- Good knowledge of AWS cloud services, including the Glue framework, on integration-type projects (2 years)
- Experience maintaining Apache Kafka
- Solid expertise in data processing tools, including Redis, Apache Spark, Apache Iceberg, and Athena
- Knowledge of job scheduling and orchestration using Airflow
- Experience in event streaming
- Well-versed in the optimization of ETL processes
- Experience developing high-load backend services in Python
- Good understanding of algorithms and data structures
- Excellent communication skills, both written and verbal
Nice to have:
- Experience in schema and dimensional data design
- Collaboration within a scaled team, using Agile methodology
- Decent knowledge of CI/CD (Docker, CloudFormation, Git)
We offer*:
- Flexible working format - remote, office-based or flexible
- A competitive salary and good compensation package
- Personalized career growth
- Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
- Active tech communities with regular knowledge sharing
- Education reimbursement
- Memorable anniversary presents
- Corporate events and team-building activities
- Other location-specific benefits
Required skills experience
Python | 7 years |
Delta Lake | 3 years |
Lead | 2 years |
Azure Cloud Services | 2 years |
Required languages
English | B2 - Upper Intermediate |