Data Platform Engineer
WHO WE ARE
At Bennett Data Science, we’ve spent more than ten years pioneering the use of predictive analytics and data science for some of the biggest brands and retailers. We’re at the top of our field because we focus on actionable technology that helps people around the world. Our deep experience and product-first attitude set us apart from other groups, and that's why people who work with us tend to stay long term.
WHY YOU SHOULD WORK WITH US
You'll work on an important problem that improves the lives of a lot of people. You'll be at the cutting edge of innovation, working on fascinating problems that support real products with real data. Your perks include expert mentorship from senior staff, competitive compensation, paid leave, a flexible work schedule, and the ability to travel internationally.
Essential Requirements for Data Platform Engineer:
- Architecture & Improvement: Continuously review the current architecture and implement incremental improvements, facilitating a gradual transition of production operations from Data Science to Engineering.
- AWS Service Ownership: Own the full lifecycle (development, deployment, support, and monitoring) of client-facing AWS services (including SageMaker endpoints, Lambdas, and OpenSearch). Maintain high uptime and adherence to Service Level Agreements (SLAs).
- ETL Operations Management: Manage all ETL processes, including the operation and maintenance of Step Functions and Batch jobs (scheduling, scaling, retry/timeout logic, failure handling, logging, and metrics); see the monitoring sketch after this list.
- Redshift Operations & Maintenance: Oversee all Redshift operations, focusing on performance optimization, access control, backup/restore readiness, cost management, and general housekeeping.
- Performance Optimization: Once core monitoring and pipelines are stabilized, collaborate with the Data Science team on targeted code optimizations to enhance reliability, reduce latency, and lower operational costs.
- Security & Compliance: Implement and manage the vulnerability monitoring and remediation workflow (Snyk).
- CI/CD Implementation: Establish and maintain robust Continuous Integration/Continuous Deployment (CI/CD) systems.
- Infrastructure as Code (Optional): Utilize IaC principles where necessary to ensure repeatable and streamlined release processes.
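To give a concrete sense of the ETL failure-handling and monitoring automation described above, here is a minimal Python (boto3) sketch that lists recent failed Step Functions executions and publishes the count as a CloudWatch metric. The state machine ARN and metric namespace are placeholders for illustration, not values from this posting.

```python
"""Minimal sketch: report recent failed Step Functions executions and
publish the count to CloudWatch. All names below are hypothetical."""
from datetime import datetime, timedelta, timezone

import boto3

STATE_MACHINE_ARN = "arn:aws:states:us-east-1:123456789012:stateMachine:etl-pipeline"  # placeholder
METRIC_NAMESPACE = "DataPlatform/ETL"  # placeholder
LOOKBACK = timedelta(hours=24)


def report_recent_failures() -> int:
    """Count failed executions in the lookback window and push the count to CloudWatch."""
    sfn = boto3.client("stepfunctions")
    cloudwatch = boto3.client("cloudwatch")
    cutoff = datetime.now(timezone.utc) - LOOKBACK

    # Fetch the most recent FAILED executions for this state machine.
    resp = sfn.list_executions(
        stateMachineArn=STATE_MACHINE_ARN,
        statusFilter="FAILED",
        maxResults=100,
    )
    recent = [e for e in resp["executions"] if e["startDate"] >= cutoff]

    for execution in recent:
        print(f"FAILED {execution['name']} started {execution['startDate'].isoformat()}")

    # Publish a simple count metric that an alarm can watch.
    cloudwatch.put_metric_data(
        Namespace=METRIC_NAMESPACE,
        MetricData=[{"MetricName": "FailedExecutions", "Value": len(recent), "Unit": "Count"}],
    )
    return len(recent)


if __name__ == "__main__":
    report_recent_failures()
```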
Mandatory Hard Skills:
- AWS Core Services: Proven experience with production fundamentals (IAM, CloudWatch, and VPC networking concepts).
- AWS Deployment: Proficiency in deploying and operating AWS SageMaker and Lambda services (see the endpoint probe sketch after this list).
- ETL Orchestration: Expertise in using AWS Step Functions and Batch for ETL and job orchestration.
- Programming & Debugging: Strong command of Python for automation and troubleshooting.
- Containerization: Competence with Docker/containers (build, run, debug).
- Version Control & CI/CD: Experience with CI/CD practices and Git (GitHub Actions preferred).
- Data Platform Tools: Experience with Databricks, or a demonstrated aptitude and willingness to learn it quickly.
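Operating client-facing SageMaker endpoints typically involves routine health and latency checks. The sketch below shows one possible probe in Python (boto3); the endpoint name and request payload are hypothetical and purely illustrative.

```python
"""Minimal sketch: smoke-test a SageMaker endpoint and log round-trip latency.
The endpoint name and payload are placeholders, not part of an actual stack."""
import json
import time

import boto3

ENDPOINT_NAME = "recommendation-endpoint"  # placeholder
SAMPLE_PAYLOAD = {"user_id": "demo", "num_items": 5}  # hypothetical request shape


def smoke_test() -> None:
    sagemaker = boto3.client("sagemaker")
    runtime = boto3.client("sagemaker-runtime")

    # Confirm the endpoint is InService before probing it.
    status = sagemaker.describe_endpoint(EndpointName=ENDPOINT_NAME)["EndpointStatus"]
    print(f"{ENDPOINT_NAME} status: {status}")

    # Time a single invocation as a crude latency check.
    start = time.perf_counter()
    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=json.dumps(SAMPLE_PAYLOAD),
    )
    latency_ms = (time.perf_counter() - start) * 1000
    body = response["Body"].read()
    print(f"latency: {latency_ms:.1f} ms, response bytes: {len(body)}")


if __name__ == "__main__":
    smoke_test()
```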
Essential Soft Skills:
- Accountability: Demonstrate complete autonomy and ownership over all assigned systems ("you run it, you fix it, you improve it").
- Communication: Fluent in English; capable of clear, direct communication, especially during incidents.
- Prioritization: A focus on delivering a minimally supportable, deployable solution to meet deadlines, followed by optimization and cleanup.
- Incident Management: Maintain composure under pressure and possess strong debugging and incident handling abilities.
- Collaboration: Work effectively with the Data Science team while communicating technical trade-offs clearly and maintaining momentum.
Required Languages:
- English: B1 (Intermediate)