Senior Data Engineer (AI)
📍 Location: Remote / Europe
📌 Client: Subsidiary of one of the "Big Four" accounting organizations
🏢 Industry: Professional Services (Audit, Tax, Consulting, Risk & Advisory)
About the Client
You will join the sixth-largest privately owned organization in the United States, a member of the Big Four and the largest professional services network in the world by revenue and headcount. With more than 263,900 professionals globally, the company provides audit, tax, consulting, enterprise risk, and financial advisory services worldwide.
About the Project
As a Senior Data Engineer (AI), you will join a cross-functional development team building GenAI solutions that drive digital transformation across enterprise products.
The team is responsible for the design, development, and deployment of innovative enterprise technologies, tools, and standardized processes to support the delivery of tax services. It is a dynamic environment bringing together professionals from tax, technology, change management, and project management backgrounds.
The work involves consulting and execution across initiatives including:
- process and tool development
- implementation of AI-driven solutions
- training development
- engagement management
- tool design & rollout
Project Tech Stack
- Cloud: Microsoft Azure
- Architecture: Microservices
- Backend: .NET 8, ASP.NET Core, Python
- Databases: MongoDB, Azure SQL, vector DBs (Milvus, PostgreSQL with pgvector, etc.)
- Frontend: Angular 18, Kendo UI
- Collaboration: GitHub Enterprise with Copilot
- Big Data: Hadoop, MapReduce, Kafka, Hive, Spark, SQL & NoSQL (see the streaming sketch below)
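To give candidates a feel for this stack in practice, here is a minimal PySpark Structured Streaming sketch of a Kafka-to-Parquet flow. The broker address, topic name, event schema, and sink paths are illustrative assumptions, not project specifics.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("events-stream").getOrCreate()

# Hypothetical schema for JSON messages on the Kafka topic.
schema = StructType([
    StructField("engagement_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # assumed broker address
    .option("subscribe", "tax-events")                 # assumed topic name
    .load()
    .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Windowed aggregation with a watermark, written out as Parquet micro-batches.
totals = (
    events.withWatermark("event_time", "10 minutes")
    .groupBy(F.window("event_time", "5 minutes"), "engagement_id")
    .agg(F.sum("amount").alias("total_amount"))
)

(totals.writeStream.outputMode("append")
    .format("parquet")
    .option("path", "/data/aggregates")              # assumed sink path
    .option("checkpointLocation", "/data/checkpoints")
    .start())
```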
Requirements
- 6+ years of hands-on experience in software development
- Strong coding skills in SQL and Python, with solid CS fundamentals (data structures & algorithms)
- Practical experience with Hadoop, MapReduce, Kafka, Hive, Spark, SQL & NoSQL warehouses
- Experience with Azure cloud data platform
- Hands-on experience with vector databases (Milvus, PostgreSQL with pgvector, etc.)
- Knowledge of embedding models and retrieval-augmented generation (RAG); a minimal retrieval sketch follows this list
- Understanding of LLM pipelines, including data preprocessing for GenAI models
- Experience deploying data pipelines for AI/ML workloads (scalability & efficiency)
- Familiarity with model monitoring, feature stores (Feast, Vertex AI Feature Store), and data versioning
- Experience with CI/CD for ML pipelines (Kubeflow, MLflow, Airflow, SageMaker Pipelines)
- Understanding of real-time streaming for ML model inference (Kafka, Spark Streaming)
- Strong knowledge of Data Warehousing (design, implementation, optimization)
- Knowledge of Data Quality testing, automation & visualization
- Experience with BI tools (Power BI dashboards & reporting)
- Experience supporting data scientists and complex statistical use cases is highly desirable
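As an example of the vector-database and RAG items above, the sketch below shows the retrieval step against Postgres with the pgvector extension. The `documents` table, the embedding dimension, and the `embed()` helper are hypothetical placeholders, not the client's actual schema or model.

```python
import psycopg2

def embed(text: str) -> list[float]:
    # Placeholder: call your embedding model here (e.g. an Azure OpenAI
    # deployment); a real implementation returns a fixed-dimension vector.
    raise NotImplementedError

def retrieve(question: str, k: int = 5) -> list[str]:
    query_vec = embed(question)
    with psycopg2.connect("dbname=rag") as conn, conn.cursor() as cur:
        # <=> is pgvector's cosine-distance operator; smaller means more similar.
        cur.execute(
            "SELECT body FROM documents ORDER BY embedding <=> %s::vector LIMIT %s",
            (str(query_vec), k),
        )
        return [row[0] for row in cur.fetchall()]
```

The retrieved passages would then be assembled into the LLM prompt as grounding context, which is the RAG pattern the requirements refer to.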
Responsibilities
- Design, build, deploy, and maintain mission-critical analytics solutions processing terabytes of data at scale
- Contribute to design, coding, and configuration; manage data ingestion, real-time streaming, batch processing, and ETL across multiple storage systems (see the orchestration sketch below)
- Optimize and tune performance of complex SQL queries and large-scale data flows
- Ensure data reliability, scalability, and efficiency across AI/ML workloads
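Finally, a minimal Airflow sketch of the ingest-transform-load shape these responsibilities describe, assuming Airflow 2.x; the DAG id, schedule, and task bodies are placeholders rather than the client's actual pipeline.

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def ingest():
    ...  # e.g. pull raw files or Kafka offsets into a staging area

def transform():
    ...  # e.g. trigger a Spark job that cleans and aggregates the batch

def load():
    ...  # e.g. publish curated tables to the warehouse and vector store

with DAG(
    dag_id="analytics_batch",          # placeholder name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest_t = PythonOperator(task_id="ingest", python_callable=ingest)
    transform_t = PythonOperator(task_id="transform", python_callable=transform)
    load_t = PythonOperator(task_id="load", python_callable=load)

    # Linear dependency chain: ingest runs first, load runs last.
    ingest_t >> transform_t >> load_t
```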