Data Architect - GCP
About JUTEQ
JUTEQ is an AI-native and cloud-native consulting firm helping enterprises in financial services, telecom, and automotive retail build intelligent, production-grade platforms. We combine the power of GenAI, scalable cloud architecture, and automation to deliver next-generation business tools. Our platform supports multi-tenant AI agent workflows, real-time lead processing, and deep analytics pipelines.
We are seeking an experienced Data Architect with deep Google Cloud Platform (GCP) experience to lead our data lake, ingestion, observability, and compliance infrastructure. This role is critical to building a production-grade, metadata-aware data stack aligned with SOC 2 requirements.
What You'll Do
Data Architecture & Lakehouse Design
- Architect and implement a scalable GCP-based data lake across landing, transformation, and presentation zones.
- Use native GCP services such as GCS, Pub/Sub, Dataflow (Apache Beam), Cloud Composer, and BigQuery for high-volume ingestion and transformation (a minimal pipeline sketch follows this list).
- Design and implement infrastructure landing zones using Terraform with strong IAM boundaries, secrets management, and PII protection.
- Build ingestion pipelines using Apache NiFi (or equivalent) to support batch, streaming, and semi-structured data from external and internal systems.
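To make the stack concrete, here is a minimal sketch of a streaming Beam (Python SDK) pipeline from Pub/Sub into a BigQuery landing table, the kind of flow this role owns. The project, topic, table, and field names are illustrative placeholders, not actual resources.

```python
# Illustrative Apache Beam (Python SDK) sketch: stream JSON events from
# Pub/Sub into a BigQuery landing table. All resource names are placeholders.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_event(message: bytes) -> dict:
    """Decode a Pub/Sub message into a row for the landing table."""
    event = json.loads(message.decode("utf-8"))
    return {"lead_id": event["lead_id"], "source": event.get("source", "unknown")}


def run() -> None:
    # streaming=True marks this as an unbounded pipeline; the runner
    # (e.g. DataflowRunner) is supplied via standard pipeline flags.
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
                topic="projects/my-project/topics/lead-events")
            | "ParseJson" >> beam.Map(parse_event)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                "my-project:landing.lead_events",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            )
        )


if __name__ == "__main__":
    run()
```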
Data Ingestion & Integration
- Develop robust ingestion patterns for CRM, CDP, and third-party sources via APIs, file drops, or scraping.
- Build real-time and batch ingestion flows with schema-aware validation, parsing, and metadata handling (see the validation sketch after this list).
- Implement transformation logic and ensure the staging → curated flow adheres to quality, performance, and lineage standards.
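One lightweight way to implement schema-aware validation at that boundary is a per-record JSON Schema check; the schema and field names in this sketch are hypothetical.

```python
# Per-record JSON Schema check at the staging -> curated boundary.
# LEAD_SCHEMA and its fields are hypothetical placeholders.
from jsonschema import ValidationError, validate

LEAD_SCHEMA = {
    "type": "object",
    "required": ["lead_id", "created_at"],
    "properties": {
        "lead_id": {"type": "string"},
        "created_at": {"type": "string"},
        "dealer_id": {"type": "string"},
    },
}


def promotable(record: dict) -> bool:
    """Return True only if the record may move to the curated zone."""
    try:
        validate(instance=record, schema=LEAD_SCHEMA)
        return True
    except ValidationError:
        # Failed records should go to a dead-letter location for review,
        # not be silently dropped.
        return False
```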
Metadata & Lineage Management
- Define and enforce metadata templates across all sources (a template sketch follows this list).
- Establish data lineage tracking from ingestion to analytics using standardized tools or custom solutions.
- Drive schema mapping, MDM support, and data quality governance across ingestion flows.
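A metadata template can be as simple as a required envelope that every ingestion flow must populate before data lands; this dataclass is one possible shape, with illustrative field names.

```python
# One possible shape for an enforced metadata template: an envelope every
# ingestion flow must populate before data lands. Field names are illustrative.
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SourceMetadata:
    source_system: str    # e.g. "crm", "cdp", "third_party_api"
    dataset: str          # logical dataset name
    schema_version: str   # tracked to support schema-evolution audits
    contains_pii: bool    # drives masking and access policy downstream
    owner: str            # accountable team or contact
    ingested_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))
```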
SRE & Observability for Data Pipelines
- Implement alerting, logging, and monitoring for all ingestion and transformation services using Cloud Logging, Cloud Monitoring, OpenTelemetry, and custom dashboards (see the structured-logging sketch after this list).
- Ensure platform SLAs/SLOs are tracked and incidents are routed to lightweight response workflows.
- Support observability for cloud functions, GKE workloads, and Cloud Run-based apps interacting with the data platform.
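A common pattern for pipeline observability is emitting one structured event per batch to Cloud Logging, then building log-based metrics and alert policies on top. The logger name and event fields below are assumptions for illustration.

```python
# Emit one structured event per processed batch to Cloud Logging.
# The logger name and payload fields are illustrative assumptions.
from google.cloud import logging as cloud_logging

client = cloud_logging.Client()
logger = client.logger("ingestion-pipeline")


def report_batch(source: str, rows: int, failed: int) -> None:
    """Log a structured record; log-based metrics and alerts key off it."""
    logger.log_struct(
        {"event": "batch_processed", "source": source,
         "rows": rows, "failed": failed},
        severity="ERROR" if failed else "INFO",
    )
```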
Security & Compliance
- Enforce SOC 2 and PII compliance controls: IAM policies, short-lived credentials, encrypted storage, and access logging (a credentials sketch follows this list).
- Collaborate with security teams (internal/external) to maintain audit readiness.
- Design scalable permissioning and role-based access for production datasets.
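On GCP, short-lived credentials are typically minted via service-account impersonation instead of long-lived keys; this google-auth sketch shows the pattern, with a placeholder target principal and scope.

```python
# Short-lived credentials via service-account impersonation (google-auth).
# The target principal and scope are placeholders.
import google.auth
from google.auth import impersonated_credentials

# Ambient identity (e.g. the workload's service account).
source_credentials, _ = google.auth.default()

# The minted token is narrowly scoped and expires after `lifetime` seconds;
# no long-lived service-account key is ever created.
target_credentials = impersonated_credentials.Credentials(
    source_credentials=source_credentials,
    target_principal="dataset-reader@my-project.iam.gserviceaccount.com",
    target_scopes=["https://www.googleapis.com/auth/bigquery.readonly"],
    lifetime=3600,
)
```

The resulting `target_credentials` can then be passed to any Google client library, for example a read-only BigQuery client.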
What We're Looking For
Core Experience
- 5+ years in data engineering or architecture roles with strong GCP experience.
- Deep familiarity with GCP services: BigQuery, Pub/Sub, Cloud Storage, Cloud Functions, Dataflow/Apache Beam, Composer, IAM, and Logging.
- Expertise in Apache NiFi or similar ingestion/orchestration platforms.
- Experience building multi-environment infrastructure using Terraform, including custom module development.
- Strong SQL and schema design skills for analytics and operational reporting.
Preferred Skills
- Experience in metadata management, MDM, and schema evolution workflows.
- Familiarity with SOC 2, GDPR, or other data compliance frameworks.
- Working knowledge of incident response systems, alert routing, and lightweight ITSM integration (JIRA, PagerDuty, etc.).
- Experience with data lineage frameworks (open-source or commercial) is a plus.
- Exposure to graph databases or knowledge graphs is a plus but not required.
Why Join Us
- Help design a full-stack, production-grade data infrastructure from the ground up.
- Work in a fast-paced, AI-driven environment with real product impact.
- Contribute to a platform used by automotive dealerships across North America.
- Be part of a high-trust, hands-on team that values autonomy and impact.
Required Languages
- English: B1 (Intermediate)