Data Architect (AWS and Python FastAPI)

Client

Our client is a leading legal recruiting company focused on building a cutting-edge data-driven platform for lawyers and law firms. The platform consolidates news and analytics, real-time deal and case tracking from multiple sources, firm and lawyer profiles with cross-linked insights, rankings, and more — all in one unified place.

 

Position overview

We are seeking a skilled Data Architect with strong expertise in AWS technologies (Step Functions, Lambda, RDS for PostgreSQL), Python, and SQL to lead the design and implementation of the platform’s data architecture. This role involves defining data models, building ingestion pipelines, applying AI-driven entity resolution, and managing scalable, cost-effective infrastructure aligned with cloud best practices.

 

Responsibilities

  • Define entities, relationships, and persistent IDs; enforce the Fact schema with confidence scores, timestamps, validation status, and source metadata.
  • Design ingestion workflows from law firm site feeds; normalize data, extract entities, classify content, and route low-confidence items for review.
  • Develop a hybrid of deterministic rules and LLM-assisted matching; configure thresholds for auto-accept, manual review, or rejection.
  • Specify Ops Portal checkpoints, data queues, SLAs, and create a corrections/version history model.
  • Plan a phased rollout of data sources, from ingestion through processing, storage, and replication to management via a CMS.
  • Align architecture with AWS and Postgres baselines; design for scalability, appropriate storage tiers, and cost-effective compute and queuing solutions.
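To illustrate the kind of design work involved, the Fact schema and confidence-threshold routing described above could be sketched roughly as follows. All field names, statuses, and threshold values here are assumptions for illustration, not the client's actual schema; a production FastAPI stack would more likely express this as Pydantic models backed by PostgreSQL.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class ValidationStatus(str, Enum):
    """Outcome of entity-resolution matching for a single fact."""
    AUTO_ACCEPTED = "auto_accepted"
    NEEDS_REVIEW = "needs_review"   # routed to the Ops Portal queue
    REJECTED = "rejected"


@dataclass
class Fact:
    """Illustrative Fact record: a claim about an entity, with the
    confidence score, timestamps, validation status, and source
    metadata the schema calls for (field names are assumptions)."""
    entity_id: str          # persistent ID of the firm or lawyer
    predicate: str          # e.g. "advised_on_deal"
    value: str
    confidence: float       # 0.0 .. 1.0, from rules or LLM matching
    source_url: str         # provenance metadata
    extracted_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))
    status: ValidationStatus = ValidationStatus.NEEDS_REVIEW


def route_by_confidence(confidence: float,
                        accept_at: float = 0.90,
                        reject_below: float = 0.40) -> ValidationStatus:
    """Route a matched fact: auto-accept above one threshold, reject
    below another, and queue everything in between for manual review.
    The thresholds are placeholder values, intended to be configurable."""
    if confidence >= accept_at:
        return ValidationStatus.AUTO_ACCEPTED
    if confidence < reject_below:
        return ValidationStatus.REJECTED
    return ValidationStatus.NEEDS_REVIEW
```

In a deployed pipeline, a routing function like this would sit at the end of the hybrid matching step, with the two thresholds tuned per data source rather than hard-coded.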

 

Requirements

  • Proven experience as a Data Architect or Senior Data Engineer working extensively with AWS services.
  • Strong proficiency in Python development, preferably with FastAPI or similar modern frameworks.
  • Deep understanding of data modeling principles, entity resolution, and schema design for complex data systems.
  • Hands-on experience designing and managing scalable data pipelines, workflows, and AI-driven data processing.
  • Familiarity with relational databases such as PostgreSQL.
  • Solid experience in data architecture and data modeling, including knowledge of approaches such as the Medallion architecture and dimensional modeling.
  • Strong knowledge of cloud infrastructure cost optimization and performance tuning.
  • Excellent problem-solving skills and ability to work in a collaborative, agile environment.

 

Nice to have

  • Experience within legal tech or recruiting data domains.
  • Familiarity with Content Management Systems (CMS) for managing data sources.
  • Knowledge of data privacy, security regulations, and compliance standards.
  • Experience with web scraping.
  • Experience with EMR and SageMaker.

Required skills and experience

AWS 6 years
Python 6 years
EMR

Required languages

English B2 - Upper Intermediate
Published 30 October