Senior Data Scientist (RAG + Retrieval Expert)

$$$$
Product

Who We Are

At Bennett Data Science, we have spent more than a decade pioneering predictive analytics and data science for some of the biggest brands and retailers. We're at the top of our field because we focus on delivering actionable AI for our clients. Our deep experience and product-first attitude set us apart from other groups and get our clients the business results they want.

Why You Should Work With Us

You'll work with a wide range of clients at the cutting edge of innovation in their fields, tackling fascinating problems that support real products with real data. Our clients range from some of the largest companies in the world to small Silicon Valley startups building the next big thing.

  • Expert Mentorship: Direct guidance from senior staff with 20+ years of applied ML experience
  • Competitive Compensation: Market-rate pay with performance upside
  • Fully Remote: Work from any location of your choice, on a flexible schedule
  • Real Impact: Your models go into production and serve real users

The Role

As a Senior Data Scientist, you will design, build, and deploy RAG systems and other AI solutions that help teams manage land, infrastructure, and stakeholder relationships more effectively. You'll work with large volumes of documents, permits, geospatial records, and stakeholder communications, turning unstructured and semi-structured data into reliable, queryable intelligence.

You'll own projects end-to-end: scoping the problem with internal and client-facing stakeholders, selecting retrieval and generation strategies, building and evaluating pipelines, and iterating on production systems alongside senior data engineers. You'll also contribute to the team's broader ML work (classification, extraction, and geospatial modeling) and mentor junior team members as the practice grows.

This is a hands-on, senior individual contributor role. You'll need to be as comfortable explaining tradeoffs to a non-technical project manager as you are tuning a reranker or debugging retrieval recall. Client-facing communication is part of the job.

Requirements

A successful candidate has 5+ years of experience in applied data science and machine learning, with deep hands-on expertise in retrieval-augmented generation and a strong statistical foundation. They demonstrate the following:

RAG & Generative AI (Core Focus)

  • Proven experience designing, building, and deploying RAG-based systems in production, including chunking strategies, embedding model selection, retrieval tuning, and reranking
  • Strong understanding of hallucination mitigation techniques (grounding, citation enforcement, context window management, guardrails) and the ability to articulate tradeoffs across architectures
  • Experience evaluating RAG pipeline quality end-to-end: retrieval recall, answer faithfulness, latency, and cost
  • Familiarity with vector databases and hybrid search (e.g., Databricks, Pinecone, pgvector, or OpenSearch)

Applied ML & Engineering

  • Production ML experience: building, deploying, and maintaining models that serve real users at scale
  • Strong Python skills including scikit-learn, pandas, NumPy, and at least one deep learning framework (PyTorch or TensorFlow)
  • Solid statistical foundation: hypothesis testing, distributions, probability, experimental design

Communication & Working Style

  • Experience translating model behavior, recommendations, and limitations for non-technical stakeholders, particularly in domains where trust and accuracy are critical
  • Comfort working independently across multiple projects simultaneously
  • English proficiency at B2 or above (written and spoken)

Nice to Have

  • Experience applying LLMs or transformer-based NLP for structured text classification, information extraction, or embedding-based retrieval
  • Geospatial feature engineering, location-based statistics, spatial indexing, or proximity scoring, particularly in land, infrastructure, or utility corridor contexts
  • Experience with Vision-Language Models (VLMs) or satellite/aerial imagery analysis for document or land parcel interpretation
  • Experience with cloud ML platforms (AWS SageMaker, GCP Vertex AI)
  • Exposure to utilities, energy, infrastructure, land management, or enterprise SaaS domains
  • Experience fine-tuning or adapting pre-trained models (LoRA, PEFT, or full fine-tune)
  • Experience with agentic architectures and deployments

Required skills experience

RAG systems: 2.5 years

Required domain experience

Machine Learning / Big Data: 5 years

Required languages

English: B2 (Upper Intermediate)
Ukrainian: Native
RAG, LLM, AI, Databricks, Vector Databases
Published 19 May