Lead Prototype Developer – Voice AI Agent Platform (MVP)

Company Overview

Coldi.ai is building the future of AI-powered sales conversations. Our platform enables businesses to run autonomous, human-like voice agents that qualify leads, book meetings, and drive revenue through natural phone interactions. As our first core AI developer, you'll own the end-to-end prototype, from real-time speech pipeline to intelligent dialogue, in just 4–6 months. You'll collaborate with our IT team (handling telephony, infra, and integrations) and report directly to the founders.

Location: Remote, or onsite in Tel Aviv, Israel or Kyiv, Ukraine

Type: Full-time (contract-to-hire option)

 

Role Summary

Build a fully functional Voice AI platform prototype capable of:

Answering live calls

Real-time transcription & understanding

Context-aware, multi-turn conversation via LLMs

Natural, interruptible speech synthesis

Outgoing call campaign functionality

You'll focus on the core AI pipeline and real-time logic; our IT team will support with:

Telephony setup (Twilio/Vonage)

Cloud infra (AWS/GCP)

CI/CD, monitoring, and scaling

Deliver a testable dashboard for internal demos and early customer pilots.

 

Key Responsibilities

Design and implement a low-latency voice pipeline: STT → LLM → TTS (<500 ms end-to-end)

Build a LangChain-based agent with memory, tool calling, and dynamic routing

Implement interruption handling, turn-taking, and sentiment-aware responses

Integrate streaming STT/TTS (Deepgram, ElevenLabs, etc.)

Develop dashboards (HTMX + Tailwind or plain HTML/JS)

Optimize audio handling (noise suppression, accent robustness)

Write clean, documented code; enable fast iteration and handoff

Create reporting and analysis dashboards and an outgoing campaigns UI

 

Required Qualifications

| Category | Requirement |
|---|---|
| Experience | 3+ years in software development; 2+ years hands-on with AI/ML systems (LLMs, voice, or real-time agents) |
| Programming Languages | Python (expert): async/await, FastAPI, Pydantic v2, type safety; JavaScript/TypeScript (proficient): Node.js, WebRTC, WebSockets |
| Core AI Framework | LangChain or similar (required): agents, memory, tools, streaming, RAG, custom chains |
| Voice & Real-Time Stack | Streaming STT: Deepgram, AssemblyAI, or Whisper (real-time); Streaming TTS: ElevenLabs, PlayHT, or Azure Neural; WebRTC fundamentals (peer connections, media streams); Audio processing: PyAudio, Web Audio API, or FFmpeg |
| LLM Integration | OpenAI, Anthropic, or Hugging Face; prompt engineering, function calling, streaming responses |
| Backend & Data | FastAPI or Express; PostgreSQL + Redis; basic vector DB (Pinecone/Weaviate) |
| Deployment | Docker; cloud deployment (AWS/GCP/Vercel); environment management |
| Soft Skills | Rapid prototyping under ambiguity; clear technical communication; ownership mindset |

 

Preferred (Nice-to-Have)

Built voice agents (e.g., Vapi, Bland, or custom)

React/Svelte for future polished UIs

Go/Rust for performance modules

Experience in sales tech, CRM tools, or lead qualification logic

Open-source AI/voice contributions

 

Why Join Coldi.ai?

First AI hire: Define the entire technical foundation

High ownership: Your code powers our first customers

Backed by IT support: Focus on AI, not ops

Growth potential: Scale from prototype to product lead

Perks: API credits, learning budget, flexible hours, relocation support (if onsite)

Required skills experience

Learning & Development 3 years
Python 3 years
LLM / AI systems 3 years
Back end 3 years
JavaScript 3 years
TypeScript 3 years

Required languages

English C1 - Advanced
Voice & Real-Time Stack, LLM Integration, FastAPI or Express, Docker, environment management
Published 4 December