Lead Prototype Developer - Voice AI Agent Platform (MVP)
Company Overview
Coldi.ai is building the future of AI-powered sales conversations. Our platform enables businesses to run autonomous, human-like voice agents that qualify leads, book meetings, and drive revenue through natural phone interactions. As our first core AI developer, you'll own the end-to-end prototype, from real-time speech pipeline to intelligent dialogue, in just 4-6 months. You'll collaborate with our IT team (handling telephony, infra, and integrations) and report directly to the founders.
Location: Remote, or onsite in Tel Aviv, Israel or Kyiv, Ukraine
Type: Full-time (contract-to-hire option)
Role Summary
Build a fully functional Voice AI platform prototype capable of:
Answering live calls
Real-time transcription & understanding
Context-aware, multi-turn conversation via LLMs
Natural, interruptible speech synthesis
Outgoing call campaign functionalities
You'll focus on the core AI pipeline and real-time logic; our IT team will support with:
Telephony setup (Twilio/Vonage)
Cloud infra (AWS/GCP)
CI/CD, monitoring, and scaling
Deliver a testable dashboard for internal demos and early customer pilots.
Key Responsibilities
Design and implement a low-latency voice pipeline: STT → LLM → TTS (<500 ms end-to-end)
Build a LangChain-based agent with memory, tool calling, and dynamic routing
Implement interruption handling, turn-taking, and sentiment-aware responses
Integrate streaming STT/TTS (Deepgram, ElevenLabs, etc.)
Develop dashboards (HTMX + Tailwind or plain HTML/JS)
Optimize audio handling (noise suppression, accent robustness)
Write clean, documented code; enable fast iteration and handoff
Create Reporting & Analysis dashboards and Outgoing Campaigns UI
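For candidates who want a concrete picture, the turn loop behind the pipeline and interruption-handling responsibilities above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Coldi.ai's implementation: `fake_stt`, `fake_llm`, `fake_tts`, and `run_turn` are hypothetical stand-ins for real streaming clients, and the sleeps only simulate network latency.

```python
import asyncio
import time
from typing import AsyncIterator

LATENCY_BUDGET_S = 0.5  # the <500 ms end-to-end target from the list above


async def fake_stt(audio_chunks: list[bytes]) -> str:
    """Hypothetical stand-in for a streaming speech-to-text client."""
    await asyncio.sleep(0.01)  # simulated network latency
    return f"caller audio ({len(audio_chunks)} chunks)"


async def fake_llm(transcript: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for a streaming LLM; yields reply tokens."""
    for token in ("Sure,", "let", "me", "check", "that."):
        await asyncio.sleep(0.01)
        yield token


async def fake_tts(text: str) -> bytes:
    """Hypothetical stand-in for a streaming text-to-speech client."""
    await asyncio.sleep(0.01)
    return text.encode("utf-8")


async def run_turn(audio_chunks: list[bytes], interrupted: asyncio.Event) -> dict:
    """Run one STT -> LLM -> TTS turn, stopping early if the caller barges in."""
    start = time.perf_counter()
    transcript = await fake_stt(audio_chunks)
    spoken: list[bytes] = []
    first_audio_s = None
    async for token in fake_llm(transcript):
        if interrupted.is_set():  # caller started talking: stop synthesizing
            break
        spoken.append(await fake_tts(token))
        if first_audio_s is None:
            # Time from end of caller speech to first synthesized audio
            first_audio_s = time.perf_counter() - start
    return {
        "transcript": transcript,
        "spoken": spoken,
        "first_audio_s": first_audio_s,
        "within_budget": first_audio_s is not None
        and first_audio_s <= LATENCY_BUDGET_S,
        "interrupted": interrupted.is_set(),
    }
```

Calling `asyncio.run(run_turn([b"..."], asyncio.Event()))` runs a full turn. Setting the event from a voice-activity detector (not shown) cuts the reply short, which is the simplest form of interruption handling. Streaming tokens to TTS one at a time, rather than waiting for the full LLM reply, is what keeps time-to-first-audio low in this sketch.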
Required Qualifications
| Category | Requirement |
| Experience | 3+ years in software development; 2+ years hands-on with AI/ML systems (LLMs, voice, or real-time agents) |
| Programming Languages | Python (expert): async/await, FastAPI, Pydantic v2, type safety; JavaScript/TypeScript (proficient): Node.js, WebRTC, WebSockets |
| Core AI Framework | LangChain or similar (required): agents, memory, tools, streaming, RAG, custom chains |
| Voice & Real-Time Stack | Streaming STT: Deepgram, AssemblyAI, or Whisper (real-time); Streaming TTS: ElevenLabs, PlayHT, or Azure Neural; WebRTC fundamentals (peer connections, media streams); Audio processing: PyAudio, Web Audio API, or FFmpeg |
| LLM Integration | OpenAI, Anthropic, or Hugging Face; prompt engineering, function calling, streaming responses |
| Backend & Data | FastAPI or Express; PostgreSQL + Redis; basic vector DB (Pinecone/Weaviate) |
| Deployment | Docker; cloud deployment (AWS/GCP/Vercel); environment management |
| Soft Skills | Rapid prototyping under ambiguity; clear technical communication; ownership mindset |
Preferred (Nice-to-Have)
Built voice agents (e.g., Vapi, Bland, or custom)
React/Svelte for future polished UIs
Go/Rust for performance modules
Experience in sales tech, CRM tools, or lead qualification logic
Open-source AI/voice contributions
Why Join Coldi.ai?
First AI hire: Define the entire technical foundation
High ownership: Your code powers our first customers
Backed by IT support: Focus on AI, not ops
Growth potential: Scale from prototype to product lead
Perks: API credits, learning budget, flexible hours, relocation support (if onsite)
Required skills and experience
| Learning & Development | 3 years |
| Python | 3 years |
| LLM / AI systems | 3 years |
| Back end | 3 years |
| JavaScript | 3 years |
| TypeScript | 3 years |
Required languages
| English | C1 - Advanced |