Python AI Engineer With MLOps Experience
Nucleus Labs is building AI-powered products across multiple domains, with a strong focus on practical, production-grade systems that leverage modern LLM and machine learning capabilities.
Our engineering teams are actively developing AI-assisted workflows, document-processing pipelines, extraction systems, and human-in-the-loop platforms. While our ML researchers and AI engineers focus on experimentation, model evaluation, and solution discovery, we are looking for an engineer who can transform these solutions into reliable, maintainable, and production-ready systems.
This role sits at the intersection of Software Engineering, MLOps, and AI Platform Engineering.
You will act as the bridge between ML research, product engineering, architecture, and infrastructure teams, ensuring that AI solutions move beyond local experimentation and become scalable, observable, and maintainable production systems.
Responsibilities
AI Platform & MLOps Engineering
- Productionize AI and ML solutions developed by research teams and integrate them into customer-facing products.
- Build and maintain reusable, maintainable, and extensible AI pipelines.
- Design systems that support rapid iteration of AI workflows without requiring full system rewrites or redeployments.
- Implement and maintain human-in-the-loop feedback workflows for AI-assisted products.
- Design mechanisms for collecting feedback, annotations, evaluation data, and model performance signals.
AI Workflow Development
- Build and orchestrate complex LLM-based workflows involving multiple inference steps, tools, and decision points.
- Extend and maintain pipelines involving LLM calls, OCR systems, extraction engines, and data-processing workflows.
- Collaborate with ML engineers to integrate new models, prompts, evaluation approaches, and AI capabilities.
- Ensure AI pipelines remain easy to evolve as business and product requirements change.
Software Engineering
- Design, develop, and maintain backend services that expose AI capabilities through internal and external APIs.
- Build clean, modular, and maintainable software using modern engineering practices.
- Write production-quality code with strong emphasis on readability, testability, and long-term maintainability.
- Participate in architecture discussions and technical design reviews.
- Debug and resolve issues across application, infrastructure, and AI workflow layers.
MLOps & Infrastructure Collaboration
- Work closely with DevOps engineers to design infrastructure required for AI workloads.
- Define infrastructure requirements for deployment, storage, compute, observability, and scaling.
- Collaborate on CI/CD pipelines supporting continuous delivery of AI-enabled systems.
- Contribute to infrastructure-as-code requirements and collaborate on Terraform-based environments.
- Help determine appropriate cloud services and deployment strategies for AI workloads.
Model Lifecycle & Data Operations
- Understand and support the full ML lifecycle, including:
  - data collection
  - annotation workflows
  - training
  - validation
  - evaluation
  - deployment
  - monitoring
  - continuous improvement
- Design systems that support future model fine-tuning and retraining initiatives.
- Build infrastructure and automation around dataset management, evaluation, and reporting.
Observability & Reliability
- Implement monitoring, logging, tracing, and alerting for AI systems.
- Define metrics that measure model quality, pipeline health, and business outcomes.
- Ensure production AI systems are observable, debuggable, and resilient.
- Participate in incident investigation and root cause analysis for AI-related production issues.
Requirements
- 5+ years of professional software engineering experience.
- Strong commercial experience building backend systems using Python.
- Strong software engineering fundamentals, including clean architecture, SOLID principles, design patterns, and SDLC best practices.
- Experience building and maintaining production-grade distributed systems.
- Hands-on experience integrating LLMs and AI services into production applications.
- Practical understanding of MLOps concepts and the machine learning lifecycle.
- Experience designing and implementing CI/CD pipelines.
- Experience working in cloud environments, preferably AWS.
- Experience working with Docker and containerized applications.
- Experience collaborating with DevOps teams on infrastructure and deployment topics.
- Ability to understand, maintain, and evolve AI/ML codebases created during research and experimentation phases.
- Experience building maintainable APIs and backend services.
- Strong debugging and troubleshooting skills across application and infrastructure layers.
- Strong communication skills and ability to collaborate across cross-functional teams.
Nice to Have
- Experience with AWS services such as ECS, EKS, EC2, Lambda, S3, Step Functions, Bedrock, or SageMaker.
- Experience with workflow orchestration frameworks such as Temporal, Airflow, Dagster, Prefect, or similar.
- Experience with vector databases and retrieval systems.
- Experience with OCR systems and document-processing pipelines.
- Experience fine-tuning open-source models.
- Experience building human-in-the-loop systems.
- Experience with model evaluation frameworks and LLM observability platforms.
- Experience with Terraform or Infrastructure as Code.
- Experience with Kubernetes.
- Experience working with AI agents, agent orchestration frameworks, or tool-calling systems.
- Experience in document AI, legal-tech, healthcare, edtech, or other data-heavy domains.
- Experience working with open-source ML ecosystems.
What Matters for This Role
- Ownership mindset: ability to take responsibility for AI-enabled systems end-to-end.
- Strong software engineering foundation with practical understanding of ML systems.
- Ability to bridge the gap between ML research and production engineering.
- Pragmatic decision-making and strong engineering judgment.
- Comfort working in ambiguous, fast-moving environments.
- Ability to design systems that are maintainable, extensible, and easy to evolve.
- Curiosity about AI systems combined with strong software craftsmanship.
- Willingness to work across traditional engineering, AI, infrastructure, and product boundaries.