At Navana, we are at the forefront of developing Voice AI solutions tailored for Indic languages, driving growth and innovation for large enterprises.
We build real-time voice AI systems that power critical customer interactions for leading enterprises. Our stack is built for low-latency, high-throughput workloads and is deployed both in our cloud and on customer infrastructure. As we scale, we are investing in the platform that our Speech and Language models (STT, TTS, and SLMs) train on, deploy through, and run on in production.
Core Responsibilities
- Build the MLOps stack from the ground up: experiment tracking, model registry, pipeline orchestration, artifact management, and data/model versioning.
- Evaluate and select tooling across categories (e.g., MLflow / W&B, Kubeflow / Argo Workflows / Airflow, DVC / LakeFS) and own the implementation.
- Design training pipelines that run reliably across hyperscalers (AWS / GCP / Azure) and neo-cloud GPU providers.
- Productionize and optimize STT, TTS, and SLM inference in customer on-prem environments, within tight latency budgets and on constrained hardware.
- Own GPU infrastructure end to end: drivers, CUDA, MIG partitioning, and mixed-GPU scheduling.
- Establish the MLOps practices that the rest of the company will build on.
Must-Have Qualifications
- Demonstrated DevOps, SRE, or MLOps skills equivalent to 2-4 years of experience, with at least one ML system shipped to production.
- Strong Linux fundamentals and solid networking basics.
- Real depth with at least one hyperscaler (AWS, GCP, or Azure).
- Strong in Python and Bash, and comfortable with Docker, Kubernetes, and Git.
- Basic infrastructure-as-code experience with Terraform and/or Ansible.
- Familiarity with the MLOps lifecycle in a real setting, with hands-on experience in one or more MLOps tools.
Nice-to-Have
- Experience tuning model serving stacks under latency constraints (Triton, vLLM, Ray Serve).