Knowtex

via Ashby


Applied ML Engineer

Anywhere
Full-time
Posted 3/4/2026
Direct Apply
Key Skills:
Python
PyTorch
TensorFlow
AWS
Machine Learning Deployment

Compensation

Salary Range

Not specified

Responsibilities

Productionize and scale machine learning systems for a voice AI platform focusing on inference optimization and deployment.

Requirements

3-7+ years in applied ML engineering with strong Python and ML framework skills, plus AWS experience.

Full Description

About Knowtex

Knowtex is building the future of voice AI operating systems for clinicians, transforming how healthcare documentation happens at the point of care. Founded by Stanford AI scientists with deep clinical experience, we're experiencing explosive growth across both commercial health systems and federal healthcare, with our ambient documentation platform scaling rapidly to thousands of clinicians across hundreds of specialties. We're at an inflection point where cutting-edge AI meets real clinical impact, giving clinicians hours back each day to focus on what matters most: their patients.

Position Overview

We are seeking an Applied ML Engineer to productionize and scale the machine learning systems powering our voice AI platform. This role bridges research and engineering, transforming models into reliable, low-latency, production-grade systems deployed across enterprise healthcare environments. You will work closely with ML Scientists, Backend Engineers, and Platform teams to optimize inference performance, build evaluation pipelines, and ensure robust model deployment in regulated environments.
Key Responsibilities

Productionize ML models for real-time clinical applications
Optimize inference pipelines for low latency and high throughput
Deploy and scale models using AWS-based infrastructure
Build automated evaluation and regression testing frameworks for LLM outputs
Implement monitoring systems for model performance and drift detection
Collaborate with Backend teams to integrate ML services into APIs and workflows
Improve model efficiency through quantization, batching, caching, and optimization techniques
Support specialty-level model evaluation and performance analysis
Contribute to CI/CD workflows for ML deployment

Required Qualifications

3–7+ years of experience in machine learning engineering or applied ML roles
Strong proficiency in Python and PyTorch (or TensorFlow)
Experience deploying ML models in production environments
Familiarity with transformer architectures and large language models
Experience with model optimization techniques (quantization, distillation, pruning)
Experience working with cloud infrastructure (AWS preferred)
Strong software engineering fundamentals and debugging skills

Preferred Qualifications

Experience with speech recognition systems or NLP pipelines
Experience with Triton Inference Server or similar deployment frameworks
Familiarity with healthcare data or clinical documentation workflows
Experience working in regulated environments (HIPAA, GovCloud, etc.)
Knowledge of medical coding systems (ICD-10, CPT)

Technical Environment

Python, PyTorch / TensorFlow
Transformer-based LLM architectures
AWS (SageMaker, ECS, Lambda, S3)
Triton Inference Server
CI/CD pipelines for ML deployment
Observability tools for performance and drift monitoring

Compensation & Benefits

Meaningful equity compensation
Unlimited PTO
Premium health, dental, and vision coverage
401(k) plan

This job posting was last updated on 3/5/2026
