Ship new model architectures by integrating them into the inference engine, and empower the product team to create groundbreaking features through user-friendly APIs. Build sophisticated scheduling systems that make optimal use of GPU resources, and maintain CI/CD pipelines for model processing and internal tooling.
Strong generalist Python skills and extensive experience with Kubernetes and Docker are required. Experience with high-performance, large-scale ML systems and multimedia processing is a plus.
Luma’s mission is to build multimodal AI to expand human imagination and capabilities. We believe that multimodality is critical for intelligence. To go beyond language models and build more aware, capable and useful systems, the next step function change will come from vision. So, we are working on training and scaling up multimodal foundation models for systems that can see and understand, show and explain, and eventually interact with our world to effect change.

Role & Responsibilities
- Ship new model architectures by integrating them into our inference engine
- Empower our product team to create groundbreaking features by developing user-friendly APIs and interaction patterns
- Build sophisticated scheduling systems that optimally leverage our expensive GPU resources while meeting internal SLOs
- Build and maintain CI/CD pipelines for processing and optimizing model checkpoints, platform components, and SDKs that internal teams integrate into our products and internal tooling

Background
- Strong generalist Python skills
- Experience with queues, scheduling, traffic control, and fleet management at scale
- Extensive experience with Kubernetes and Docker
- Bonus points for experience with high-performance, large-scale ML systems (>100 GPUs) and/or PyTorch
- Bonus points for experience with ffmpeg and multimedia processing

Tech stack

Must have
- Python
- Kubernetes
- Redis
- S3-compatible storage

Nice to have
- PyTorch
- CUDA
- ffmpeg
This job posting was last updated on 8/21/2025