via Himalayas.app
$140K - $180K a year
Build and scale ML-optimized HPC infrastructure using Kubernetes-based GPU/TPU superclusters and collaborate with cloud providers.
Requires deep ML/HPC infrastructure expertise, Kubernetes at scale, strong Python and Go programming, low-level systems knowledge, and research collaboration experience.
We're hiring a Staff Software Engineer to build and scale ML-optimized HPC infrastructure for AI workloads. The role involves deploying and managing Kubernetes-based GPU/TPU superclusters, collaborating with cloud providers, and driving innovation in ML infrastructure.

Requirements
• Deep expertise in ML/HPC infrastructure
• Kubernetes at scale
• Strong programming skills in Python and Go
• Low-level systems knowledge
• Research collaboration experience

Benefits
• An open and inclusive culture and work environment
• Weekly lunch stipend, in-office lunches & snacks
• Full health and dental benefits
• 100% Parental Leave top-up for up to 6 months
• Personal enrichment benefits
• Remote-flexible work arrangement
This job posting was last updated on 11/24/2025