via RemoteFront
$150K - $220K a year
Architect, build, and operate large-scale distributed backend systems and storage layers with focus on scalability, reliability, and operational excellence.
12+ years of backend experience with Golang and Kubernetes in cloud environments, with expertise in distributed systems, high-scale storage, and event-driven systems, plus strong software engineering practices.
At Docker, we make app development easier so developers can focus on what matters. Our remote-first team spans the globe, united by a passion for innovation and great developer experiences. With over 20 million monthly users and 20 billion image pulls, Docker is the #1 tool for building, sharing, and running apps, trusted by startups and Fortune 100s alike. We’re growing fast and just getting started. Come join us for a whale of a ride!

We’re looking for a Staff Backend Engineer with extensive experience in distributed systems, large-scale backend architecture, and high-volume storage systems. You will own systems end-to-end, from schema design and API architecture to deployment, observability, and operational excellence. The work is highly dynamic, and you will operate in a fast-paced environment where we continuously evolve the platform to support enormous growth in traffic, data, and global usage.

You’ll collaborate across Engineering, SRE, Product, and Design while acting as a technical leader who simplifies complexity, elevates engineering quality, and improves a globally critical developer platform. If you’re passionate about building and operating massive-scale distributed systems with huge data and throughput demands, this role is for you.

Responsibilities

Distributed Systems & Backend Engineering
• Architect, build, and operate high-scale distributed systems powering Docker Hub’s registry platform, spanning artifact storage, metadata services, indexing workflows, and performance-critical APIs.
• Lead the design and implementation of backend services with a strong emphasis on scalability, correctness, resilience, and performance.
• Drive major initiatives around multi-region replication, caching strategies, request-path optimization, and core registry reliability.

Data Infrastructure & Storage
• Design, optimize, and operate the data and storage layers, spanning relational and NoSQL databases as well as object storage and related technologies.
• Develop schemas and data models to support high-throughput, large-volume workloads.
• Own systems end-to-end, from storage-layer behavior to API design, deployment workflows, and production monitoring.

Operations, Observability & Scale
• Improve the performance and reliability of one of the world’s largest repositories of container images.
• Develop and enhance observability through metrics, traces, alerting, and dashboards.
• Lead improvements to deployment and operational tooling (e.g., Argo CD, GitHub Actions).
• Participate in on-call rotations as part of supporting critical production services.

Leadership & Collaboration
• Mentor engineers and lead design and architecture reviews.
• Partner with Product, Design, SRE, and Platform teams to deliver high-impact projects.
• Engage with open-source communities, cloud-native partners, and the broader ecosystem.

Qualifications

Required
• 12+ years of backend engineering experience with deep expertise in distributed systems and large-scale backend architectures.
• Strong production experience with Golang, including designing and operating large Go-based services in cloud environments.
• Strong production experience with Kubernetes, including operating services at scale.
• Experience designing and running high-scale storage systems (PostgreSQL, DynamoDB, or equivalent) in production.
• Experience building and operating cloud-based services (AWS preferred).
• Experience with event-driven or streaming systems, such as Kafka, SNS/SQS, or equivalent.
• Strong foundation in software engineering best practices: design documentation, testing strategies, CI/CD, code review, and observability.
• Comfortable functioning autonomously in a fully distributed, remote-first team and working effectively in a fast-paced environment.

Nice to Have
• Experience with OCI registries, artifact stores, or large-scale content distribution systems.
• Familiarity with search/indexing systems or metadata-rich architectures.
• Contributions to cloud-native or open-source ecosystems.

What We’re Looking For
• An engineer with deep experience in distributed systems and large-scale storage.
• Someone who thrives in a fast-paced environment and consistently optimizes for reliability, automation, and performance.
• A technical leader who takes ownership of complex systems and drives long-term architectural direction.
• An engineer energised by the challenges of operating a global developer platform at massive scale.

We use Covey as part of our hiring and/or promotional process for jobs in NYC, and certain features may qualify it as an AEDT. As part of the evaluation process, we provide Covey with job requirements and candidate-submitted applications. We began using Covey Scout for Inbound on April 13, 2024. Please see the independent bias audit report covering our use of Covey here.

Perks
• Freedom & flexibility; fit your work around your life
• Designated quarterly Whaleness Days
• Home office setup; we want you comfortable while you work
• 16 weeks of paid parental leave
• Technology stipend equivalent to $100 net/month
• PTO plan that encourages you to take time to do the things you enjoy
• Quarterly, company-wide hackathons
• Training stipend for conferences, courses, and classes
• Equity; we are a growing start-up and want all employees to have a share in the success of the company
• Docker swag
• Medical benefits, retirement, and holidays vary by country

Docker embraces diversity and equal opportunity. We are committed to building a team that represents a variety of backgrounds, perspectives, and skills. The more inclusive we are, the better our company will be.

Due to the remote nature of this role, we are unable to provide visa sponsorship.
This job posting was last updated on 12/5/2025