$120K - $160K a year
Design, automate, and maintain GCP cloud infrastructure with a focus on scalability, security, observability, and cost optimization.
5+ years in DevOps or Cloud Infrastructure Engineering with strong GCP, Kubernetes, Terraform, and monitoring tool expertise.
About the job

Role Overview

We’re hiring a Senior DevOps Engineer to design, secure, and scale the cloud infrastructure that powers critical patient- and provider-facing systems. You will be responsible for ensuring high availability, observability, and compliance across our platform. This role is ideal for someone who thrives at the intersection of automation, reliability, and security. You’ll collaborate closely with software engineers, security specialists, and product teams to build robust, efficient, and compliant cloud systems.

Key Responsibilities

• Design, automate, and maintain GCP infrastructure supporting production, staging, and development environments.
• Implement and manage monitoring, logging, and alerting using GCP Cloud Monitoring, Cloud Logging, Prometheus, and Grafana.
• Manage infrastructure as code (IaC) using Terraform and GitOps workflows through ArgoCD.
• Administer Kubernetes (GKE) clusters running containerd, with a focus on scalability, reliability, and security.
• Configure and manage IAM, Secret Manager, SCC (Security Command Center), and Cloud Armor for strong access control and threat mitigation.
• Define and enforce resource quotas, optimize utilization, and support cost management across GCP workloads.
• Implement backup and recovery strategies to ensure data durability and business continuity.
• Maintain system observability through custom dashboards, logs, and metrics.
• Work closely with development teams to design CI/CD pipelines and promote DevSecOps best practices.
• Proactively identify performance bottlenecks and reliability gaps, proposing automated solutions.
• Participate in on-call rotations and incident management processes to ensure system uptime.
• Establish best practices for high availability, scaling, and infrastructure resilience.

Qualifications

Required:

• 5+ years of experience in DevOps, SRE, or Cloud Infrastructure Engineering roles.
• Strong expertise with Google Cloud Platform (GCP), including Monitoring, Logging, Alerts, SCC, IAM, and Secret Manager.
• Hands-on experience managing Kubernetes (GKE) clusters and containerd runtimes.
• Proficiency with Terraform and ArgoCD for infrastructure provisioning and GitOps workflows.
• Experience implementing Cloud Armor, resource quotas, and cost optimization in GCP environments.
• Solid understanding of Prometheus and Grafana for observability and monitoring.
• Proven ability to design and implement backups, failover, and disaster recovery strategies.
• Familiarity with DevSecOps principles and secure infrastructure design.
• Strong problem-solving skills and ability to work in fast-paced, high-reliability environments.

Preferred:

• Experience with secure networking, including VPC design, routing, and firewalls.
• Background in incident response and postmortem analysis.
• Exposure to chaos testing and resilience validation.
• Familiarity with Policy as Code (PaC) frameworks such as OPA (Open Policy Agent).
This job posting was last updated on 10/21/2025