Planet DDS

via Indeed


Manager, Site Reliability Engineering and Incident Management

Anywhere

Full-time

Posted 11/26/2025


Key Skills:

Site Reliability Engineering

Incident Management

Cloud (AWS, Azure, GCP)

Leadership

Operational Excellence

Monitoring and Logging

Security Best Practices

DNS

Load Balancing

Firewalls

Disaster Recovery

Chaos Engineering

Kubernetes (optional)

Compensation

Salary Range

$120K - $160K a year

Responsibilities

Lead SRE and incident management teams, oversee incident response and operational excellence, and implement process improvements for platform reliability.

Requirements

7+ years in SRE/DevOps with 3+ years managing incident response teams, multi-cloud expertise, strong leadership, and knowledge of security and infrastructure components.

Full Description

Planet DDS is a leading provider of a platform of cloud-based solutions that empowers growth-minded dental businesses. Now serving over 13,000 practices and 118,000 customers in North America, Planet DDS delivers a comprehensive suite of solutions, including Denticon Practice Management, Cloud 9 Ortho Practice Management, and Apteryx Cloud Imaging. Planet DDS is dedicated to enabling dental support organizations (DSOs) and groups to grow and thrive with technology that delivers seamless integrations, improved workflows, and future-proof scalability.

We are seeking a Manager, Site Reliability Engineering and Incident Management, to manage our Site Reliability Engineering function as well as our external incident response function for our production operations. To be successful, the manager will need to be self-motivated, communicate clearly, and operate with a sense of urgency in a fast-paced environment.

Providing operational support means that you will apply your customer empathy to production incidents and to any other internal engineering-related support requests. It will be crucial for you to gain a deep understanding of our systems and architecture and build hands-on knowledge of support and observability tooling. You will need to be available to engage in any incident escalations 24x7. You will need to seek answers from subject matter experts in a variety of positions, from architects to support staff, business leaders, and technically minded developers.

• Location: East Coast (US)

Job Duties

Team Leadership & Development
• Lead and mentor a team of SREs and Incident Managers.
• Foster a culture of reliability, accountability, and continuous improvement.
• Collaborate with engineering teams to design resilient platform architectures.

Incident Management
• Oversee the incident response process for outages and service disruptions.
• Ensure timely detection, escalation, and resolution of incidents.
• Drive post-incident reviews (PIRs) and root cause analysis.
• Implement improvements based on lessons learned to prevent recurrence.

Operational Excellence
• Mature and enforce best practices for incident response and runbooks.
• Automate operational tasks to reduce toil and improve efficiency.
• Maintain observability tools (monitoring, alerting, logging).

Process & Governance
• Define and maintain incident management policies and escalation procedures.
• Drive initiatives for chaos engineering, capacity planning, and disaster recovery testing.

Skills and Qualifications

• 7+ years in SRE, DevOps, or Infrastructure roles.
• 3+ years in Incident Management leadership.
• Deep understanding of reliability, scalability, and performance optimization.
• Multi-cloud expertise in AWS, Azure, or GCP.
• Understanding of DNS, load balancing, firewalls, and compliance frameworks.
• Security is part of everything we do, and this role requires knowledge of fundamental cloud security (e.g., identity and access management, firewalls).
• Deep understanding of logging, monitoring, and security best practices.
• Strong collaboration and communication skills.
• Bachelor’s Degree in a relevant major or equivalent years of experience is a plus.

Any of the following would be a plus:

• Dental industry knowledge
• Experience working in B2B SaaS companies
• Experience with cloud containers, specifically Kubernetes

PLANET DDS CORE IDEOLOGY

Why are we here? Dental software is broken. We aim to fix it.

Where are we headed? To be the first choice for growth-minded dental businesses.

How do we get there? To encourage measurable progress toward our vision and make the best decisions on behalf of employees and customers, we adopted a set of common values:

• Collaborative – Working independently and across teams, we create scalable solutions to enable company growth
• Empathetic – We are educated on the experience of our customers and feel vested in their success
• Accountable – We feel ownership for the quality of our work and take pride in the positive outcomes
• Trustworthy – We operate with integrity and honesty, making promises we know that we can keep
• Ambitious – We are driven by our ability to make a long-term, positive impact on the lives of dental market leaders

Planet DDS is an Equal Opportunity Employer – Including Disability/Veterans

This job posting was last updated on 11/28/2025
