via Indeed
$120K - $160K a year
Lead and execute a unified monitoring strategy including Splunk deployment, cross-functional enablement, governance, and vendor management in enterprise infrastructure environments.
5+ years of technical program management experience, with proven Splunk and Azure Monitoring deployment experience, HPC infrastructure knowledge, familiarity with observability tools, and relevant certifications (PMP, CSM, Splunk Admin/Architect).
Engaging with the implementation and engineering teams

Typical Day in the Role
• Purpose of the Team: Drive strategic program leadership for unified monitoring strategy, ensuring governance, optimization, and vendor/stakeholder engagement.
• Key Projects:
  o Splunk deployment and integration
  o Azure monitoring implementation
  o Incident management and documentation/training initiatives
• Typical Task Breakdown & Operating Rhythm:
  o Lead technical program management within enterprise infrastructure environments
  o Collaborate with cross-functional teams for governance and optimization
  o Execute monitoring solutions and deliver quick wins (e.g., Splunk deployment)
  o Engage in stakeholder communication and training sessions

Strategic Program Leadership
• Define and execute a multi-phase Splunk deployment strategy aligned with organizational goals.
• Drive program governance, OKRs, and risk management for global observability initiatives.

Unified Monitoring Strategy
• Partner with Infrastructure – Linux, Scheduler, Storage, ETX and Cloud teams to establish a cohesive monitoring framework for compute, storage, and network layers. In addition, collaborate with other stakeholders to provide visibility into the environment.
• Align observability metrics with SLOs, SLIs, and incident response objectives.

Splunk Deployment & Integration
• Lead the deployment, configuration and integration of Splunk with existing systems, ensuring scalability and compliance.

Cross-Functional Enablement
• Collaborate with engineering and operations to onboard data sources, standardize alerting, and deliver actionable dashboards.
• Champion best practices for proactive monitoring and automated remediation.

Governance & Optimization
• Establish KPIs, retention policies, and compliance standards for observability data.
• Continuously optimize ingestion, indexing, and search performance for cost efficiency.

Vendor & Stakeholder Engagement
• Manage relationships with Splunk and third-party vendors for licensing, support, and roadmap alignment.
• Communicate program progress, risks, and outcomes to executive stakeholders.

Incident Management Support
• Enable observability platforms to accelerate root cause analysis and reduce MTTR through predictive analytics and automation.

Documentation & Training
• Deliver comprehensive documentation and enablement programs for operational teams.

Proven success deploying and managing Splunk (Enterprise or ITSI), Azure Monitoring at scale, Experience with High Performance Environment/Infrastructure

Qualifications
• 5+ years in technical program management within enterprise infrastructure environments.
• Proven success deploying and managing Splunk (Enterprise or ITSI), Azure Monitoring at scale.
• Strong knowledge of HPC environment, infrastructure components and hybrid cloud architecture.
• Familiarity with observability tools (Prometheus, Grafana, Datadog, Dynatrace).
• Exceptional communication and stakeholder management skills.
• Knowledge of SRE principles and incident management practices.
• Certifications: PMP, Certified Scrum Master (CSM), Splunk Certified Admin/Architect.

Preferred
• Experience with automation/orchestration (Terraform, Ansible, CI/CD).
• Background in Azure, AWS or GCP integrations.
This job posting was last updated on 12/9/2025