DecisionPoint | Cortek

via Icims

Enterprise Service Reliability and Insights Lead

Anywhere
full-time
Posted 11/19/2025
Key Skills:
Enterprise monitoring systems
Service reliability engineering
ITIL v4 processes
Incident management
Log analysis
Event correlation
Cloud-native monitoring
SLA and KPI definition
Cybersecurity collaboration
Automation scripting

Compensation

Salary Range

$120K - $180K a year

Responsibilities

Define and manage enterprise monitoring strategy, oversee tools and dashboards, establish alerting and escalation protocols, collaborate with engineering and cybersecurity teams, and drive continuous improvement in service reliability.

Requirements

Bachelor's degree in IT or related field, 10+ years in service reliability or IT operations, ITIL v4 and Security+ certifications, experience with monitoring tools, incident management, and DoD/federal IT environments.

Full Description

Overview

DecisionPoint seeks a Senior Enterprise Service Reliability and Insights Lead to oversee enterprise-wide monitoring, observability, and operational intelligence for a large federal and DoD-aligned IT environment. This senior-level role defines the monitoring strategy, manages toolsets, develops dashboards, establishes alerting thresholds, and ensures service reliability through proactive detection and rapid incident identification.

The Enterprise Service Reliability and Insights Lead is responsible for driving visibility into uptime, system performance, service health, and operational risks. This position partners closely with Tier 2 and Tier 3 engineering teams, cloud operations, cybersecurity, and service desk leadership to ensure monitoring aligns with mission needs, SLAs, and enterprise performance objectives.

This position is fully remote.

Note: By applying to this position, you acknowledge and consent to having your resume included in an active competitive government contract bid.

Duties & Responsibilities

The Enterprise Service Reliability and Insights Lead will:
Define, implement, and manage the enterprise monitoring and observability strategy.
Oversee monitoring tools, dashboards, agents, log pipelines, and alerting configurations across all environments.
Establish alert thresholds, escalation criteria, and performance indicators that support proactive issue detection.
Ensure monitoring coverage aligns with uptime, performance, and security requirements.
Collaborate with Tier 2 and Tier 3 engineering teams on system health assessments, log analytics, and incident triage.
Lead efforts to correlate events across application, infrastructure, network, and security monitoring tools.
Deliver actionable insights on system reliability, capacity issues, performance bottlenecks, and incident trends.
Support SLA and KPI measurement, reporting, and compliance tracking.
Maintain monitoring documentation, dashboards, service health definitions, and alerting standards.
Partner with cloud, infrastructure, and cybersecurity teams to ensure observability supports mission and compliance needs.
Recommend improvements to monitoring architectures, event correlation, and automation capabilities.
Participate in incident response activities, root cause analysis sessions, and readiness reviews.
Drive continuous improvement initiatives across reliability engineering and service monitoring.

Qualifications

Clearance Requirement

Must hold an active Secret clearance, supported by a Tier 3 background investigation.

Education (Required)

Bachelor’s degree in Information Technology, Cybersecurity, Systems Engineering, or a related technical field.

Experience (Required)

Minimum 10 years of experience in service reliability, monitoring engineering, IT operations, or systems engineering.
Experience designing or managing enterprise monitoring systems and dashboards.
Experience defining SLAs, KPIs, and operational performance measurements.
Experience collaborating with Tier 2 and Tier 3 teams for incident management and problem resolution.
Experience with log analysis, event correlation, and observability platforms.

Technical Knowledge (Required)

Strong understanding of monitoring and observability tools (metrics, logs, traces).
Knowledge of uptime, performance, and reliability engineering practices.
Familiarity with ITIL v4 processes for incident, problem, and change management.
Understanding of alerting strategies, threshold design, and escalation workflows.
Knowledge of DoD or federal IT operational environments.

Technical Knowledge (Preferred)

Experience with cloud-native monitoring services and distributed systems monitoring.
Experience with APM tools, SIEM integrations, or event correlation engines.
Familiarity with automation scripting or analytics for monitoring enhancement.

Certifications

Required:
ITIL v4 Foundation
CompTIA Security+

Preferred:
Cloud monitoring certifications (AWS, Azure, or similar)
SRE or observability-related certifications

Skills

Strong analytical skills for interpreting system health and service reliability data.
Excellent communication and reporting skills for executive and technical audiences.
Ability to lead cross-functional coordination during performance events and incidents.
High attention to detail with strong documentation habits.
Ability to drive continuous improvement across monitoring, reliability, and availability functions.

Our Equal Employment Opportunity Policy

EEO and Affirmative Action Policy: DecisionPoint Corporation is an Equal Employment Opportunity and Affirmative Action employer. It is the policy of DecisionPoint Corporation to provide equal employment opportunity in accordance with all applicable Equal Employment Opportunity/Affirmative Action laws, directives and regulations to all employees and qualified applicants without regard to race, ethnicity, color, religion, national origin, sex, age, disability status, pregnancy, sexual orientation, gender identity, genetic information, protected veteran status, or any other protected status under Federal, State or Local laws.

Pay Transparency Policy: In accordance with Presidential Executive Order 13665, DecisionPoint Corporation will not discharge or in any other manner discriminate against employees or applicants because they have inquired about, discussed, or disclosed their own pay or the pay of another employee or applicant. However, employees who have access to the compensation information of other employees or applicants as a part of their essential job functions cannot disclose the pay of other employees or applicants to individuals who do not otherwise have access to compensation information, unless the disclosure is (a) in response to a formal complaint or charge, (b) in furtherance of an investigation, proceeding, hearing, or action, including an investigation conducted by the employer, or (c) consistent with the contractor's legal duty to furnish information.

Authorization to Share Resume and Personal Information: By expressing your interest and submitting your resume for this position, you authorize DecisionPoint Corporation to share your resume, as well as personal information included on the resume, with its subsidiaries, affiliates and teaming partners for the purpose of considering you for this position and other available positions requiring comparable skills, education and experience. Should DecisionPoint Corporation or its affiliates and teaming partners wish to initiate pre-employment discussions, you will be asked to complete an employment application and related employment documents.

This job posting was last updated on 11/22/2025
