ExecutivePlacements.com

via LinkedIn

Senior Observability Platform Engineer

Washington, DC
full-time
Posted 11/24/2025
Verified Source
Key Skills:
Prometheus
Alertmanager
OpenTelemetry
Grafana
Kubernetes
Go
Python
Shell scripting
Infrastructure-as-code
Terraform
ELK stack
Zabbix

Compensation

Salary Range

$120K - $200K a year

Responsibilities

Design, develop, implement, and maintain a comprehensive observability stack and monitoring systems for cloud and data center environments.

Requirements

Bachelor's degree in computer science or a related field, plus proven experience with observability tools, Kubernetes, scripting languages, and infrastructure-as-code tools.

Full Description

Job Description

The position offers an exciting opportunity for software engineers passionate about open source software, Linux, Kubernetes, and observability. The monitoring stack will provide comprehensive monitoring across system metrics, database performance, network health, and message queues. It will also oversee applications running on diverse cloud platforms, including Kubernetes and ESXi, as well as on bare-metal servers, virtual machines, and containers in the SS&C Private Cloud.

Responsibilities

• Design, develop, implement, and maintain our comprehensive observability stack, including tracing, telemetry, logging, health monitoring, visualization, and dashboards. You will play a key role in ensuring the reliability, performance, and operational efficiency of our services.
• Design and implement a robust observability framework using composable open source solutions like Prometheus, Alertmanager, OpenTelemetry, Grafana, Alloy, Loki, Promtail, Tempo, Thanos, ELK stack, Zabbix, and similar.
• Develop and maintain health monitoring and alerting systems for our compute platforms, databases, and network infrastructure, as well as Kubernetes-based platforms, including GPU-supported environments.
• Create and manage visualization dashboards to monitor system performance, resource utilization, and operational health.
• Implement scalable, distributed logging and tracing solutions to diagnose, troubleshoot, and resolve system issues effectively.
• Collaborate with development and operations teams to integrate observability practices into the development lifecycle.
• Conduct performance analysis and optimization to ensure system reliability and efficiency.
• Stay updated with the latest trends and technologies in observability and performance monitoring.
• Collaborate with cross-functional teams (Cloud Engineering, Network, and DevOps/Solutions Engineering) to troubleshoot and resolve infrastructure issues.

Preferred Qualifications

• Proven experience in observability, system and network monitoring, and system performance analysis, particularly in a cloud or data center environment.
• Expertise in implementing and managing observability tools and technologies, including composable open source solutions like Prometheus, Alertmanager, OpenTelemetry, Grafana, Alloy, Loki, Promtail, Tempo, Thanos, ELK stack, and Zabbix, and similar commercial solutions.
• Hands-on experience with Kubernetes.
• Experience with infrastructure-as-code and configuration management tools such as Consul, GitHub, Salt Stack, Terraform, etc.
• Proficiency in scripting and automation using languages such as Go, Python, and Shell.
• Excellent problem-solving skills and the ability to work independently or as part of a team.
• Strong communication skills and the ability to work in a fast-paced, dynamic environment.

Educational Qualifications

• Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.

This job posting was last updated on 11/24/2025
