Cribl

via Remote Rocketship

Staff Software Engineer – Alerting, Observability

Anywhere
full-time
Posted 10/6/2025
Verified Source
Key Skills:
TypeScript
Node.js
PromQL
SQL
React
Prometheus
AlertManager
Grafana
Distributed Systems
CI/CD

Compensation

Salary Range

$120K - $180K a year

Responsibilities

Design and build scalable alerting systems with query-based rules, intelligent routing, and frontend interfaces for monitoring and incident detection.

Requirements

Strong TypeScript/Node.js skills, experience with query languages like PromQL and SQL, alerting systems, time-series databases, React frontend, observability tools, and distributed systems knowledge.

Full Description

Description:
• Design and build sophisticated alerting systems that enable proactive monitoring and incident detection across distributed systems
• Develop query-based alert rules and expressions using PromQL, SQL, and other query languages to surface meaningful insights
• Create intelligent alert routing, deduplication, and correlation mechanisms to reduce noise and improve signal quality
• Build scalable backend services for alert evaluation, notification delivery, and alert management workflows
• Optimize time-series data storage and query performance for high-volume metrics and telemetry data
• Develop intuitive interfaces for alert configuration, visualization, and management using React and modern frontend technologies
• Collaborate with cross-functional teams to understand monitoring requirements and deliver comprehensive alerting solutions
• Mentor and guide engineers on best practices for observability and alerting architecture

Requirements:
• Strong proficiency in TypeScript/Node.js with a proven track record of building production-grade services
• Experience with query languages for metrics and monitoring (PromQL, SQL, or similar) and ability to write complex queries for data analysis
• Hands-on experience building or maintaining alerting systems, including rule evaluation engines and notification pipelines
• Experience with time-series databases and columnar storage systems (ClickHouse experience is a plus)
• Frontend development skills with React and modern JavaScript frameworks for building data visualization and management interfaces
• Strong understanding of distributed systems, data structures, and algorithms
• Experience with observability concepts including metrics, logs, traces, and their correlation
• Ability to work independently with minimal supervision and a track record of learning quickly
• Dedication to writing clean, maintainable, and well-tested code
• Prometheus ecosystem, including AlertManager
• Background in building rule engines or expression evaluation systems
• Experience with notification systems and integrations (PagerDuty, Slack, webhooks, etc.)
• Familiarity with observability tools like Grafana, ELK stack, or similar solutions
• Experience with CI/CD pipelines such as BitBucket, Jenkins, CircleCI, etc.
• Understanding of alert fatigue mitigation strategies and intelligent alerting patterns
• Experience with high cardinality data and performance optimization
• Willingness to speak your mind and share ideas
• Appreciation for humor and a love for goats
• Comfort working remotely

Benefits:
• Health, dental, vision insurance
• Short-term disability and life insurance
• Paid holidays and paid time off
• Fertility treatment benefit
• 401(k)
• Equity
• Eligibility for a discretionary company-wide bonus

This job posting was last updated on 10/11/2025
