General Dynamics Information Technology

via Workday

SYSTEMS ENGINEER SR PRINCIPAL (HPC/AI System Administrator, Storage Engineer, Monitoring Expert, Solution Architect, Security/Provisioning Engineer, or Multi-discipline Expert)

Anywhere
Full-time
Posted 12/5/2025
Verified Source
Key Skills:
Linux system administration
Python scripting
Perl scripting
Bash scripting
Networking (Ethernet, InfiniBand, Slingshot)
TCP/IP networking
PBSpro/SLURM batch systems
System performance monitoring (Grafana, Prometheus)
C/C++ programming
Fortran programming

Compensation

Salary Range

$120K - $180K a year

Responsibilities

Lead and manage day-to-day operations and sustainment of HPC clusters supporting NOAA's weather forecasting systems, ensuring high availability, performance, and customer satisfaction.

Requirements

10+ years of experience with Linux HPC system administration, scripting, networking, batch system administration, and system performance monitoring. US citizenship and security clearance eligibility are required.

Full Description

SYSTEMS ENGINEER SR PRINCIPAL (HPC/AI System Administrator, Storage Engineer, Monitoring Expert, Solution Architect, Security/Provisioning Engineer, or Multi-discipline Expert)

Advance how our customers operate while you advance your career. Join GDIT as a Systems Engineer Sr Principal for High Performance Computing (HPC) and build an impactful career in enterprise IT, collaborating with people who are driven and resourceful like you.

MEANINGFUL WORK AND PERSONAL IMPACT

As a Systems Engineer Sr Principal, the work you’ll do at GDIT will be impactful to the mission of the National Oceanic and Atmospheric Administration (NOAA) National Weather Service (NWS). You will play a crucial role in supporting the full lifecycle sustainment and operational availability of leading-edge High Performance Computing (HPC) clusters that are the key elements of the Weather & Climate Operational Supercomputing System (WCOSS), used 24/7 by the National Centers for Environmental Prediction (NCEP) Central Operations (NCO).

● Lead, manage, and support the day-to-day operations, sustainment, HPC services delivery, and incremental enhancements of two geographically separated HPC clusters that are GDIT contractor owned and contractor operated (COCO) and used exclusively for WCOSS. This position will be essential in maintaining complex HPC service availability and delivery for intricate customer workload processing and output, specifically aligned to forecasting and predictions from the Global Forecast System (GFS) and supporting models.
● Collaborate with the GDIT WCOSS team as a senior-level HPC functional expert, addressing intricate and multifaceted HPC challenges by continuously providing innovative ideas, solutions, and resolutions for customer requests, issues, and efficiency improvements.
● Drive and prioritize resource utilization toward continuously improving customer satisfaction with GDIT's HPC service delivery and exceeding the contract service level metrics of uptime, availability, performance, stability, and on-time product delivery.
● Utilize past experience, team collaboration, system management and troubleshooting applications, and ingenuity to support customer operations while working on systems that range in capacity from 1000-3000+ nodes and hundreds of PB of storage per system.

WHAT YOU’LL NEED TO SUCCEED

Bring your technology expertise and drive for innovation to GDIT. The Systems Engineer Sr Principal must have:

● Education: Bachelor of Arts/Bachelor of Science
● Experience: 10+ years of related experience
● Technical skills: Highly proficient with Linux (Rocky Linux, SLES, etc.); scripting in Python, Perl, or Bash; networking concepts and technology such as Ethernet, InfiniBand, and Slingshot, TCP/IP networking, basic routing, and network services; programming in Python, C/C++, or Fortran; administering PBSpro, SLURM, or other batch systems in an HPC cluster; and system performance monitoring and tuning in an HPC cluster environment (e.g., OpenSearch, Grafana, Prometheus)
● Security clearance level: Must complete a satisfactory background investigation
● US citizenship required
● Role requirements: Expected to perform as an individual SME contributor, functional lead, or project/task leader responsible for work product delivery. Extensive experience in troubleshooting, diagnosing, and repairing hardware failures to component level on servers, and in coordinating with vendors to resolve hardware and software problems. Minimal travel required for onsite work, team collaboration, training, and customer interaction.

GDIT IS YOUR PLACE

At GDIT, the mission is our purpose, and our people are at the center of everything we do.

● Growth: AI-powered career tool that identifies career steps and learning opportunities
● Support: An internal mobility team focused on helping you achieve your career goals
● Rewards: Comprehensive benefits and wellness packages, 401K with company match, and competitive pay and paid time off
● Flexibility: Full-flex work week to own your priorities at work and at home as part of an onsite and distributed remote team with opportunities for in-person collaboration
● Community: Award-winning culture of innovation and a military-friendly workplace

OWN YOUR OPPORTUNITY

Explore an enterprise IT career at GDIT and you’ll find endless opportunities to grow alongside colleagues who share your desire to drive operations forward.

This job posting was last updated on 12/10/2025
