$130K - $170K a year
Lead DevOps/SRE efforts to build and maintain secure, compliant, and automated cloud-native infrastructure and CI/CD pipelines for federal AI/ML workloads.
7+ years DevOps/SRE experience with Kubernetes, Terraform, CI/CD pipelines, security compliance in regulated environments, and ability to obtain Public Trust clearance.
We are Aretec, a leading provider of innovative technology solutions for federal agencies, specializing in cybersecurity, data analytics, and insider threat detection. As a trusted partner, we work closely with our clients to develop and implement cutting-edge strategies that safeguard sensitive information and protect national security interests. Our team of highly skilled professionals is committed to delivering exceptional results while fostering a collaborative and inclusive work environment.

You Are
An experienced DevOps/SRE leader with deep expertise in cloud-native infrastructure, CI/CD at scale, and secure operations in regulated environments. You bring 7+ years of hands-on DevOps experience and a track record of building reliable, compliant platforms using Infrastructure as Code and Kubernetes. You excel at automating everything from provisioning and deployments to observability and incident response, and you collaborate seamlessly with security, data, and application teams. This role is a unique opportunity to deploy AI capabilities at a federal agency, enabling model-powered services with rigorous reliability, performance, and compliance. Your ability to obtain and maintain a Public Trust clearance makes you an ideal candidate for this impactful mission.

The Skills
• BA/BS degree (or equivalent experience) and 7+ years of DevOps/SRE experience in production, preferably in federal or other regulated environments
• Expert with Infrastructure as Code (Terraform), Git-based workflows, and policy-as-code guardrails
• Strong hands-on experience with Kubernetes (cluster operations, autoscaling, service mesh, ingress, secrets, node tuning) and containerization best practices
• Proven ownership of CI/CD (e.g., GitHub Actions, GitLab CI, Jenkins): multi-env pipelines, canary/blue-green, artifact registries, SBOMs, and supply-chain security
• Observability: metrics, logs, traces (OpenTelemetry), alerting/SLIs/SLOs, on-call readiness, and incident runbooks
• Security & Compliance: hardening, STIG/CIS alignment, vulnerability scanning (containers/IaC), zero-trust networking, and audit evidence automation (NIST 800-53/FISMA/FedRAMP context)
• Experience supporting AI/ML workloads (e.g., GPU scheduling, model serving endpoints, batch/stream pipelines) a strong plus
• Excellent cross-functional communication, stakeholder management, and documentation skills
• Ability to obtain and maintain a Public Trust clearance

The Expectations

30 Days
• Take ownership of IaC and CI/CD repositories; standardize branching, code review, and release practices
• Baseline production and staging Kubernetes environments; validate RBAC, secrets management, and network policies
• Partner with security and application teams to map controls and compliance requirements to pipelines and runtime guardrails
• Establish initial observability dashboards and alert thresholds aligned to SLIs/SLOs

60 Days
• Implement automated environment provisioning (Terraform modules), immutable images, and progressive delivery (blue-green/canary)
• Stand up end-to-end build/test/deploy workflows with supply-chain checks (lint, unit/e2e tests, SAST/DAST, IaC scans, SBOM attestation)
• Harden platform per CIS/STIG benchmarks; document evidence and remediation paths for audits
• Enable secure lanes for AI deployments (GPU pools, model registry integration, inference scaling, performance baselines)

90 Days
• Drive production excellence: optimize cost, reliability, and latency; formalize error budgets and incident response
• Build reusable golden pipelines and Terraform modules for rapid onboarding of new services (including AI microservices)
• Mentor engineers; publish runbooks, architecture diagrams, and a living Platform Operations Guide
• Present strategic recommendations to government leadership to enhance automation, security posture, and AI platform adoption

Benefits
At Aretec, we support our employees' personal and professional growth. We offer a comprehensive benefits package, including:
• Competitive salaries and performance-based bonuses
• Generous paid time off and holidays
• Comprehensive health, dental, and vision insurance
• 401(k) plan with employer matching
• Professional technical certification opportunities
• Flexible work arrangements, when possible

This is a remote position, but the schedule is based on the Eastern Time Zone.

Equal Opportunity Employer
Aretec is an Equal Opportunity Employer and does not discriminate based on race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, disability, or any other legally protected status. All qualified applicants will receive consideration for employment without regard to their protected veteran status and will not be discriminated against based on disability.

Due to government regulations, we are only able to consider applicants who are sole U.S. citizens for employment opportunities within this specific agency. This position requires the candidate to obtain and maintain a Public Trust clearance. Dual citizenship is not permitted.

This job posting was last updated on 10/11/2025