Prophecy Technologies

via LinkedIn

SRE / Hadoop Admin

Fountain Valley, CA
contractor
Posted 10/13/2025
Key Skills:
Hadoop ecosystem (HDFS, YARN, Hive, Spark, NiFi, Ambari, Iceberg)
Linux system administration (CentOS/Rocky)
Containerization (Docker, Kubernetes)
Automation and Infrastructure as Code (GitLab CI/CD, Python, bash scripting)
Monitoring and observability (Prometheus, Grafana, OpenTelemetry)
Distributed systems and large-scale data platform management
Security controls and compliance
Disaster recovery and data governance

Compensation

Salary Range

$130K - $180K a year

Responsibilities

Manage and optimize a petabyte-scale on-prem Hadoop data platform ensuring high availability, performance, security, and automation.

Requirements

10+ years managing Hadoop infrastructure with strong Linux admin, container orchestration, automation, monitoring, security, and leadership skills.

Full Description

Purpose: Seeking a highly experienced Senior or Lead Platform Engineer/Site Reliability Engineer (SRE)/Hadoop Admin to manage and enhance our petabyte-scale, on-premises data platform. This platform is built using the open-source Hadoop ecosystem. The ideal candidate possesses in-depth technical expertise, a solid understanding of distributed systems, and extensive experience in operating and optimizing large-scale data infrastructures. This role requires a hands-on technical leader who can drive platform innovation, ensure high availability and reliability, and mentor team members in best practices for performance, automation, and resiliency.

Essential Functions:
• Own and operate the end-to-end infrastructure of a large-scale, on-prem Hadoop-based data platform, ensuring high availability and reliability.
• Design, implement, and maintain core platform components, including Hadoop, Hive, Spark, NiFi, Iceberg, ELK, OpenSearch, and Ambari.
• Automate infrastructure management, monitoring, and deployments using CI/CD pipelines (GitLab) and scripting.
• Implement and enforce security controls, access management, and compliance standards.
• Perform system upgrades, patching, performance tuning, and troubleshooting across platform components.

Basic Requirements:
• 10+ years of experience in Platform Engineering, Site Reliability Engineering, or similar roles, with proven success managing large-scale, distributed Hadoop infrastructure.
• Deep expertise in the Hadoop ecosystem, including HDFS, YARN, Hive, Spark, NiFi, Ambari, and Iceberg.
• Strong Linux system administration skills (CentOS/Rocky preferred), including system tuning, performance optimization, and troubleshooting.
• Proficiency in containerization and orchestration using Docker and Kubernetes.
• Solid experience with automation and Infrastructure as Code, leveraging tools like GitLab CI/CD and scripting in Python and bash.
• Practical knowledge of monitoring and observability tools (e.g., Prometheus, Grafana, OpenTelemetry) and understanding of system health, alerting, and telemetry.
• Familiarity with networking concepts, security protocols, and data compliance requirements.
• Experience managing petabyte-scale data platforms and implementing disaster recovery strategies.
• Understanding of data governance, metadata management, and operational best practices.
• Demonstrated ability to lead technical projects, mentor engineers, and collaborate effectively with cross-functional teams.
• Excellent problem-solving, communication, and leadership skills.

This job posting was last updated on 10/14/2025