via Remote Rocketship
$90K - $130K a year
Design and optimize scalable data pipelines and event-driven architectures for high-throughput, low-latency systems.
Requires advanced Python skills, experience with big data tools such as Apache Spark, Beam, and Airflow, and knowledge of distributed systems and NoSQL databases.
Job Description:
• Design, build, and optimize scalable data pipelines for batch and real-time processing
• Develop and maintain event-driven architectures for high-throughput systems
• Ensure data reliability, performance, and low-latency processing across distributed environments
• Collaborate with data scientists and application teams to enable analytics and AI use cases
• Implement best practices in performance tuning, monitoring, and cost optimization

Requirements:
• Advanced proficiency in Python for backend and large-scale data processing
• Strong experience building and managing big data pipelines in production environments
• Hands-on expertise with workflow orchestration tools such as Airflow or Google Cloud Composer (a minimal Airflow sketch follows the listing)
• Proven experience in batch and streaming data processing using:
  • Apache Spark
  • Apache Beam (Dataflow) (a minimal Beam sketch follows the listing)
• Experience designing and operating event-driven systems using Pub/Sub
• Strong understanding of distributed systems architecture and scalability patterns
• Experience managing globally distributed, low-latency datasets
• Hands-on experience with NoSQL databases and/or Google Cloud Spanner
• Strong knowledge of system reliability, fault tolerance, and performance optimization

Benefits:
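For readers unfamiliar with the orchestration stack named in the requirements, here is a minimal sketch of the kind of Airflow DAG this role involves. The DAG id, task names, and callables are hypothetical placeholders, not anything specified in the posting.

```python
# A minimal Airflow DAG sketch; all names here are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_events():
    # Placeholder for pulling a batch of raw events from an upstream source.
    print("extracting events")


def load_warehouse():
    # Placeholder for loading transformed records into the warehouse.
    print("loading warehouse")


with DAG(
    dag_id="daily_event_batch",  # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",           # Airflow 2.4+; older versions use schedule_interval
    catchup=False,               # do not backfill missed runs
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=extract_events)
    load = PythonOperator(task_id="load", python_callable=load_warehouse)
    extract >> load  # extract must finish before load starts
```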
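Likewise, a minimal sketch of a streaming Apache Beam pipeline reading from Pub/Sub, as might run on Dataflow. The project/topic path, the 60-second window, and the word-count-style transforms are assumptions for illustration only.

```python
# A minimal streaming Beam sketch; topic path and window size are assumptions.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.transforms.window import FixedWindows


def run():
    # Streaming mode is required for the Pub/Sub source.
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadEvents" >> beam.io.ReadFromPubSub(
                topic="projects/my-project/topics/events"  # hypothetical topic
            )
            | "Decode" >> beam.Map(lambda payload: payload.decode("utf-8"))
            | "Window" >> beam.WindowInto(FixedWindows(60))  # 60-second windows
            | "PairWithOne" >> beam.Map(lambda event: (event, 1))
            | "CountPerEvent" >> beam.CombinePerKey(sum)
            | "Print" >> beam.Map(print)
        )


if __name__ == "__main__":
    run()
```

The same pipeline can be submitted to Dataflow by passing the standard runner and project options (e.g. --runner=DataflowRunner); for local experimentation, Beam's DirectRunner works without any GCP setup beyond Pub/Sub access.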
This job posting was last updated on 2/23/2026