$120K - $180K a year
Design and optimize PPO reinforcement learning algorithms and integrate them into production AI systems for air traffic radar automation.
4+ years in machine learning with PPO expertise, strong Python programming, experience with deep learning frameworks such as TensorFlow or PyTorch, and a relevant CS or AI degree.
About Us

Statheros is a small DEFTECH firm focused on developing cutting-edge AI and autonomy systems for the US Department of Defense. Our team is passionate about building intelligent systems that solve complex problems. We are looking for a talented AI Engineer specializing in Proximal Policy Optimization (PPO) to lead the development of AI-enabled algorithms that automate the operation of air traffic radar systems.

Job Responsibilities

• Design, implement, and optimize Proximal Policy Optimization (PPO) algorithms for domain-specific use cases.
• Develop and train reinforcement learning models for real-world applications, focusing on efficiency and scalability.
• Collaborate with cross-functional teams to integrate PPO models into production systems.
• Analyze model performance and experiment with hyperparameter tuning to achieve optimal results.
• Stay up-to-date with the latest research and advancements in reinforcement learning and apply them to enhance existing solutions.
• Build robust pipelines for training, evaluation, and deployment of RL models.
• Document workflows, methodologies, and code for reproducibility and knowledge sharing.

Qualifications

• Educational Background: Bachelor's or Master's degree in Computer Science, Machine Learning, AI, Mathematics, or related fields. Ph.D. is a plus.
• Experience:
  • 4+ years of professional experience in machine learning, with a focus on reinforcement learning.
  • Demonstrated expertise in implementing and optimizing PPO or similar reinforcement learning algorithms.
  • Hands-on experience with frameworks like TensorFlow, PyTorch, or JAX.
• Technical Skills:
  • Strong programming skills in Python; familiarity with Rust or other languages is a plus.
  • Proficiency in designing and running RL experiments in simulated or real-world environments.
  • Experience with distributed training systems for reinforcement learning.
  • Solid understanding of policy gradient methods and reinforcement learning theory.
• Soft Skills:
  • Excellent problem-solving skills and the ability to work in a collaborative, fast-paced environment.
  • Strong communication skills for presenting findings and collaborating with interdisciplinary teams.

Preferred Qualifications

• Experience in applying PPO to [specific domain, e.g., robotics, gaming, finance, etc.].
• Familiarity with OpenAI Gym, RLlib, or other RL development environments.
• Knowledge of parallel computing and GPU acceleration for large-scale RL tasks.

What We Offer

• Remote work location.
• Competitive salary.
• Flexible work schedule.
• Opportunities for professional development and research contributions.
• Access to state-of-the-art resources and tools for AI development.
• The chance to work on groundbreaking projects with a talented and passionate team.
This job posting was last updated on 8/28/2025