$8,000 - $11,000 a year
The Role:
You will work directly with the founders on advancing the product roadmap and AgentHub’s core evaluation and simulation capabilities. You’ll have significant scope and will keep up to date with state-of-the-art methodologies and techniques across areas like agent evaluation, data generation, and RL - translating these into real features in the hands of real users.

What you will do:
- Design and build the core methodologies and components for evaluating agents across axes like instruction following, safety, groundedness, and efficiency
- Research, experiment with, and implement robust data generation capabilities
- Tie in the latest advancements in research and productionize them to unlock value for our customers

Signs you might thrive in this role:
- Working towards a Bachelor’s/Master’s/PhD in Computer Science or a related field
- Passionate about building category-defining products
- Previous background and experience in reinforcement learning
- Demonstrated experience in model/agent evaluation is a major plus
- Opinionated, perpetually curious, and love having scope over a problem and delivering on it
This job posting was last updated on 10/2/2025