$150,000 - $200,000 a year
About The LLM Data Company

The LLM Data Company (YC X25) provides post-training data and RL environments to foundation model labs and frontier applied AI companies. We have raised $3.6M from Tier 1 VCs and are growing 200%+ month-over-month.

Responsibilities

- Design and implement scalable RL recipes for post-training task-specific models
- Develop modular environments, reward functions, and evaluator scaffolds for internal and customer-facing tasks
- Drive research at the intersection of scalable infra and modern RL frameworks to enable RL-as-a-service
- Drive foundational research to publish open-source environments and training data
- Build data generation and curation pipelines to support frontier post-training
- Collaborate with product teams to deliver a user-friendly interface that lets non-technical users generate data

Qualifications

- Bachelor's or Master's degree in Computer Science or a related field
- Comfort with core tooling (verl, PyTorch, etc.)
- Familiarity with modern post-training techniques (GRPO, etc.)
- Experience with evaluations and reward engineering
- Publications at top venues (ICLR, NeurIPS, ICML, etc.)

Why you should join

- Cutting-edge research: Work on unpublished, novel training environments
- Direct lab exposure: Projects that labs actually use and validate in production
- High autonomy: Wide design space to propose and run experiments with minimal oversight
- Early team member: Join as one of the first 10 people with significant equity upside
This job posting was last updated on 9/8/2025