$90K - $130K a year
Design, develop, and improve AI retrieval, ranking, and agentic reasoning systems to enhance the personalized AI teammate's performance.
Strong OOP programming in Python, TypeScript, or C++; deep understanding of information retrieval and RAG pipelines; experience with vector databases and re-ranking models; data-driven experimentation mindset.
Role Overview
We're seeking an Applied AI Engineer to join our fast-paced team to design, develop, and implement new features for Littlebird, our personalized AI teammate for Mac and Android. We're building personal AI: a thought partner that connects dots in your work and helps you think. We're a small, async-first team that values craft and ownership.

The Role
This role is about making our agent smarter, faster, and more reliable. You will live in the core of our AI, obsessing over the quality of our retrieval, the precision of our ranking, and the logic of our agent.

Some of the hard problems you'll solve:
- Master the Art of Retrieval & Re-ranking: Our current hybrid search pipeline (BM25 + vector + RRF + re-ranker) is functional but has clear limitations. You will own its evolution.
- Solve the "Broad Query" Problem: How do you make retrieval work just as well for "what did I do last week?" as it does for a specific, targeted question? This involves query analysis, decomposition, and potentially multiple retrieval strategies.
- Optimize the Ranking Stack: You'll experiment with and productionize new re-ranking models to crush our latency bottlenecks. You'll fine-tune our Reciprocal Rank Fusion (RRF) strategy to better blend sparse and dense retrieval signals.
- Develop Intelligent Pruning: How do you shrink the context passed to the LLM by 80% without losing the critical 1% of information that leads to the right answer? You'll design and test sophisticated context pruning and summarization techniques.
- Engineer Better Agentic Reasoning: Our agent uses a multi-step, tool-calling approach to solve problems. Debugging and improving it is a core challenge.
- Context Engineering: You'll become an expert in "prompt-level" performance, figuring out the optimal way to structure and present context to the LLM to minimize hallucinations and improve reasoning.
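For candidates unfamiliar with the fusion step named above, here is a minimal sketch of Reciprocal Rank Fusion: blending a sparse (BM25) and a dense (vector) ranking into one list. The document IDs and the conventional k=60 constant are illustrative, not Littlebird's actual pipeline.

```python
def rrf_fuse(ranked_lists, k=60):
    """Blend several ranked lists of doc IDs into one fused ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears
    in; the constant k damps the influence of any single retriever's
    top ranks.
    """
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# A doc ranked well by *both* retrievers (d2, d3) fuses ahead of a doc
# that tops only one list (d1).
bm25_hits = ["d1", "d2", "d3"]
dense_hits = ["d3", "d2", "d4"]
print(rrf_fuse([bm25_hits, dense_hits]))  # -> ['d3', 'd2', 'd1', 'd4']
```

Tuning the fusion strategy here mostly means adjusting k and how many candidates each retriever contributes before fusion.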
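The pruning problem above can be stated concretely: given scored chunks and a token budget, keep only the highest-value context. This is a hedged sketch under stated assumptions; the relevance scores stand in for a real re-ranker, and whitespace splitting stands in for a real tokenizer.

```python
def prune_context(chunks, budget_tokens):
    """Greedy budget-based pruning.

    chunks: list of (score, text) pairs, e.g. from a re-ranker.
    Returns the texts kept under the budget, highest-scoring first.
    """
    kept, used = [], 0
    for score, text in sorted(chunks, key=lambda c: c[0], reverse=True):
        n = len(text.split())  # crude token estimate for illustration
        if used + n <= budget_tokens:
            kept.append(text)
            used += n
    return kept

# Hypothetical retrieved chunks with re-ranker scores.
chunks = [
    (0.91, "Design review notes from Tuesday"),
    (0.40, "Weekly all-hands agenda"),
    (0.88, "Thread about the launch checklist"),
]
print(prune_context(chunks, budget_tokens=10))
```

The greedy cutoff is the simplest baseline; the role is about beating it with smarter pruning and summarization while preserving the few tokens that actually decide the answer.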
- Debugging Complex Agentic Flows: You'll be a detective, tracing the root cause of agent failures through layers of tool calls, context retrieval, and LLM responses to understand where things went wrong and how to fix them.

What we're looking for:
- Strong OOP programming skills and prior experience writing production software in Python, TypeScript, or C++.
- A deep, intuitive understanding of information retrieval and modern RAG pipelines. You've likely built a few from scratch.
- Hands-on experience with vector databases, hybrid search, and re-ranking models.
- An experimental, data-driven mindset. You're comfortable running A/B tests, analyzing metrics, and iterating quickly to improve performance.
- A pragmatic approach. You're focused on shipping tangible improvements to the AI's quality, not just chasing SOTA benchmarks.

Benefits
- Remote-friendly work environment
- Collaborative team culture
- Opportunity to shape infrastructure decisions
- Competitive compensation packages including stock and health benefits, paid time off, and parental leave
- Flexible working hours across multiple time zones
This job posting was last updated on 11/26/2025