via Wellfound
$120K - $180K a year
Build and own the core Retrieval-Augmented Generation platform including frontend, backend, document processing, and future agent features.
4+ years full-stack experience with React/Next.js frontend, Node.js/Python backend, production APIs, SQL databases, cloud deployment, and exposure to LLM-based or search/RAG systems.
Location: Remote (US-preferred)
Company: Bucket Labs / Mimir (Newmimir.com)

About Us
Bucket Labs is building Mimir, a private, high-accuracy Retrieval-Augmented Generation (RAG) platform that makes complex business data searchable, usable, and intelligent. Think: “private AI for your files and data,” built for teams that care about control, security, and cost, not just another wrapper around a hyperscaler API. We work with sensitive documents and high-stakes decisions, so we care about security, reliability, and quality just as much as we care about speed. We’re a small, sharp, execution-focused team. If you like owning problems end-to-end, moving fast without breaking trust, and shipping things real users touch every day, you’ll fit right in.

What We’re Building
Our first priority is to build a best-in-class RAG workspace (NotebookLM-level and beyond):
● Upload and organize documents across folders / workspaces
● High-accuracy, source-grounded answers with citations
● Strong retrieval quality, good UX for multi-document reasoning
● Built to be cost-efficient and cloud-resilient, with smart use of open-source components
On top of that, we’ll layer agentic workflows once the core RAG is excellent, starting with:
● Agents that use the RAG stack to work across email, CRM, and internal tools
● Assistants that understand your organization’s documents + data and can act on them

What You’ll Own
You’ll build the core RAG product and lay the groundwork for future agents.
RAG Platform (Top Priority)
● Build core product experiences across the stack:
  ○ User-facing application (React/Next.js)
  ○ APIs, data models, and retrieval logic on the backend
● Design and implement features for:
  ○ Document upload, parsing, and search
  ○ Chat over documents with citations, highlighting, and good UX
  ○ Document organization, folders, shared folders, and saved sessions
  ○ User / team management and permissions
  ○ Internet search for non-database-related queries
  ○ On-device application (for advanced security and parsing)
● Improve retrieval quality:
  ○ Chunking strategies, indexing, and ranking
  ○ Evaluations to measure and iterate on RAG performance

Future: Agents on Top of RAG
● Once the RAG foundation is strong, help us build agents on top of it for:
  ○ Email (read threads, fetch relevant docs, draft responses)
  ○ CRM and internal systems (summaries, updates, task creation, etc.)
● Design APIs and data models so the RAG layer is a clean platform our agents can call into.

Foundations You Bring
Open-Source & Cost-Efficient AI Stack
● Familiarity with open-source alternatives to managed services like Textract and proprietary “knowledge base” tools (e.g., OCR, document parsing, embedding models, vector stores).
● Understanding of strategies to reduce hyperscaler dependency and improve cost efficiency, such as:
  ○ When and how to self-host open-source components
  ○ Designing modular abstractions to enable provider/model flexibility
● Exposure to OSS tools and libraries (OCR, NLP, vector DBs, RAG frameworks) and how they integrate into modern AI stacks.
Reliability, Scale & Engineering Culture
● Awareness of production quality principles: observability, logging, basic SLOs, and post-incident learning.
● Understanding of scaling strategies for Postgres/SQL, including future-facing concepts like sharding and multi-tenancy.
● Appreciation for performance tuning and cost-conscious infrastructure decisions.
● Familiarity with engineering best practices: code reviews, testing, documentation, and developer experience as a team scales.
● Familiarity with knowledge graphs

Tech Stack (Current & Planned)
You don’t need all of these, but strong overlap is ideal:
● Frontend: React, Next.js, TypeScript, Tailwind or similar
● Backend: Node.js and/or Python (FastAPI/Express/Nest), REST/GraphQL APIs
● Data & Storage: Postgres (or MySQL), Redis, object storage (S3 or equivalent)
● Infra: AWS (Lambda, ECS/Fargate, S3, API Gateway, CloudWatch, etc.) or similar cloud
● AI / RAG (Must Have):
  ○ Vector databases (pgvector, Weaviate, Milvus, Pinecone, etc.)
  ○ Embeddings & LLM APIs (Bedrock, OpenAI-style, local models)
  ○ Open-source OCR / document parsing / RAG frameworks (e.g., Tesseract, LayoutParser, Haystack, LangChain/LlamaIndex, PyMuPDF, etc.)

You Might Be a Great Fit If...
● You’ve owned significant product features end-to-end at a startup or fast-moving team (from frontend through backend into prod).
● You’ve built or contributed to search / RAG / document intelligence systems in production, not just quick demos.
● You’re comfortable going from “vague idea” → structured design → shipped feature with minimal hand-holding.
● You write clean, well-structured code and care about naming, tests, and maintainability, not just “making it work.”
● You think in systems and tradeoffs, and can explain your decisions in plain English (performance vs. cost, OSS vs. managed, etc.).
● You’re excited by the idea of replacing expensive managed AI services with open-source, well-architected solutions.
● You’re hungry to be early at something big: you want real impact, not just tickets in a backlog.

Requirements
● ~4+ years of professional experience as a full-stack engineer (Exceptions will be made with).
● Strong experience with modern frontend (React/Next.js).
● Strong experience with backend (Node.js and/or Python) building production APIs and services.
● Solid grounding in web fundamentals: HTTP, APIs, auth, security basics, performance.
● Comfortable with SQL and designing data models that don’t fall apart on v2; experience with Postgres or similar.
● Hands-on experience deploying to AWS/GCP/Azure or a similar cloud environment.
● Prior exposure to LLM-based or search/RAG systems (can be at work or serious side projects).
● Excellent communication skills; can work async and still keep everyone in the loop.

Nice-to-Haves
Experience with:
● Open-source AI tooling (OCR, document parsers, embedding models, vector DBs) and integrating them in production.
● RAG in production: evaluation strategies, retrieval tuning, long-context handling.
● Reducing hyperscaler dependence: swapping out managed services for self-hosted or OSS alternatives without breaking the product.
● Multi-tenant SaaS architecture, role-based access control, and enterprise security.
● Scaling relational databases (Postgres/SQL), including partitioning/sharding or planning for it.
● Prior founding engineer / early-stage startup experience.

What We Offer
● A founding-team level seat: meaningful influence on product, architecture, and culture.
● Competitive salary + equity upside.
● Direct access to founders, advisors, and customers, no layers of bureaucracy.
● A culture that values ownership, entrepreneurship, and high standards at the same time.
● Remote flexibility with the option to connect in person.

How to Apply
Send us your GitHub / portfolio, LinkedIn, and a short note about:
• A RAG / search / document-intelligence system or feature you owned end-to-end (what it did, the stack, and what you’re proud of).
• A time you replaced or avoided an expensive managed service using open-source or self-hosted components, and what tradeoffs you made.
• US-based and eligible to complete a W-9 (legal authorization to work in the United States required)

Email: Cameron@bucketlabs.ai
Subject: “Senior Full-Stack - Mimir”
This job posting was last updated on 12/1/2025