via Lensa
$120K - 160K a year
Lead end-to-end design and implementation of scalable, secure Generative AI solutions.
Expertise in AI frameworks, model providers, vector systems, serving infrastructure, LLMOps, and safety tools.
Are you passionate about building cutting-edge Generative AI solutions that are scalable, secure, and enterprise-ready? We're looking for a Generative AI Solution Architect to lead end-to-end AI solution design - from business discovery through architecture, implementation, and ongoing operations. In this role, you'll work across engineering, data, product, and security teams to build, scale, and govern AI systems. If you thrive at the intersection of LLM architectures, RAG systems, agentic workflows, and enterprise integration, this is the opportunity for you.

What You'll Do

End-to-End AI Architecture
• Translate business needs into scalable, secure, cost-effective AI architectures.
• Own solution design from ideation to deployment: data strategy, model selection, workflow design, and operational readiness.
• Create architecture diagrams, sequence flows, deployment topologies, and technical design documentation.

RAG, Agents & Model Engineering
• Architect Retrieval-Augmented Generation (RAG) systems: ingestion, chunking, embeddings, hybrid search, re-ranking, caching, and information freshness.
• Design and implement agentic systems using tool/function calling, planner-executor, and multi-agent patterns.
• Define and implement advanced prompt architecture (ReAct, CoT alternatives, structured output, routing, template versioning).
• Evaluate and integrate both closed and open-weight LLMs with routing, fallback, and cost optimization.
• Develop fine-tuning strategies (LoRA/QLoRA, PEFT, adapters) and define evaluation criteria.

Performance, Serving & Integration
• Architect low-latency, high-throughput serving patterns using batching, caching, speculative decoding, quantization, and efficient GPU/CPU routing.
• Design APIs, SDKs, and reusable platform components for cross-team AI adoption.
• Integrate GenAI systems with enterprise identity, authorization, data platforms, event streams, and end-user applications.
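To make the RAG responsibilities above concrete, here is a minimal, self-contained sketch of the chunk-embed-retrieve loop. It is illustrative only: the bag-of-words "embedding" and cosine ranking stand in for a real embedding model and vector index, and all function names (`chunk`, `embed`, `retrieve`) are hypothetical, not part of any framework named in this posting.

```python
# Minimal RAG retrieval sketch: chunk documents, "embed" them with a toy
# bag-of-words vectorizer (a placeholder for a real embedding model),
# and rank chunks by cosine similarity to the query.
import math
from collections import Counter

def chunk(text: str, size: int = 8) -> list[str]:
    """Split text into overlapping word-window chunks."""
    words = text.split()
    step = max(1, size // 2)  # 50% overlap preserves context across boundaries
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the top-k chunks across all docs, ranked by similarity."""
    chunks = [c for d in docs for c in chunk(d)]
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

docs = [
    "vector search uses embeddings to find semantically similar passages",
    "agents call tools in a planner executor loop to complete tasks",
]
top = retrieve("how does vector search find similar passages", docs)
```

In a production system, each piece here would be swapped for the real components listed under Preferred Technical Skills: a vector store for the chunk index, a hosted embedding model, and a re-ranker over the initial candidates.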
LLMOps, Governance & Safety
• Establish LLMOps practices for prompt versioning, dataset management, experiment tracking, evaluation pipelines, and regression testing.
• Implement observability for quality, hallucination detection, safety violations, latency, throughput, and cost.
• Design safety and governance guardrails: PII redaction, content filtering, jailbreak defense, and audit logging.
• Conduct risk assessments related to compliance, security, vendor lock-in, performance, and operational stability.

Leadership & Standards
• Lead technical discovery workshops and architecture reviews.
• Mentor engineers on GenAI patterns, design tradeoffs, and best practices.
• Maintain reference architectures, ADRs, design standards, and reusable blueprints.
• Continuously drive improvements in reliability, scalability, and cost efficiency across AI solutions in production.

Preferred Technical Skills
• GenAI Frameworks: LangChain, LangGraph
• Model Providers: Azure OpenAI, Anthropic, Google, AWS Bedrock, Snowflake Cortex AI
• Vector & Retrieval Systems: OpenSearch, Pinecone, pgvector, Weaviate, Neo4j, Neptune
• Serving Infrastructure: vLLM, TGI, Ray Serve
• LLMOps: Dataiku, MLflow, LangSmith, Weights & Biases
• Safety Tools: Azure AI Content Safety, Guardrails, NeMo Guardrails
• Agent Platforms: LangGraph, AutoGen, CrewAI
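As one small illustration of the "PII redaction ... and audit logging" guardrail work described above, here is a regex-based sketch. The patterns and event fields are illustrative assumptions, not a production ruleset or any vendor's API; real deployments would use the safety tools listed (e.g., Azure AI Content Safety or NeMo Guardrails).

```python
# Minimal guardrail sketch: redact common PII patterns from text before it
# reaches (or leaves) a model, and record an audit event per pattern kind.
# The regexes below are deliberately simple and illustrative only.
import re
from datetime import datetime, timezone

PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text: str) -> tuple[str, list[dict]]:
    """Replace PII matches with [REDACTED:<kind>] and return audit events."""
    events = []
    for kind, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            events.append({
                "kind": kind,
                "count": len(matches),
                "at": datetime.now(timezone.utc).isoformat(),
            })
            text = pattern.sub(f"[REDACTED:{kind}]", text)
    return text, events

clean, audit = redact("Contact jane.doe@example.com or 555-123-4567.")
```

The audit trail (what was redacted, how often, and when - never the raw values) is what feeds the observability and compliance reviews described in this section.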
This job posting was last updated on 3/6/2026