via Lensa
Salary: Not specified
Transform enterprise data into AI-ready knowledge assets and manage knowledge pipelines for AI consumption.
Requires experience with Databricks, Python, advanced SQL, vector databases, AI knowledge engineering, and data governance.
Role: Knowledge Engineer - Enterprise AI & Agentic Systems
Duration: Long Term
Location: Remote, with occasional visits if required (New Jersey)

Role Summary
The Knowledge Engineer owns the transformation of raw enterprise data (structured and unstructured) into governed, reusable, AI-ready knowledge assets that power enterprise-grade GenAI and Agentic AI solutions. This role bridges data engineering, semantic modeling, data governance, and GenAI readiness, ensuring that enterprise agents are built on trusted, secure, governed knowledge foundations. The Knowledge Engineer thinks in terms of how AI consumes knowledge, optimizing for retrieval, reasoning, and agent orchestration, not just storage and transformation.

Key Responsibilities

Enterprise Knowledge Foundation
• Convert structured sources (SQL, Delta tables) and unstructured repositories (NetDocuments, PDFs, contracts, emails) into clean, enriched, AI-consumable knowledge assets
• Design and implement semantic layers, metadata enrichment frameworks, ingestion pipelines, and embedding pipelines that serve LLM and Agentic AI consumption patterns
• Build reusable knowledge abstractions (entity models, ontologies, knowledge graphs) that scale across multiple AI use cases

Databricks-Centric Knowledge Engineering
• Build and manage end-to-end knowledge pipelines using Databricks
• Implement the full pipeline lifecycle: ingestion → transformation → chunking → embedding → indexing → retrieval
• Produce AI-consumption-ready knowledge layers with required quality guardrails

Governance, Catalog & Lineage
• Define and implement enterprise Data & AI governance for knowledge assets: data classification, RBAC/ABAC access controls, PII detection, tagging, masking, and lineage tracking
• Ensure all AI knowledge assets meet legal, regulatory, and internal security standards before reaching AI systems

Knowledge Architecture for Agentic AI
• Design knowledge structures optimized for RAG pipelines and AI agents
• Implement retrieval optimization strategies, including hybrid search (vector + keyword + metadata filtering)
• Build reusable entity relationships and context orchestration patterns that agents can reliably invoke at runtime

AI Platform Collaboration & Enablement
• Partner with AI Platform Engineers to expose governed, low-latency knowledge endpoints consumable by agent frameworks and MCP servers
• Contribute to prompt-context design by advising on knowledge structure, chunking strategies, and retrieval quality
• Act as SME on knowledge quality, helping AI teams debug retrieval failures and hallucination sources

Required Skills & Experience

Data & Knowledge Engineering
• Strong hands-on experience with Databricks
• Advanced SQL and dimensional/semantic data modeling
• Python proficiency for pipeline development, transformation logic, and tooling automation
• Experience managing unstructured document repositories (such as NetDocuments)
• Proficiency with vector database setup, configuration, and optimization (e.g., Pinecone, Weaviate, pgvector, Azure AI Search)

AI & Semantic Layer
• Deep knowledge of RAG architecture, including chunking strategies, embedding model selection, and retrieval evaluation
• Hands-on experience with LLM orchestration frameworks such as LangChain, LlamaIndex, and Microsoft Agent Framework
• Familiarity with embedding optimization techniques and context management for production AI workloads

Governance & Security
• Experience implementing data lineage tracking
• Hands-on experience with data masking, classification pipelines, and PII handling at scale
• Enterprise IAM integration: applying RBAC/ABAC models to data and AI asset access
This job posting was last updated on 3/5/2026