$80K - $140K a year
Design and operate scalable data and clean room architectures with AI/ML integration and partner onboarding.
Requires 5+ years data engineering with Snowflake/Databricks, advanced SQL/Python, ELT pipelines, clean room knowledge, and AI/LLM experience.
Job Description:
• Define partner onboarding and clean room architecture patterns across Snowflake, LiveRamp, and Databricks that are secure, scalable, and repeatable.
• Configure and manage partner-specific clean room environments; deploy and manage Python-based libraries within the platform ecosystem.
• Establish and maintain MLOps practices, including model serving, monitoring, and pipeline orchestration for AI/ML features deployed within the platform ecosystem.
• Own design and enforcement of granular RBAC policies and least-privilege service accounts.
• Serve as the technical lead for onboarding new partners, implementing privacy-preserving controls (e.g., aggregation thresholds and anonymization techniques).
• Design, build, and operate scalable ELT pipelines using Snowpark and/or PySpark and advanced SQL to provision Gold datasets.
• Implement and evolve identity resolution logic mapping internal data to 3P identifiers (including LUIDs, RampIDs, and TransUnion IDs), ensuring privacy-safe practices.
• Design and operate scalable data architectures across Snowflake and Databricks supporting batch and near-real-time processing patterns.
• Build robust automated checks (e.g., Great Expectations or custom SQL assertions) and define quality standards to detect schema drift, null-rate spikes, and volume anomalies.
• Lead performance optimization across platforms (query tuning, caching, incremental processing), and define and implement query tagging and chargeback models for accurate cost attribution.
• Establish monitoring, alerting, runbooks, and standard operating procedures to improve platform reliability and reduce incident time-to-resolution.
• Validate that output data adheres to privacy and business requirements, and define test strategies for partner-facing releases.
• Serve as the escalation point for diagnosing connection failures, data discrepancies, or latency issues with partner technical teams.
• Design and build internal AI agents (using frameworks like LangChain and Snowflake Cortex) and mentor other engineers through code reviews, design discussions, and operational best practices.

Requirements:
• Bachelor's degree or higher in Computer Science, Information Systems, Software, Electrical, or Electronics Engineering.
• 5+ years of data engineering experience, with deep proficiency in advanced SQL and Python.
• 3+ years of hands-on experience with cloud data platforms, specifically Snowflake or Databricks.
• Proven experience building and operating scalable ELT pipelines using orchestration tools (e.g., Airflow, dbt).
• Strong track record designing production-grade systems (observability, reliability, performance tuning, incident response).
• Clean room knowledge: exposure to data clean room concepts and clean room platforms such as LiveRamp, Snowflake, or Databricks.
• AI/LLM experience: experience building applications with LLMs, RAG, vector databases, or frameworks like LangChain/LlamaIndex.
• Ability to mentor other engineers through code reviews, design discussions, and operational best practices.

Benefits:
• Medical, dental, and vision insurance
• 401(k)
• Paid leave
• Tuition reimbursement
• A variety of other discounts and perks
This job posting was last updated on 3/10/2026