Worth AI

via Workable

Principal Data Engineer

Anywhere

full-time

Posted 8/15/2025

Key Skills:

AWS

SQL

Python

Data architecture

Data modeling

Data quality and governance

Streaming pipelines

Orchestration (Airflow/Dagster/Prefect)

dbt

Kafka/Kinesis

Compensation

Salary Range

$150K - $220K a year

Responsibilities

Design and lead company-wide data architecture and scalable data pipelines, ensure data quality and governance, enable analytics and ML, and provide technical leadership.

Requirements

10+ years of data engineering experience, including 3+ years in staff/principal roles or equivalent scope; deep AWS and modern warehouse experience; strong SQL and Python/Scala/Java skills; orchestration and streaming expertise; data modeling; security best practices; and leadership.

Full Description

Worth AI, a leader in the computer software industry, is looking for a talented and experienced Principal Data Engineer to join their innovative team. At Worth AI, we are on a mission to revolutionize decision-making with the power of artificial intelligence while fostering an environment of collaboration and adaptability, aiming to make a meaningful impact in the tech landscape. Our team values include extreme ownership, one team, and creating raving fans among both our employees and customers.

Worth is looking for a Principal Data Engineer to own the company-wide data architecture and platform. In this role, you will design and scale reliable batch/streaming pipelines, institute data quality and governance, and enable analytics/ML with secure, cost-efficient systems. You will partner with engineering, product, analytics, and security to turn business needs into durable data products.

Responsibilities

What you will do:

Architecture & Strategy
Define end-to-end data architecture (lake/lakehouse/warehouse, batch/streaming, CDC, metadata).
Set standards for schemas, contracts, orchestration, storage layers, and semantic/metrics models.
Publish roadmaps, ADRs/RFCs, and “north star” target states; guide build vs. buy decisions.

Platform & Pipelines
Design and build scalable, observable ELT/ETL and event pipelines.
Establish ingestion patterns (CDC, file, API, message bus) and schema-evolution policies.
Provide self-service tooling for analysts/scientists (dbt, notebooks, catalogs, feature stores).
Ensure workflow reliability (idempotency, retries, backfills, SLAs).

Data Quality & Governance
Define dataset SLAs/SLOs, freshness, lineage, and data certification tiers.
Enforce contracts and validation tests; deploy anomaly detection and incident runbooks.
Partner with governance on cataloging, PII handling, retention, and access policies.

Reliability, Performance & Cost
Lead capacity planning, partitioning/clustering, and query optimization.
Introduce SRE-style practices for data (error budgets, postmortems).
Drive FinOps for storage/compute; monitor and reduce cost per TB/query/job.

Security & Compliance
Implement encryption, tokenization, and row/column-level security; manage secrets and audits.
Align with SOC 2 and privacy regulations (e.g., GDPR/CCPA; HIPAA if applicable).

ML & Analytics Enablement
Deliver versioned, documented datasets/features for BI and ML.
Operationalize training/serving data flows, drift signals, and feature-store governance.
Build and maintain the semantic layer and metrics consistency for experimentation/BI.

Leadership & Collaboration
Provide technical leadership across squads; mentor senior/staff engineers.
Run design reviews and drive consensus on complex trade-offs.
Translate business goals into data products with product/analytics leaders.

Requirements

10+ years in data engineering (including 3+ years as staff/principal or equivalent scope).
Proven leadership of company-wide data architecture and platform initiatives.
Deep experience with at least one cloud (AWS) and a modern warehouse or lakehouse (e.g., Snowflake, Redshift, Databricks).
Strong SQL and one programming language (Python or Scala/Java).
Orchestration (Airflow/Dagster/Prefect), transformations (dbt or equivalent), and streaming (Kafka/Kinesis/PubSub).
Data modeling (3NF, star, data vault) and semantic/metrics layers.
Data quality testing, lineage, and observability in production environments.
Security best practices: RBAC/ABAC, encryption, key management, auditability.

Nice to Have

Feature stores and ML data ops; experimentation frameworks.
Cost optimization at scale; multi-tenant architectures.
Governance tools (DataHub/Collibra/Alation), OpenLineage, and testing frameworks (Great Expectations/Deequ).
Compliance exposure (SOC 2, GDPR/CCPA; HIPAA/PCI where relevant).

Benefits

Health Care Plan (Medical, Dental & Vision)
Retirement Plan (401k, IRA)
Life Insurance
Unlimited Paid Time Off
9 Paid Holidays
Family Leave
Work From Home
Free Food & Snacks (Access to Industrious Co-working Membership!)
Wellness Resources

This job posting was last updated on 8/18/2025
