via Workable
Design and develop cloud-native data platforms and pipelines, ensuring data quality, governance, and performance optimization.
Extensive experience with Spark, Delta Lake, Databricks, and data architecture, along with knowledge of data governance and orchestration tools.
Role Overview

We are seeking a Databricks Data Architect to support the design, implementation, and optimization of cloud-native data platforms built on the Databricks Lakehouse Architecture. This is a hands-on, engineering-driven role requiring deep experience with Apache Spark, Delta Lake, and scalable data pipeline development, combined with early-stage architectural responsibilities.

The role involves close onsite collaboration with client stakeholders, translating analytical and operational requirements into robust, high-performance data architectures while adhering to best practices for data modeling, governance, reliability, and cost efficiency.

Key Responsibilities
· Design, develop, and maintain batch and near-real-time data pipelines using Databricks, PySpark, and Spark SQL
· Implement Medallion (Bronze/Silver/Gold) Lakehouse architectures, ensuring proper data quality, lineage, and transformation logic across layers
· Build and manage Delta Lake tables, including schema evolution, ACID transactions, time travel, and optimized data layouts
· Apply performance optimization techniques such as partitioning strategies, Z-Ordering, caching, broadcast joins, and Spark execution tuning
· Support dimensional and analytical data modeling for downstream consumption by BI tools and analytics applications
· Assist in defining data ingestion patterns (batch, incremental loads, CDC, and streaming where applicable)
· Troubleshoot and resolve pipeline failures, data quality issues, and Spark job performance bottlenecks
Nice-to-Have Skills
· Exposure to Databricks Unity Catalog, data governance, and access control models
· Experience with Databricks Workflows, Apache Airflow, or Azure Data Factory for orchestration
· Familiarity with streaming frameworks (Spark Structured Streaming, Kafka) and/or CDC patterns
· Understanding of data quality frameworks, validation checks, and observability concepts
· Experience integrating Databricks with BI tools such as Power BI, Tableau, or Looker
· Awareness of cost optimization strategies in cloud-based data platforms
· Prior Life Sciences domain experience
This job posting was last updated on 1/8/2026