via LinkedIn
$120K - $200K a year
Design and optimize ML data infrastructure, modernize pipelines, and collaborate with ML teams.
Proficiency in Python, expertise with relational databases, ETL/ELT pipelines, cloud data warehousing, and big data deployments.
Who We Are

PathAI's mission is to improve patient outcomes with AI-powered pathology. Our platform promises substantial improvements to the accuracy of diagnosis and the efficacy of treatment of diseases like cancer, leveraging modern approaches in machine learning. Our team, comprising diverse employees with a wide range of backgrounds and experiences, is passionate about solving challenging problems and making a huge impact.

We are seeking an experienced contract Back-End Developer with strong database / data warehouse skills to enhance the scalability, performance, and maintainability of our ML data infrastructure. The ideal candidate will bring strong expertise in server-side Python development, relational databases, ETL/ELT, and modern big data deployments. You will work closely with our MLOps and ML engineering teams to optimize storage usage, modernize pipelines, deploy new technology, and/or build / enhance tools that support analytics and machine learning workflows.

Contract Duration: Minimum 6 months
Location: Remote (U.S.)

What You’ll Do

• Analyze and optimize storage strategies for ML experiment data and metadata.
• Design and implement intelligent retention and expiration for large-scale datasets.
• Modernize and refactor ETL/ELT pipelines to improve scalability and ease of maintenance.
• Create and populate additional schemas for validated and curated datasets.
• Build or enhance database-backed applications supporting ML R&D and production analytics.
• Collaborate with ML engineers, SREs, and platform teams.
• Provide knowledge transfer for long-term maintainers.

What You’ll Need

• Proficiency in Python for application development, data processing, and automation.
• Expertise with relational databases (e.g., Postgres, Amazon RDS, Aurora), including schema design, query optimization, and performance tuning.
• Expertise with ELT pipelines (dbt preferred) and cloud data warehousing (Snowflake preferred).
• Familiarity with big data deployments such as Spark and Hive.
• Experience with Apache Airflow for systems automation.
• Understanding of S3-based storage and large-scale data management strategies.
• Ability to write clear technical documentation and collaborate effectively across teams.
• Experience with query optimization, data partitioning strategies, and cost optimization in cloud environments.

Nice to Have

• Background in machine learning data pipelines or analytics-heavy environments.
• Knowledge of data governance, retention policies, or cost-optimization strategies in cloud environments.

We Want to Hear From You

At PathAI, we are looking for individuals who are team players, are willing to do the work no matter how big or small it may be, and who are passionate about everything they do. If this sounds like you, even if you may not match the job description to a tee, we encourage you to apply. You could be exactly what we're looking for.

PathAI is an equal opportunity employer, dedicated to creating a workplace that is free of harassment and discrimination. We base our employment decisions on business needs, job requirements, and qualifications; that's all. We do not discriminate based on race, gender, religion, health, personal beliefs, age, family or parental status, or any other status. We don't tolerate any kind of discrimination or bias, and we are looking for teammates who feel the same way.
This job posting was last updated on 12/15/2025