$145K - 201K a year
Design and optimize scalable data pipelines for large-scale genomic and clinical data, support internal and external users, and manage data infrastructure on AWS.
Bachelor's/Master's degree with 5+ years experience in genomics data engineering, deep domain knowledge in molecular biology/genomics, expert Python skills, and experience with AWS distributed systems and bioinformatics tools.
Who We Are & What We Do

At Helix, our mission is simple: to help everyone improve their lives through their DNA. Ready to make a real-world impact with your skills? At Helix, we're transforming healthcare by making genomics a standard of care. We partner with health systems and life science companies to accelerate the integration of genomic data into clinical practice. Join us in building a future where healthcare is personalized, proactive, and powered by genomics.

What is special about this role?

The Helix Clinicogenomic Data Engineering team plays a pivotal role in Helix’s efforts to provide a best-in-class clinico-genomics research dataset that lets our partners conduct and publish cutting-edge genomic studies and return operational insights to health systems. Working closely with our Research, Bioinformatics, and Engineering teams, we are responsible for a RUO-genomics dataset as well as for developing programmatic interfaces that give our scientists and analysts easy access to it. The patient is top of mind in everything we do, and your contributions here have the opportunity to improve real-world outcomes for everyone.

As a Senior Genomics Data Engineer, you will:

- Leverage your specialized knowledge to develop innovative solutions that simplify complex genomic data, effectively lowering the barrier to entry for non-expert users.
- Design, build, and continuously optimize robust, scalable, and automated data pipelines for processing large-scale genomic and clinical data.
- Build and maintain critical pipelines to prepare, de-identify, and securely deliver massive-scale genetic datasets to both internal research teams and external partners.
- Work cross-functionally with world-class Engineering, Research, AI, Data Science, Bioinformatics, Product, and Commercial teams to tackle complex data challenges and drive scientific discovery.
- Provide expert-level support and create tooling to help internal and external data consumers effectively utilize our complex datasets and platforms.
- Implement and manage data infrastructure as code using tools like AWS CDK, ensuring our distributed compute environment is efficient and scalable.

About you:

- Bachelor's/Master's degree in Computer Science, Bioinformatics, Engineering, or a related field with 5+ years of experience
- Deep domain knowledge in molecular biology, next-generation sequencing, or genomics
- Demonstrated experience in processing a variety of large-scale genetic data formats (exome/whole genome), including but not limited to VCF, CRAM, BAM, and PLINK
- Strong experience using industry-standard bioinformatics tools such as bcftools, htslib, and samtools
- Experience with genomic data-reduction techniques, such as PCA
- Expert-level proficiency in Python
- Proven experience designing and building distributed systems on AWS, including expertise with services like Glue, EMR, S3, Lambda, and DynamoDB
- Proficiency with infrastructure-as-code frameworks (e.g., AWS CDK, Terraform)
- Expertise with ETL pipeline automation, workflow management tools such as Airflow, AWS Glue, and AWS Step Functions, and CI/CD
- Familiarity with database design, data manipulation, and data quality techniques
- Demonstrated ability to thrive in a fast-paced, adaptable environment

Pluses:

- Background in the bioinformatics or healthcare industries, and familiarity with clinical data
- Proficiency in Go, Java, C, C++, or Scala
- Hands-on skills with genomics-specific data tools such as Hail or TileDB
- Track record of working in a regulated data environment (e.g., HIPAA)
- Hands-on experience designing and building distributed systems on AWS with frameworks such as Spark, Dask, EMR, Databricks, or similar

Expected Interview Process:

1) Recruiter Screen
2) Manager Screen/Tech Screen
3) Onsite
4) Offer

Expected Pay For This Role:

There are 3 distinct parts to your Helix offer:

1) Base Salary
2) Annual Bonus
3) Equity

Expected Helix Base: $145,000 - $201,250
Expected Helix Discretionary Annual Bonus: 10% of your annual salary
Equity: We offer generous equity at Helix. If you receive a Helix offer, your recruiter will book dedicated time with you to educate you on our equity model.

Aside from working alongside brilliant, dedicated, passionate, down-to-earth, curious, warm, and thoughtful people, we also provide great benefits:

- Comprehensive Health Insurance with Date of Hire eligibility
- Above average employer paid premium coverage
- 12 weeks Helix Paid Parental Leave option
- 401(k) with employer matching of up to 3% and 100% Vesting on the Date of Hire
- Comprehensive Well-Being Benefits
- Flexible PTO
- Remote options for many roles and a home office stipend

What To Expect During Your First 90 days:

First 30 days: you’ll spend time learning the Helix way, completing training and onboarding for your role, and getting introduced to your team and relevant stakeholders. You’ll also gain a deeper understanding of our customers, our products, the impact we make in the lives of our communities, and how to thrive at Helix through participation in Helix U.

Day 30 - 60: you’ll spend time contributing to projects, deeply familiarizing yourself with team and company processes, and developing a deeper understanding of Helix’s products, services, and capabilities.
Day 60 - 90: you’ll build your OKRs with your manager, start to take ownership of projects and initiatives on your team, and begin to demonstrate your impact on the Helix mission.

Helix is proud to be an equal opportunity employer, and committed to providing employment opportunities regardless of race, religious creed, color, national origin, ancestry, physical disability, mental disability, medical condition, genetic information, marital status, sex, gender, gender identity, gender expression, pregnancy, childbirth and breastfeeding, age, sexual orientation, military or veteran status, or any other protected classification, in accordance with applicable federal, state, and local laws.
This job posting was last updated on 11/26/2025