Find your dream job faster with JobLogr
AI-powered job search, resume help, and more.

Pivots Global

via LinkedIn


ML Infrastructure Engineer / Founding ML Lead

Anywhere
Full-time
Posted 3/5/2026
Verified Source
Key Skills:
Python
PyTorch/TensorFlow
Cloud & MLOps (AWS/GCP/Azure, Kubernetes, MLflow)

Compensation

Salary Range

Not specified

Responsibilities

Architect and own end-to-end ML infrastructure, from data pipelines and training workflows to model serving and observability, while driving LLM/multimodal research and growing the engineering team.

Requirements

PhD or 5+ years building production ML systems, with deep PyTorch/TensorFlow expertise, strong Python engineering skills, and hands-on cloud and MLOps experience.

Full Description

**About the Role**

This is a once-in-a-career opportunity to architect an AI system from first principles and grow into the CTO role as we scale. You'll define our entire ML infrastructure—from model research to production deployment—with full technical ownership and minimal oversight. We're building at the intersection of cutting-edge AI and blockchain, solving ambiguous problems that don't have textbook answers. You'll work directly with the CEO to translate research breakthroughs into production systems, balancing intellectual rigor with startup pragmatism. This role demands someone who thrives in early-stage chaos, moves fluidly between PyTorch experiments and cloud infrastructure, and is energized by the prospect of building both the technology and the team from zero.

**Our Technical Stack**

- **ML Frameworks**: Python · PyTorch/TensorFlow · Hugging Face · LangChain · OpenAI API—cutting-edge AI frameworks with a focus on LLMs and multimodal models
- **Infrastructure**: AWS/GCP/Azure · Kubernetes · Docker · Terraform—cloud-native, containerized ML workflows with infrastructure-as-code
- **MLOps**: MLflow · Weights & Biases · SageMaker—experiment tracking, model versioning, and automated deployment pipelines
- **Early-stage reality**: You'll build the foundation—selecting tools, defining architecture, and setting technical standards from day zero with minimal legacy constraints

**What You'll Do**

- Architect and implement our core ML infrastructure end-to-end: data pipelines, training workflows, model serving, and observability systems across AWS/GCP/Azure
- Drive model research and development for LLMs and multimodal architectures, translating cutting-edge papers into working prototypes and evaluating tradeoffs between approaches
- Define the technical roadmap and engineering standards in partnership with the CEO, balancing innovation velocity with system reliability as we iterate toward product-market fit
- Own infrastructure decisions with full autonomy—design distributed training systems, select MLOps tooling, and establish deployment patterns that will scale from 10 to 10M users
- Mentor and grow the engineering team as we hire, establishing a culture of technical excellence, analytical rigor, and bias-to-action that reflects your values
- Propose novel solutions to ambiguous product challenges, challenging assumptions about what's possible and prototyping approaches that competitors haven't considered
- Transition into CTO as the company scales, evolving from hands-on builder to technical leader who sets org-wide direction and represents our technical vision externally

**What We're Looking For**

- PhD in Computer Science, Machine Learning, Statistics, or a related field—OR 5+ years building and deploying production ML systems with demonstrable equivalent depth
- Deep expertise in modern ML frameworks (PyTorch or TensorFlow) with experience training and optimizing large-scale models, including LLMs or multimodal architectures—you've debugged gradient explosions and OOM errors in production, not just run tutorials
- Strong Python engineering skills with a focus on production-grade ML systems—comfortable moving between research prototypes and scalable inference infrastructure
- Hands-on experience with cloud platforms (AWS, GCP, or Azure) including MLOps tooling for experiment tracking, model versioning, and deployment pipelines—you've owned the full lifecycle from training to serving
- Track record of independent problem-solving in ambiguous environments—ability to define technical direction when requirements are unclear and resources are constrained
- Proven ability to ship working AI systems end-to-end—not just research papers, but products that real users interact with at scale
- 3-7 years post-PhD (or 5-10 years for non-PhDs) building ML infrastructure or leading ML engineering initiatives in production environments

**Nice to Have**

- Experience architecting ML platforms from scratch—defining training pipelines, serving infrastructure, and monitoring systems for model observability
- Background in both research and engineering—published papers AND deployed systems, comfortable translating cutting-edge research into production-ready solutions
- Prior founding team or early-stage startup experience—thrived in high-autonomy, resource-constrained environments where you owned multiple domains from model development to infrastructure
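The MLOps tools named in the stack (MLflow, Weights & Biases) all center on experiment tracking: recording each training run's hyperparameters and metrics so results stay comparable and reproducible. A minimal, stdlib-only Python sketch of that idea—the `Run` class and its method names are illustrative, not the API of any real library:

```python
# Illustrative sketch of experiment tracking: one Run records its
# hyperparameters once and its metrics per training step, then
# serializes everything so runs can be compared later. Real tooling
# (MLflow, W&B) adds artifact storage, model versioning, and a UI.
import json
import uuid


class Run:
    """One training run within a named experiment."""

    def __init__(self, experiment: str):
        self.run_id = uuid.uuid4().hex[:8]   # unique id for this run
        self.experiment = experiment
        self.params: dict = {}
        self.metrics: list = []

    def log_params(self, **params) -> None:
        """Record hyperparameters (learning rate, batch size, ...)."""
        self.params.update(params)

    def log_metric(self, name: str, value: float, step: int) -> None:
        """Record one metric observation at a given training step."""
        self.metrics.append({"name": name, "value": value, "step": step})

    def to_json(self) -> str:
        """Serialize the run for storage or later comparison."""
        return json.dumps(
            {
                "run_id": self.run_id,
                "experiment": self.experiment,
                "params": self.params,
                "metrics": self.metrics,
            },
            indent=2,
        )


# Hypothetical usage: track a short fine-tuning run.
run = Run("llm-finetune")
run.log_params(lr=3e-4, batch_size=32)
for step, loss in enumerate([2.1, 1.4, 0.9]):
    run.log_metric("loss", loss, step)
print(run.to_json())
```

The design choice mirrored here—params logged once per run, metrics logged per step—is what lets tracking tools plot loss curves across runs that share an experiment name.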

This job posting was last updated on 3/6/2026
