$120K - 160K a year
Design, implement, and maintain scalable AI/ML cloud infrastructure and CI/CD pipelines for generative AI initiatives, ensuring high availability, security, and performance.
4+ years in MLOps or infrastructure engineering, advanced AWS experience, Linux proficiency, container orchestration expertise, CI/CD pipeline development, Terraform skills, and deep knowledge of ML frameworks.
MissionStaff specializes in delivering Creative, Marketing, and Technology talent for companies ranging from mid-sized enterprises to the Fortune 500. We build lasting relationships with talent and clients to power career opportunities and business. We are currently filling the following contract position with our client.

AI/ML Operations Engineer

The Role:
As an AI/MLOps Engineer, you will lead the design, implementation, and evolution of cloud infrastructure for generative AI initiatives. This role requires expertise in DevOps, cloud architecture, large language models (LLMs), automation, and monitoring. You'll be part of a growing technical team responsible for ensuring the high availability, security, and performance of AI applications and infrastructure.

Key Responsibilities:
• Define governance, standards, and deployment strategies for LLM infrastructure on managed AWS platforms (batch and real-time inference).
• Design and maintain CI/CD pipelines using tools like GitLab to support model and infrastructure deployments.
• Build and manage scalable, resilient infrastructure using container orchestration platforms such as Kubernetes, EKS, or ECS.
• Manage source code using Git-based platforms, implementing structured version control.
• Automate model retraining, versioning, and deployment workflows.
• Troubleshoot and resolve issues across testing, security, and deployment pipelines.
• Develop robust observability solutions, including logging, monitoring, metrics, and alerts.
• Implement cost-effective scaling strategies for LLM inference, including model sharing and efficient compute utilization.
• Create automated test frameworks to ensure consistent model performance through updates.
• Collaborate with cross-functional teams and mentor junior engineering and operations staff.

Qualifications:
• 4+ years of hands-on experience in MLOps or infrastructure engineering roles.
• Advanced experience implementing AWS services across multi-account and multi-region environments; AWS certifications preferred.
• Strong background in Linux (Ubuntu, Amazon Linux, etc.).
• Proven experience with containerized applications and orchestration frameworks in production settings.
• Expertise in building and maintaining CI/CD pipelines using tools such as GitLab or Jenkins.
• Proficiency with Infrastructure as Code tools, especially Terraform.
• Deep knowledge of machine learning frameworks such as PyTorch and TensorFlow.
• Experience building monitoring and automation systems using CloudWatch, Prometheus, Grafana, etc.
• Familiarity with LLM-related tools like TensorBoard and MLFlow, and emerging frameworks such as Model Context Protocol (MCP) or Agent-to-Agent Protocol, is a strong plus.
• Understanding of performance optimization techniques for inference speed, throughput, and cost efficiency.

MissionStaff is an equal opportunity employer.
This job posting was last updated on 6/26/2025