
Cystems Logic Inc

via SmartRecruiters


Infrastructure Engineer - Software Engineer – Infrastructure & Hardware Optimization - Remote

Anywhere
contractor
Posted 7/29/2025
Direct Apply
Key Skills:
Infrastructure Engineering
AI Inference Pipelines
CUDA
NCCL
ROCm
Kubernetes
Python
C++
Linux Systems Programming
MLOps
Telemetry Systems
Driver Orchestration
Container Security
Benchmarking
Hardware Abstraction
Cloud-native Inference

Compensation

Salary Range

Not specified

Responsibilities

Design and implement cross-platform hardware detection systems for various accelerators. Collaborate with teams to ensure hardware-aware agent deployment across cloud providers.

Requirements

Candidates should have 4-7 years of experience in systems software or infrastructure engineering, with a focus on AI/ML workloads. Deep expertise in accelerator programming frameworks and strong programming skills in Python and C++ are essential.

Full Description

Job Description

Hello,

We have the job opening below. If you are interested and your experience matches the job description, please send your updated resume as soon as possible.

Software Engineer – Infrastructure & Hardware Optimization
Location: SF, CA; Portland, OR; or Dallas, TX - remote, but candidates must be local to one of these locations
Duration: 6+ month contract

We are seeking a skilled low-level systems engineer to join the team. This individual will focus on infrastructure software that detects, configures, and optimizes AI inference pipelines across heterogeneous hardware accelerators (e.g., NVIDIA/AMD GPUs, TPUs, AWS Inferentia, FPGAs). You will work on hardware abstraction layers, containerized runtime environments, benchmarking, telemetry, and driver orchestration logic for multi-cloud agentic inference deployments.

Ideal Experience:
- 4–7 years of experience in systems software or infrastructure engineering, preferably with exposure to AI/ML workloads.
- Deep expertise in CUDA, NCCL, ROCm, or other accelerator programming frameworks.
- Familiarity with LLM inference runtimes (TensorRT-LLM, vLLM, ONNX Runtime).
- Experience with Kubernetes scheduling, device plugin development, and runtime patching for heterogeneous compute.
- Strong Python/C++ and Linux systems programming skills.
- Passion for building scalable, portable, and secure AI infrastructure.

Responsibilities:
- Design and implement cross-platform hardware detection systems for GPUs/TPUs/NPUs using CUDA, ROCm, and low-level runtime interfaces.
- Build and maintain plugin-based infrastructure for capability scoring, power efficiency tuning, and memory optimization.
- Develop hardware abstraction layers (HAL) and performance benchmarking tools to optimize AI agents for cloud-native inference.
- Extend container-based MLOps systems (Docker/Kubernetes) with support for hardware-specific runtime containers (e.g., TensorRT, vLLM, ROCm).
- Automate driver validation, container security hardening, and runtime health monitoring across deployments.
- Integrate telemetry systems (Prometheus, Grafana) to surface per-device inference performance metrics and health status.
- Collaborate with solutions and DevOps teams to ensure hardware-aware agent deployment across cloud providers.

Additional Information
All your information will be kept confidential according to EEO guidelines.
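As an illustration only (not part of the posting), cross-platform hardware detection of the kind described often begins by probing the host for vendor tooling before loading any vendor-specific runtime. A minimal sketch in Python, assuming only the standard library and the well-known `nvidia-smi` and `rocm-smi` CLIs; a production system would instead query NVML, ROCm SMI libraries, or sysfs directly:

```python
import shutil
import subprocess


def detect_accelerators():
    """Return a list of accelerator vendors whose driver stack responds.

    Probes for well-known CLI tools (nvidia-smi for NVIDIA GPUs,
    rocm-smi for AMD GPUs). Absence of a tool means the vendor's
    stack is not installed, not necessarily that no device exists.
    """
    probes = {
        "nvidia": "nvidia-smi",
        "amd": "rocm-smi",
    }
    found = []
    for vendor, tool in probes.items():
        path = shutil.which(tool)
        if path is None:
            continue  # tooling not installed; skip this vendor
        try:
            # A zero exit status indicates the driver stack responded.
            result = subprocess.run([path], capture_output=True, timeout=10)
            if result.returncode == 0:
                found.append(vendor)
        except (OSError, subprocess.TimeoutExpired):
            pass  # tool present but unresponsive; treat as not available
    return found


if __name__ == "__main__":
    print(detect_accelerators())
```

On a host with no accelerator tooling installed, the function returns an empty list, which a scheduler or capability-scoring layer could then treat as "CPU only".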

This job posting was last updated on 7/30/2025
