Senior Platform/MLOps Engineer

Jobgether · US

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Senior Platform/MLOps Engineer based in the United States.

This role sits at the core of building the infrastructure that powers next-generation AI-driven manufacturing systems. You will design and scale MLOps and platform capabilities that support computer vision, deep learning, and robotics workloads deployed in real-world factory environments. The work directly enables high-precision automation for defect detection, classification, and visual inspection at industrial scale. You will operate across platform engineering, machine learning infrastructure, and distributed systems, ensuring models move reliably from training to production. This is a hands-on, high-impact engineering role where you will collaborate closely with robotics, AI, and platform teams. The environment is fast-moving, deeply technical, and focused on building resilient systems that perform under demanding production constraints. You will help shape the foundation of a software-defined manufacturing platform.

Accountabilities

You will be responsible for building and evolving the infrastructure that powers scalable ML and platform systems in production environments, with a strong focus on reliability, performance, and developer productivity. You will design and maintain end-to-end MLOps pipelines that support training, deployment, and monitoring of AI models used in computer vision and robotics applications.

Design, implement, and maintain scalable ML/AI infrastructure, including training pipelines, model deployment systems, and inference services
Build and optimize GPU-enabled workloads running in Kubernetes environments for high-performance AI applications
Develop robust CI/CD and GitOps workflows to support continuous delivery of machine learning and platform services
Collaborate with cross-functional teams to define architecture, evaluate technical tradeoffs, and prototype new platform capabilities
Improve system reliability through observability tooling, incident response practices, and performance optimization
Work closely with applied AI and robotics teams to ensure infrastructure meets real-world production needs
Produce high-quality documentation and contribute to engineering best practices across platform teams

Requirements

This role requires strong experience in platform engineering, DevOps, or SRE environments, combined with hands-on exposure to modern MLOps practices and production-grade ML systems. You should be comfortable working across infrastructure, application code, and distributed systems, with a strong focus on scalability and reliability.

5+ years of experience in Platform Engineering, DevOps, or Site Reliability Engineering
Strong programming skills in Python, Go, JavaScript, C#, or similar languages
Proven experience designing and operating MLOps pipelines in production environments
Deep knowledge of Kubernetes, including the CNCF ecosystem and both managed and self-hosted environments
Experience running and optimizing GPU workloads in Kubernetes clusters
Hands-on expertise with Infrastructure as Code tools such as Terraform and configuration management tools like Ansible
Experience with CI/CD pipelines and GitOps-based delivery workflows
Familiarity with observability tools such as Prometheus, Grafana, and OpenTelemetry
Strong understanding of software engineering best practices across the SDLC
Ability to collaborate across teams, translate requirements into system design, and communicate technical concepts clearly
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field

Preferred experience includes working in highly secure or air-gapped environments, mentoring engineers, and contributing to architectural decisions in complex distributed systems.

Benefits

Competitive base salary and performance-based compensation
Comprehensive medical, dental, and vision insurance coverage
Flexible work arrangements depending on role requirements
Opportunity to work on cutting-edge AI, robotics, and industrial automation systems
Career growth in a high-impact, innovation-driven engineering environment
Exposure to large-scale GPU infrastructure and advanced ML systems
Collaborative, cross-functional engineering culture focused on learning and ownership

AI Engineering pay context

Based on 643 disclosed AI Engineering salaries on RoleSuite, the role pays a median of $201K/year, with most offers between $162K and $246K (10th–90th percentile: $130K–$285K).
