Updated 2026-06-15 12:00 UTC·© 2025–2026 RoleSuite

Staff Machine Learning Systems Engineer (MLOps)

Jobgether · US

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Staff Machine Learning Systems Engineer (MLOps) based in the United States.

This is a high-impact infrastructure role focused on building and operating the production systems that power large-scale AI and ML services. You will define how machine learning workloads are deployed, observed, secured, and scaled across cloud-native environments. The role sits at the intersection of platform engineering, DevOps, and applied AI, ensuring that every AI product can be shipped safely and reliably. You will design the underlying Kubernetes-based infrastructure, CI/CD pipelines, and model-serving systems that support mission-critical workloads. Working closely with ML engineers, product teams, and security stakeholders, you will help translate experimental AI capabilities into production-grade systems. This is a hands-on senior technical role for someone who thrives in complex, high-scale, and fast-evolving environments.

Accountabilities:

Lead the design, evolution, and operation of the core ML infrastructure platform supporting AI workloads across production systems, ensuring scalability, reliability, and security across environments.

  • Own and optimize Kubernetes-based infrastructure (e.g., EKS), including autoscaling, workload orchestration, and cluster lifecycle management for ML and AI systems
  • Build and maintain GitOps-based CI/CD pipelines enabling safe, repeatable, and efficient deployment of AI services across environments
  • Design and implement model serving and inference infrastructure, including LLM routing, API gateways, and multi-provider integrations
  • Develop observability, tracing, and monitoring systems for AI workloads using tools such as OpenTelemetry, Datadog, and LLM tracing platforms
  • Define and enforce SLOs, incident response processes, and reliability standards for ML systems in production
  • Own infrastructure-as-code and platform tooling (Terraform, CLIs, internal frameworks) to improve developer velocity and consistency
  • Drive security, IAM, and secrets management architecture ensuring compliance, least-privilege access, and data protection standards
  • Collaborate with ML, product, and data teams to translate research and prototypes into production-ready systems
  • Identify platform bottlenecks and lead initiatives to improve performance, cost efficiency, and deployment speed
  • Provide technical leadership, mentorship, and architectural guidance across ML systems engineering initiatives

Requirements:

This role requires deep expertise in cloud infrastructure, ML systems, and production-grade platform engineering, with a strong focus on reliability, scalability, and security.

  • 8+ years of experience in platform engineering, DevOps, SRE, or infrastructure roles, including hands-on ML/AI systems experience
  • Strong expertise with Kubernetes (preferably EKS), including cluster operations, autoscaling, and workload orchestration
  • Proficiency in infrastructure-as-code tools such as Terraform and experience designing secure cloud architectures
  • Solid programming skills in Python with experience building infrastructure tooling and automation systems
  • Experience operating LLM or ML inference systems in production, including routing, serving, and observability
  • Hands-on experience with observability stacks (Datadog, OpenTelemetry, logging/tracing systems, or equivalents)
  • Strong understanding of CI/CD systems, GitOps workflows, and developer platform engineering
  • Experience designing IAM, OIDC, and secrets management systems in cloud environments
  • Systems-thinking mindset with strong attention to failure modes, reliability, and long-term maintainability
  • Ability to collaborate across engineering, ML, security, and product teams in fast-paced environments
  • Experience in regulated or high-compliance environments (healthcare, fintech, or similar) is a plus

Benefits:

  • Competitive salary with equity opportunities
  • Comprehensive health coverage including medical, dental, and vision
  • Unlimited PTO, company holidays, and mental health days
  • Parental leave and family support benefits
  • 401(k) with employer matching
  • Employee stock purchase program (ESPP)
  • Remote-first flexibility and offsite team gatherings
  • Strong emphasis on wellness, learning, and professional development

Data & ML pay context

Based on 1,298 disclosed Data & ML salaries on RoleSuite, roles in this category pay a median of $165K/year, with most offers between $128K and $209K (10th–90th percentile: $107K–$246K).

