Senior Engineering Manager, Compute

Temporal · United States

Companies at the frontier of the AI revolution run on Temporal. OpenAI runs on Temporal, handling millions of requests. Cursor runs its cloud coding agents on Temporal at over 50 million actions a day across 7M+ workflows, and more than a third of the pull requests its users merge now come from those agents. Replit, Lovable, Abridge, and Hebbia build their agents on it too. In the last year alone, AI-native companies executed 1.86 trillion actions on Temporal Cloud, and the curve is still bending upwards. Backed by a recent $300M Series D at a $5B valuation, we are building the durable execution layer the agentic era depends on.

The Compute team owns the layer all of that runs on. We are looking for a Senior Engineering Manager to lead the effort to make every aspect of Temporal's compute invisible to our customers, so they can focus on application-layer innovation while we handle the compute muck. This is a rare, build-the-foundation mandate: the compute substrate that the world's most demanding AI workloads will run on. We want a leader who has operated compute at planet scale, thinks in fleets, goodput, and cost-per-unit-of-compute, and pairs that with the operational rigor to run a service that frontier-AI companies bet production on.

Responsibilities

Strategic direction for Compute: Own the strategy and standards of excellence for the compute layer that the world's agents run on, across design, delivery, and operations. Build a culture of ownership, quality, and customer-first decision-making.
Technical leadership: Lead, hire, and grow a high-ownership team. Roll up your sleeves and go deep into the trenches, staying close to design docs and code rather than managing from a distance. Coach engineers, level them up, and clear the friction that slows them down.
Roadmap & trajectory: Drive the arc from today's compute toward the next generation of compute platforms. Ground prioritization in customer and design-partner feedback, and turn ambiguous, fast-moving requirements into predictable, iterative delivery.
Operational excellence: When you run frontier AI in production, reliability is the product. Own operations, run on-call and incident response, and drive blameless postmortems and the systemic fixes that prevent recurrence.
Technical depth: Guide the hard architectural decisions for large-scale, multi-tenant compute, where technical concerns cut across workload isolation and security, scheduling, fleet efficiency / utilization / goodput, and performance, while ensuring the platform is reliable and efficient for the workloads that depend on it.
Capacity, supply & economics: Own utilization, capacity and supply planning, and the cost-per-unit-of-compute and margin profile of the fleet, across CPU compute today and accelerated compute ahead.
Cross-team & customer execution: Partner with leadership, Product, SDK, UX/DX, Security, and design-partner customers to align priorities and unblock delivery. Communicate progress, tradeoffs, and risk clearly to technical and non-technical audiences alike.

Qualifications

Proven experience leading software engineering teams that build and operate large-scale compute platforms or fleets, with strong operational practices.
12+ years in software and/or infrastructure engineering, including 7+ years of people management and demonstrated ownership of delivery and live-site outcomes.
Deep distributed-systems and compute-infrastructure expertise, with the hands-on judgment to guide architecture and execution up close rather than from a distance.
Experience operating multi-tenant compute that other people's production workloads depend on.
Bachelor's degree in Computer Science or related field, or equivalent practical experience; advanced degree a plus.
Excellent communication skills, with the ability to partner across engineering, product, and leadership and fold customer feedback into the roadmap.

Required Skills

Strong leadership, coaching, and performance management; ability to grow engineers and build a healthy, accountable, high-ownership team.
Excellence in execution: planning, prioritization, and delivering iterative milestones in an ambiguous, fast-moving environment while managing unplanned work.
Fleet thinking: utilization, goodput, capacity and supply planning, and cost discipline as first-class engineering concerns.
Live-site reliability craft: on-call, incident management & response, and postmortem-driven continuous improvement.
Strong command of the building blocks of a compute platform: multi-tenant isolation and security, scheduling, and resource management.
Ability to review and raise the bar on technical artifacts (design docs, code reviews) across a distributed-systems codebase.

Preferred Experience

MicroVMs and virtualization (Firecracker, gVisor, Edera) or managed-compute primitives (AWS Fargate, GCP Cloud Run, AWS Lambda), and/or Kubernetes internals.
Building serverless or hosted-compute products from 0 to 1, including the rapid-delivery-vs-durable-platform tradeoffs that come with it.
Multi-cloud delivery across AWS and GCP.
Cold-start, warm-pool, and scheduling/latency optimization for on-demand compute.
Agent sandboxes, secure execution of untrusted code, or other AI-agent infrastructure.
GPU / accelerated compute: fractional GPUs (MIG, MPS, time-slicing), GPU scheduling, training vs. inference fleets, and multi-tenant GPU isolation.

---

The ideal candidate is a strategic thinker with a hands-on approach, energized by building the compute foundation the AI era runs on. They are comfortable shaping a space that doesn't fully exist yet and obsessed with reliability when customers bet production on it. They default to working backwards from customers, and they balance the speed to ship 0-to-1 against the durable design a planet-scale platform demands.

Compensation

The estimated pay range for this role is $320,000 - $335,000.

This role is eligible to participate in Temporal's equity plan.
