Software Engineering Architect - Platform

Salesforce · San Francisco, California

We are seeking a seasoned Software Engineering Architect to lead the design and scaling of one of the world's largest Kubernetes deployments. This critical role involves architecting a robust, secure, and highly reliable container platform that powers thousands of microservices and other services across diverse environments.

The ideal candidate has a deep technical understanding of distributed systems, container orchestration, and infrastructure development, coupled with a passion for designing platforms on which other software engineers can easily build, test, and operate. You will work on real-world, massive-scale problems, collaborate with top-tier engineers, and directly influence the strategic direction of our core container platform across multiple substrates.

Key Responsibilities:

Platform Strategy & Design: Lead the architectural design and evolution of our large-scale, enterprise-grade Kubernetes platform to ensure it meets requirements for scalability, reliability, security, and performance.
Software Development Lifecycle (SDLC) Integration: Define and implement platform tooling and APIs to optimize the SDLC for thousands of microservices, with a focus on application development and deployment pipelines.
Scale and Performance: Architect solutions that handle massive, ever-increasing service and infrastructure scale while maintaining high availability and low latency across the deployment, with close attention to performance tuning.
Technical Leadership: Act as a subject matter expert and technical leader, guiding platform implementation teams and ensuring alignment with best practices in platform and software engineering.
Microservices Architecture: Define and evangelize resilient software design patterns and best practices for building, deploying, and managing thousands of microservices on the container platform.
Cross-Functional Partnership: Partner closely with infrastructure, security, and application development teams to integrate platform components seamlessly and define clear interfaces for engineering efficiency.
System Reliability: Design systems that are inherently resilient, self-healing, and easy to monitor and troubleshoot, driving down operational complexity for our application engineers.
Build and ship high-quality, production-grade software using modern engineering practices, with AI as a core part of your development workflow. Push the boundaries of AI development tools to deliver secure, optimized code.
Design and orchestrate complex systems where AI agents integrate seamlessly into human workflows, driving efficiency and innovation at scale.
Contribute to building and maintaining the shared system context, an explicit repository of system designs, constraints, and standards that enables AI to operate accurately and reliably.
Critically evaluate code (human or AI-generated) for correctness, quality, security, and performance.

Essential Qualifications

Experience: 15+ years of progressive experience in hands-on software engineering and/or platform engineering, with a significant focus on building and scaling complex, high-volume distributed systems.
Deep Kubernetes Expertise:
- Expert-level understanding of Kubernetes internals, architecture, networking, security, and operation at extreme scale.
- Proven experience in designing and scaling Kubernetes deployments supporting thousands of services.
Programming Skills: Deep proficiency in Go (Golang) for developing and extending infrastructure systems, APIs, and platform tooling (required).
Infrastructure Systems: Extensive background in infrastructure development, including cloud environments, networking, storage, and infrastructure-as-code principles.
Microservices: Expert knowledge of microservices architecture, service mesh technologies, API design principles, and inter-service communication patterns.
Security & Reliability: A strong track record of designing platforms that prioritize security, observability (logging, metrics, tracing), and operational reliability for both the platform and the applications it hosts.
A demonstrated, genuine AI-first approach to engineering: using AI to move faster, build fluency across the stack, and contribute well beyond your core specialty.
Experience using AI tools (e.g., Claude Code, GitHub Copilot, Codex, Cursor) in development workflows.
Advanced prompt engineering skills, including the ability to write precise, structured prompts and to cultivate the system context that makes AI outputs reliable, secure, and production-ready.

Why Join Us?

Real Scale: Work on platform challenges that few organizations ever encounter, powering mission-critical software and services globally.
Influence: Directly shape the future and direction of our core container platform strategy.
Talent: Collaborate daily with some of the industry's most talented and passionate software and platform engineers.
