Lead Platform Reliability Engineer
This position is listed on behalf of a partner company, who manages all applications and next steps. Our partner is looking for a Lead Platform Reliability Engineer based in Canada.
In this role, you will act as a senior technical authority within a global platform services organization, shaping the reliability, scalability, and operational excellence of enterprise-grade cloud infrastructure. You will lead the evolution of Azure-based platforms, including Kubernetes (AKS), PaaS services, and CI/CD ecosystems, ensuring high availability and performance across critical business systems. This position combines deep hands-on engineering expertise with strategic platform leadership, influencing architectural decisions, governance models, and engineering standards. You will collaborate with cross-functional teams, vendors, and stakeholders to deliver resilient, secure, and cost-efficient solutions. A key part of your mission will be driving continuous improvement through observability, automation, and DevOps best practices. You will also play a critical role in mentoring engineers and elevating technical maturity across the organization. This is a high-impact role where platform reliability directly supports business value at scale.
Accountabilities:
You will be responsible for ensuring the stability, scalability, and continuous evolution of enterprise cloud platforms, with a strong focus on Azure ecosystem reliability and engineering excellence. This includes ownership of platform architecture, operational performance, and lifecycle management of critical cloud services.
- Lead the design, operation, and optimization of Azure-based platforms, including AKS, PaaS services, and CI/CD pipelines.
- Define and enforce platform standards, governance models, security practices, and reliability patterns.
- Manage platform roadmaps, upgrades, and modernization initiatives in collaboration with stakeholders and vendors.
- Oversee incident management and drive end-to-end resolution of complex platform issues in a multi-vendor environment.
- Ensure 24/7 service reliability and compliance with SLA commitments across production systems.
- Implement and improve observability practices, including monitoring, logging, and performance analytics.
- Drive cost optimization initiatives across Azure subscriptions and cloud resource usage.
- Mentor engineers and contribute to building technical capabilities within the platform organization.
- Support delivery of large-scale infrastructure and transformation projects.
- Collaborate with product, engineering, and operations teams to align platform capabilities with business needs.
- 10+ years of experience in software engineering, platform engineering, or infrastructure roles.
- 5+ years of hands-on experience with Microsoft Azure, including AKS and core PaaS services.
- Strong expertise in Kubernetes, containerization (Docker), and cloud-native architectures.
- Solid experience with CI/CD pipelines and DevOps tooling (Terraform, GitHub Actions, Jenkins, Helm, Flux).
- Deep understanding of Azure services such as Key Vault, Functions, Databricks, Synapse, Redis, and Azure Monitor.
- Experience with microservices, distributed systems, and performance/capacity management.
- Familiarity with messaging, streaming, and event-driven architectures.
- Strong knowledge of observability, incident response, and SRE principles.
- Ability to manage technical stakeholders and influence architectural decisions.
- Experience with financial or capacity planning for cloud environments is highly desirable.
- Strong communication skills and proven leadership in cross-functional environments.
- Degree in Computer Science or related field (or equivalent experience).
- Competitive salary aligned with North American market standards
- Annual performance-based bonus and incentive programs
- Comprehensive health, dental, and vision insurance coverage
- Mental health and wellness support programs
- Retirement savings plans with employer contributions
- Paid vacation, personal days, and statutory leave entitlements
- Hybrid work model (3 days in-office, 2 days remote)
- Learning and development programs, including certifications and training support
- Inclusive and flexible work environment focused on well-being and growth
- Career progression opportunities in a global technology organization.
Requirements:
You bring extensive experience in cloud platform engineering, with deep expertise in Azure infrastructure, DevOps practices, and large-scale distributed systems. You are a strong technical leader capable of influencing architecture while remaining hands-on in complex environments.
Benefits:
Software pay context
Based on 7,510 disclosed Software salaries on RoleSuite, the role pays a median of $158K/year, with most offers between $123K and $200K (10th–90th percentile: $101K–$236K).
See the full Software salary breakdown →