Updated 2026-06-23 08:00 UTC · © 2025–2026 RoleSuite

Site Reliability Consultant

Jobgether · Canada

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Site Reliability Consultant based in Canada.

This role sits at the intersection of cloud infrastructure, software reliability, and large-scale distributed systems engineering. You will be responsible for designing, operating, and continuously improving highly available platforms that support critical workloads across cloud-native environments. The position involves deep hands-on work with Kubernetes, observability tooling, and automation frameworks to ensure systems remain resilient, scalable, and performant. You will collaborate closely with engineering, data, and AI/ML teams to enable reliable infrastructure for complex workloads. This is a highly technical and impact-driven role where your work directly influences system uptime, performance, and engineering efficiency. You will also contribute to incident response, root cause analysis, and long-term reliability improvements across global systems.

Accountabilities:

This role is responsible for ensuring the reliability, scalability, and performance of distributed systems and cloud infrastructure across production environments.

  • Operate, optimize, and troubleshoot Kubernetes clusters, service mesh environments (Istio), and Linux-based systems.
  • Design and implement automation using Go, Python, and Shell scripting to reduce manual operational workload.
  • Build and maintain observability stacks using tools such as Prometheus, Grafana, and Loki for monitoring and alerting.
  • Diagnose and resolve complex issues across networking, storage, compute, and application performance layers.
  • Support AI/ML workloads by ensuring infrastructure readiness for training pipelines and data-intensive processing.
  • Participate in on-call rotations, incident response, and postmortem analysis to improve system reliability.
  • Collaborate with engineering teams to implement infrastructure-as-code practices using Terraform and cloud-native tools.

Requirements:

The ideal candidate has strong Site Reliability Engineering experience with deep expertise in cloud-native infrastructure, automation, and distributed systems.

  • 5+ years of experience in Site Reliability Engineering, DevOps, or infrastructure engineering roles.
  • Strong hands-on experience with Google Cloud Platform and Infrastructure-as-Code tools such as Terraform.
  • Deep understanding of Kubernetes, Docker, microservices architectures, and service mesh concepts.
  • Strong Linux systems administration skills with experience in networking, PKI, and distributed system troubleshooting.
  • Proficiency in scripting and automation using Python, Shell, and ideally Go.
  • Experience building and maintaining observability and monitoring systems in production environments.
  • Strong incident management experience, including root cause analysis and postmortem practices.
  • Solid understanding of scalability, reliability engineering principles, and automation-first thinking.
  • Strong communication and collaboration skills in cross-functional engineering environments.

Benefits:

  • Competitive compensation package aligned with market standards (CAD 90,000 – 100,000 per year)
  • Fully remote-friendly work environment with flexibility and autonomy
  • Generous paid time off, including vacation days, sick leave, and volunteer days
  • Annual wellness budget supporting health, fitness, and personal well-being
  • Home office support with equipment and workspace personalization allowance
  • Strong learning and development support, including training, certifications, and professional growth opportunities
  • Collaborative engineering culture working alongside highly skilled global teams
  • Opportunities to work on cutting-edge cloud, AI, and distributed systems infrastructure

DevOps pay context

Based on 1,180 disclosed DevOps salaries on RoleSuite, the role pays a median of $142K/year, with most offers between $115K and $173K (10th–90th percentile: $101K–$210K).

