Senior Site Reliability Engineer

Jobgether · Brazil

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Senior Site Reliability Engineer based in Brazil.

This role sits at the core of large-scale, internet-facing systems where performance, reliability, and security are mission-critical. You will help design and operate infrastructure that serves millions of users and handles massive request volumes across distributed environments. The position combines deep systems engineering with hands-on software development, focusing on building resilient, observable, and cost-efficient platforms. You will work across cloud-native stacks, collaborating with engineering teams to improve uptime, deployment velocity, and system robustness. The environment is fast-paced, highly technical, and driven by continuous iteration and data-based decision-making. This is a role for engineers who enjoy solving complex distributed systems challenges at scale while directly impacting end-user experience.

Accountabilities

Design, build, and operate large-scale distributed systems supporting high-throughput, low-latency services across multi-cloud environments.
Improve system reliability, scalability, performance, and cost efficiency across infrastructure, applications, and networking layers.
Develop and maintain Kubernetes-based infrastructure and automation to support engineering teams and production workloads.
Implement and enhance observability solutions, including monitoring, logging, tracing, and alerting across systems and services.
Work closely with CI/CD pipelines and deployment systems to ensure safe, fast, and reliable software delivery.
Diagnose and resolve complex production issues spanning infrastructure, networking, and application layers.
Contribute to architectural decisions involving distributed systems, security, and high-availability design.

Requirements

6+ years of experience in Site Reliability Engineering, DevOps, or backend/infrastructure engineering roles.
Strong expertise in Kubernetes and cloud-native ecosystems.
Solid software engineering background with proficiency in at least one language such as Go, Python, JavaScript, C++, or Rust.
Deep experience with observability tools and practices (metrics, logging, tracing, alerting).
Strong understanding of networking concepts, including proxies, CDNs (e.g., Cloudflare), load balancing, and WAFs.
Hands-on experience with multi-cloud environments and virtual networking architectures.
Strong CI/CD experience and familiarity with modern deployment pipelines.
Experience working with distributed systems, queue-based architectures, and sharding patterns.
Strong problem-solving skills with a systems-thinking mindset and attention to reliability and security.
Exposure to security concepts such as attack vectors and botnet mitigation is a plus.

Benefits

Fully remote work with flexible hours
Global, distributed team with highly experienced engineers
High-impact systems serving millions of users at global scale
Modern engineering culture focused on fast iteration and continuous delivery
Opportunity to work on cutting-edge infrastructure, security, and AI-driven systems
Flat structure with direct collaboration across engineering teams
Competitive and mission-driven environment

DevOps pay context

Based on 1,225 disclosed DevOps salaries on RoleSuite, the role pays a median of $140K/year, with most offers between $115K and $172K (10th–90th percentile: $100K–$210K).
