Senior Site Reliability Engineer
This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Senior Site Reliability Engineer based in Brazil.
This role sits at the core of large-scale, internet-facing systems where performance, reliability, and security are mission-critical. You will help design and operate infrastructure that serves millions of users and handles massive request volumes across distributed environments. The position combines deep systems engineering with hands-on software development, focusing on building resilient, observable, and cost-efficient platforms. You will work across cloud-native stacks, collaborating with engineering teams to improve uptime, deployment velocity, and system robustness. The environment is fast-paced, highly technical, and driven by continuous iteration and data-based decision-making. This is a role for engineers who enjoy solving complex distributed systems challenges at scale while directly impacting end-user experience.
Accountabilities:
- Design, build, and operate large-scale distributed systems supporting high-throughput, low-latency services across multi-cloud environments.
- Improve system reliability, scalability, performance, and cost efficiency across infrastructure, applications, and networking layers.
- Develop and maintain Kubernetes-based infrastructure and automation to support engineering teams and production workloads.
- Implement and enhance observability solutions, including monitoring, logging, tracing, and alerting across systems and services.
- Work closely with CI/CD pipelines and deployment systems to ensure safe, fast, and reliable software delivery.
- Diagnose and resolve complex production issues spanning infrastructure, networking, and application layers.
- Contribute to architectural decisions involving distributed systems, security, and high-availability design.
Requirements:
- 6+ years of experience in Site Reliability Engineering, DevOps, or backend/infrastructure engineering roles.
- Strong expertise in Kubernetes and cloud-native ecosystems.
- Solid software engineering background with proficiency in at least one language such as Go, Python, JavaScript, C++, or Rust.
- Deep experience with observability tools and practices (metrics, logging, tracing, alerting).
- Strong understanding of networking concepts, including proxies, CDNs (e.g., Cloudflare), load balancing, and WAFs.
- Hands-on experience with multi-cloud environments and virtual networking architectures.
- Strong CI/CD experience and familiarity with modern deployment pipelines.
- Experience working with distributed systems, queue-based architectures, and sharding patterns.
- Strong problem-solving skills with a systems-thinking mindset and attention to reliability and security.
- Exposure to security concepts such as attack vectors and botnet mitigation is a plus.
Benefits:
- Fully remote work with flexible hours
- Global, distributed team with highly experienced engineers
- High-impact systems serving millions of users at global scale
- Modern engineering culture focused on fast iteration and continuous delivery
- Opportunity to work on cutting-edge infrastructure, security, and AI-driven systems
- Flat structure with direct collaboration across engineering teams
- Competitive and mission-driven environment
DevOps pay context
Based on 1,225 disclosed DevOps salaries on RoleSuite, the role pays a median of $140K/year, with most offers between $115K and $172K (10th–90th percentile: $100K–$210K).