Senior Site Reliability Engineer (SRE)
This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Senior Site Reliability Engineer (SRE) based in Italy.
This role sits at the heart of large-scale cloud infrastructure, ensuring that highly distributed systems remain reliable, scalable, and performant under demanding production workloads.
You will be responsible for maintaining and improving the stability of critical services that support modern AI and cloud-native platforms.
The environment is fast-paced and engineering-driven, with automation, resilience, and operational excellence as core priorities.
You will work closely with software, infrastructure, and platform teams to design systems that can withstand high traffic and complex distributed workloads.
The role combines hands-on engineering with strategic improvements to CI/CD pipelines, observability, and system reliability.
You will contribute to shaping infrastructure that enables seamless deployment and operation of advanced cloud services at global scale.
Accountabilities:
- Maintain high system availability by ensuring fault tolerance, monitoring, and rapid incident response across production services.
- Design, implement, and optimize scalable infrastructure solutions using modern cloud-native technologies.
- Improve and evolve CI/CD pipelines to enable safe, efficient, and automated software delivery.
- Collaborate with engineering teams to troubleshoot complex system issues across compute, networking, and storage layers.
- Apply infrastructure-as-code practices using tools such as Terraform, Ansible, or similar to manage and standardize environments.
- Support containerized environments and orchestration platforms such as Docker, Kubernetes, and Helm.
- Contribute to operational best practices, including observability, alerting, and performance tuning.
Requirements:
- Strong programming skills in languages such as Go, Python, or C++, with a solid foundation in algorithms and data structures.
- Deep understanding of Unix/Linux systems, networking fundamentals, and distributed system behavior.
- Hands-on experience with containerization and orchestration tools such as Docker and Kubernetes.
- Practical experience with infrastructure-as-code and configuration management tools (Terraform, Ansible, Salt, or similar).
- Familiarity with CI/CD systems and modern DevOps practices.
- Experience working with or supporting high-load distributed systems in production environments.
- Strong problem-solving mindset with the ability to diagnose and resolve complex technical issues.
- Excellent communication and collaboration skills in cross-functional engineering teams.
Benefits:
- Competitive compensation package
- Career growth and continuous learning opportunities
- High degree of autonomy, flexibility, and ownership
- Collaborative and innovation-focused engineering culture
- Opportunity to work on large-scale, impactful cloud and AI infrastructure
- International environment with highly skilled engineering teams
DevOps pay context
Based on 1,094 disclosed DevOps salaries on RoleSuite, the role pays a median of $142K/year, with most offers between $115K and $176K (10th–90th percentile: $99K–$210K).