Engineer II - Site Reliability (Hybrid, IND)

CrowdStrike · India - Bangalore

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed — we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.

About This Role:

CrowdStrike's engineering organization depends on shared infrastructure platforms that power critical product capabilities. The Temporal Platform team owns a production workflow orchestration system that serves engineering teams across the organization.

You'll help operate and evolve our internal Temporal infrastructure, a stateful, distributed system running on Kubernetes across multiple regions. The work spans day-to-day operations, automation, performance tuning and capacity planning. You'll learn how to run complex infrastructure at scale while working alongside experienced platform engineers who will help you grow into broader ownership over time.

This is a growth-oriented role. We're looking for someone early in their platform engineering journey who's ready to build operational depth, develop automation skills and understand what it takes to run production infrastructure that teams depend on.

What You'll Do:

Operate Temporal infrastructure in production - deploy updates, monitor cluster health, respond to alerts, and maintain availability across multiple environments using Helm, Kubernetes and FluxCD
Automate operational work - write scripts and workflows that make deployments, upgrades, scaling operations, and troubleshooting repeatable and safe; reduce manual toil over time
Support capacity planning and performance tuning - track resource utilization, identify bottlenecks, tune configuration for better performance and contribute to capacity forecasts under guidance
Build observability - instrument services with metrics and logs, improve dashboards, and refine alerting so the team can catch problems before they impact users
Contribute to the on-call rotation - participate in incident response, learn how to triage and escalate issues effectively, and write runbooks that help the next person on call
Learn GitOps workflows - work with FluxCD to manage infrastructure-as-code, submit pull requests for configuration changes, and understand how declarative deployment pipelines work
Troubleshoot operational issues - investigate deployment failures, connectivity problems, and performance degradations, and work with teammates to determine root cause and preventive fixes
Partner with consuming teams - help internal engineers onboard to Temporal, answer questions, debug integration issues, and contribute to documentation that makes adoption easier
Grow your infrastructure skills - work with PostgreSQL, AWS/GCP, Kubernetes networking, Helm chart management, certificate rotation, secret management and distributed systems operations under mentorship

What You'll Need:

3+ years in DevOps, SRE, platform engineering or infrastructure roles - you've worked on production systems and understand the basics of running services reliably
Kubernetes fundamentals - you've deployed services to Kubernetes, understand pods/deployments/services, and can debug basic cluster issues; you don't need deep expertise but should be comfortable navigating kubectl and reviewing YAML
Helm experience - you've used Helm to deploy applications, understand charts and values files, and can troubleshoot failed releases
Some infrastructure-as-code experience - you've used tools like Terraform, Ansible, or GitOps workflows (FluxCD, ArgoCD) to manage infrastructure declaratively rather than clicking in consoles
Cloud platform exposure - you've worked with AWS or GCP in some capacity; you understand basic compute, networking, and storage primitives but don't need to be an expert
Scripting ability - you can write scripts (Bash, Python, Go) to automate repetitive tasks and build simple tooling
Basic understanding of stateful systems - you've worked with databases (PostgreSQL preferred) or other persistent services and understand backups, schema management, and connection handling at a foundational level
Willingness to learn and ask for help - you're comfortable saying "I don't know" and diving into unfamiliar territory with support from teammates

What Success Looks Like:

In your first few months:

You can deploy Temporal upgrades across environments with confidence
You've automated at least one recurring operational task
You respond to on-call pages effectively and write clear incident summaries
You've contributed meaningful improvements to dashboards or runbooks
Internal teams reach out to you directly for help with Temporal questions

Over your first year:

You own end-to-end operations for specific Temporal components or environments
You proactively identify performance issues and propose tuning strategies
You're contributing to capacity planning and cost optimization discussions
You're helping onboard new engineers to the team's operational practices

Bonus Points:

Experience operating workflow orchestration platforms (Temporal, Airflow, Prefect, Cadence)
Experience with FluxCD or ArgoCD in production
Exposure to distributed tracing or observability platforms
Go experience (our services and many consuming applications are written in Go)
Previous work on internal platform teams or DevOps infrastructure roles
Understanding of PostgreSQL performance tuning and operational best practices
Familiarity with multi-region infrastructure deployment and failover patterns

Benefits of Working at CrowdStrike:

Market leader in compensation and equity awards
Comprehensive physical and mental wellness programs
Competitive vacation and holidays to recharge
Paid parental and adoption leaves
Professional development opportunities for all employees regardless of level or role
Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
Vibrant office culture with world-class amenities
Great Place to Work Certified™ across the globe

CrowdStrike is proud to be an equal opportunity employer. We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program.

CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements.

If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at [email protected] for further assistance.

