DevOps Engineer - AI Automation Platform (Remote, China)

Bjakcareer · China

BJAK’s automation systems support customer journeys across quote generation, policy issuance, claims, payments, renewals and insurer integrations. Reliability matters because system issues quickly become customer and business issues.

We're looking for a DevOps Engineer based in China to strengthen the infrastructure, deployment systems and operational reliability behind BJAK’s AI automation platform.

This is a fully remote position where you'll collaborate closely with our Malaysia-based engineering, product and operations teams to ensure systems are stable, scalable and safe to operate.

The Mission

Help BJAK ship AI automation systems faster and operate them with fewer failures by building reliable infrastructure, deployment pipelines and production operations practices.

What You’ll Own

  • Manage cloud infrastructure, environments and deployment pipelines for production systems.

  • Design and improve CI/CD processes to make deployments safer, faster and more reliable.

  • Improve monitoring, alerting, logging and system observability across services.

  • Own uptime, incident response workflows and post-incident root cause analysis.

  • Work with engineers to reduce production risk through better release practices and infrastructure design.

  • Improve access control, secrets management and infrastructure security basics.

  • Support infrastructure for business-critical workflows across multiple countries and services.

  • Automate operational tasks to reduce manual intervention and human error.

  • Continuously improve system reliability, scalability and operational discipline.

What We're Looking For

  • Experience in DevOps, SRE, cloud infrastructure or platform engineering roles.

  • Strong understanding of CI/CD pipelines, cloud platforms and deployment strategies.

  • Experience with production monitoring, alerting and incident management.

  • Ability to troubleshoot infrastructure and production issues calmly and systematically.

  • Practical understanding of reliability, scalability, cost and security trade-offs.

  • Experience supporting business-critical systems in production environments.

  • Strong ownership mindset during incidents and operational failures.

  • Comfortable working closely with engineering teams in fast-moving environments.

  • Strong attention to detail and disciplined approach to production changes.

  • Low ego, structured thinking and a focus on long-term system stability.

Bonus Points

  • Experience with AWS, GCP, Azure or similar cloud platforms.

  • Experience with Kubernetes, Docker or container orchestration systems.

  • Experience with infrastructure-as-code tools (Terraform, Ansible, etc.).

  • Experience with observability stacks (Prometheus, Grafana, ELK, Datadog, etc.).

  • Experience supporting high-traffic or distributed production systems.

  • Experience with zero-downtime deployments and blue-green/canary releases.

  • Knowledge of security best practices in cloud and infrastructure environments.

  • Experience working in fintech, insurance or other regulated industries.

  • Contributions to platform reliability or infrastructure standardization efforts.

The Kind of Builder We Want

  • Calm under pressure, especially during production incidents.

  • Hands-on with infrastructure and not afraid of low-level system details.

  • Thinks in failure modes, risks and recovery paths.

  • Careful and deliberate when making production changes.

  • Strong focus on reliability, observability and operational discipline.

  • Actively prevents issues rather than just reacting to them.

  • Builds systems that engineers can deploy and operate with confidence.

This Role Is Not For

  • People who only react after systems fail instead of preventing issues.

  • Engineers who are careless with production access or deployment changes.

  • Individuals who ignore monitoring, alerting or operational discipline.

  • People who make risky infrastructure changes without proper analysis.

  • Candidates who cannot stay calm during production incidents.

Success in This Role

You'll be successful if you can:

  • Improve deployment safety, speed and reliability across systems.

  • Reduce production incidents and infrastructure-related failures.

  • Strengthen monitoring, alerting and system visibility.

  • Enable engineers to ship with confidence and lower operational risk.

  • Improve overall stability of BJAK’s AI automation platform as it scales.

Why Join BJAK

  • Build Reliable AI Infrastructure – Support systems powering end-to-end insurance automation.

  • High-Impact Engineering – Solve real-world reliability and scaling challenges.

  • Global Engineering Team – Work with experienced engineers across multiple countries.

  • Fully Remote – Work remotely from China while collaborating with our Malaysia-based teams.

  • International Exposure – Build systems used across Southeast Asian markets.

  • Learning & Development Budget – Support continuous technical growth and certifications.

  • High Ownership Environment – Strong autonomy over infrastructure and reliability practices.

  • Modern Engineering Culture – Focus on stability, speed and operational excellence.

  • Competitive Compensation – Attractive salary package based on experience and impact.

Interview Process

We assess infrastructure knowledge, incident thinking and production problem-solving ability. The process usually includes application review, two interviews and a technical scenario or systems discussion.
