Updated 2026-06-18 21:00 UTC · © 2025–2026 RoleSuite

Senior HPC Cluster Engineer

Jobgether · France

This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Senior HPC Cluster Engineer based in France.

This role sits at the core of a next-generation AI cloud infrastructure environment, focused on building and optimizing large-scale high-performance computing systems. You will work on complex GPU and InfiniBand cluster architectures that power AI and HPC workloads at scale. The position involves deep system-level engineering, performance tuning, and hands-on troubleshooting across distributed infrastructure. You will contribute directly to improving reliability, efficiency, and scalability of compute platforms used for advanced AI and data-intensive applications. Working in a highly technical engineering culture, you will collaborate with experts across systems, networking, and virtualization. This is a high-impact role where your work directly influences the performance of large-scale cloud and AI workloads.

Accountabilities:

Own the performance optimization and reliability of large-scale GPU clusters and InfiniBand networking environments supporting HPC workloads:

  • Tune and optimize GPU cluster performance and InfiniBand fabric efficiency to ensure high throughput and low-latency computing.
  • Diagnose, troubleshoot, and resolve complex system-level issues across GPU, network, and compute layers.
  • Integrate and validate new hardware components into existing HPC infrastructure, including support for GPUs and related accelerators.
  • Work across virtualization and orchestration layers (KVM/QEMU, Kubernetes) to ensure seamless hardware utilization and deployment.
  • Develop and improve automation for monitoring, fault detection, and proactive remediation in distributed compute environments.
  • Configure, manage, and maintain GPU devices, PCIe systems, and InfiniBand networks to ensure stability and scalability.

Requirements:

We are looking for a highly experienced systems engineer with strong expertise in HPC and low-level infrastructure:

  • 5+ years of experience in system-level software engineering with a focus on performance, scalability, or infrastructure optimization.
  • 3+ years of hands-on experience with Linux systems administration, debugging, and performance tuning.
  • Strong understanding of server and hardware architecture including PCIe, NICs, GPUs, and Linux kernel-level behavior.
  • Proficiency in C, C++, Go, or Python for systems or performance-oriented development.
  • Experience working with distributed or HPC environments and solving complex infrastructure challenges.
  • Strong analytical and problem-solving skills with the ability to work on deep technical issues independently.
  • Familiarity with GPU clusters, InfiniBand networking, and large-scale compute systems is highly desirable.
  • Experience with KVM/QEMU or containerized orchestration environments is a plus.
  • Exposure to distributed computing frameworks or libraries such as MPI or NCCL is advantageous.

Benefits:

  • Competitive compensation package.
  • Career development and continuous learning opportunities in advanced AI and HPC systems.
  • Flexible working arrangements and remote-friendly culture across Europe.
  • Opportunity to work on cutting-edge AI infrastructure and large-scale distributed systems.
  • Collaborative engineering environment with high technical ownership.
  • Exposure to international teams and world-class engineering challenges.

DevOps pay context

Based on 1,233 disclosed DevOps salaries on RoleSuite, the role pays a median of $142K/year, with most offers between $115K and $174K (10th–90th percentile: $100K–$211K).


