We are now looking for a Senior GPU Architect, Deep Learning! The NVIDIA GPU Architecture group is seeking a world-class hardware architect to help define and drive future GPU architectures for deep learning and accelerated computing. In this role, you will work at the earliest stages of product definition, with a primary focus on defining architectural features for future GPUs, while also contributing to relevant microarchitectural direction and tradeoffs. You will help drive new capabilities from concept through modeling, design collaboration, validation, and silicon readiness.
A key part of NVIDIA’s strength is our ability to translate workload insight into differentiated hardware. We are constantly looking for ways to improve our GPU architecture for training and inference workloads while advancing performance, efficiency, scalability, and programmability. In this position, you will focus primarily on defining new architectural features, while also shaping the supporting microarchitectural direction, evaluating design alternatives, and partnering closely with ASIC design, verification, performance, and software teams to bring new capabilities into future products.
What you'll be doing:
Define and architect new GPU hardware features for future deep learning and parallel processing workloads.
Drive microarchitectural exploration across key areas such as compute pipelines, memory hierarchy, data movement, synchronization, and performance efficiency.
Analyze workload behavior and translate bottlenecks into clear architectural requirements and hardware feature proposals.
Evaluate performance, power, area, complexity, and programmability tradeoffs for new architectural directions.
Develop and use functional and performance models to study new features and refine the architecture before implementation.
Work closely with RTL, design, verification, compiler, and software teams to ensure successful execution from architecture definition to productization.
Create clear architecture specifications, validation plans, and success criteria for the features you define.
Be ready to learn, dig deep, and work across the full stack when required - from workloads and models to RTL and silicon.
What we need to see:
BS, MS, or PhD in Computer Science, Electrical Engineering, Computer Engineering, or equivalent experience.
12+ years of relevant industry experience in GPU architecture, computer architecture, or other parallel processing architectures.
Strong background in hardware architecture and microarchitecture.
Experience defining and evaluating architectural features, with a solid understanding of performance, power, and area tradeoffs.
Strong programming and scripting skills in C, C++, and Python.
Experience with architectural modeling, simulation, or performance analysis.
Background in parallel computing, memory systems, high-performance computing, or deep learning acceleration.
Strong communication skills and the ability to drive technical work across distributed, interdisciplinary teams.
Ways to stand out from the crowd:
Deep understanding of modern GPU architecture and the interaction between hardware and AI workloads.
Experience with memory subsystem architecture, interconnects, coherence, scheduling, or execution pipelines.
Experience with pre-silicon performance studies, workload characterization, and architectural correlation.
Familiarity with training and inference behavior for large-scale deep learning models.
Experience with silicon bring-up, debug, or post-silicon analysis.
NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative, autonomous, and love a challenge, consider joining our GPU Architecture team and helping us build the next generation of AI computing platforms.
Based on 226 disclosed Semiconductor salaries on RoleSuite, the role pays a median of $178K/year, with most offers between $148K and $205K (10th–90th percentile: $132K–$236K).