Principal Software Engineer, Rack-Scale System Software — CSP Engagements
NVI*** · Austin, TX, United States
Austin, TX, United States · 272,000-431,250 USD/yearly · Remote
Remuneration
272,000-431,250 USD/yearly
Location
Austin, TX, United States
Visa sponsorship
Sponsors visa
Job summary
We're looking for a Principal Software Engineer to join our CSP Engagements team as the technical focal point for rack-scale system SW/FW, working with CSP engineering teams to ensure they can deploy, monitor, and operate these systems reliably at fleet scale.
Benefits
Applications for this job will be accepted at least until June 30, 2026. This posting is for an existing vacancy. NVIDIA uses AI
Qualifications
What we need to see:
- 15+ years of experience in system software, platform firmware, or large-scale distributed systems engineering
- BS or MS in Computer Science, Electrical Engineering, or related field (or equivalent experience)
- Deep understanding of rack-scale system software challenges: multi-component coordination, error propagation, health monitoring, and serviceability / reliability
- Experience with fabric management software, cluster management, or system-level orchestration frameworks
- Familiarity with firmware architectures and update lifecycle management (multi-component update sequencing, rollback, recovery)
- Understanding of error handling and recovery design patterns in distributed systems: fault isolation, retry policies, graceful degradation
- Experience with health monitoring and telemetry systems: health scoring, event correlation, API design for fleet-level observability
Responsibilities
In this role, you will collaborate with NVI***'s cross-functional rack-scale system SW/FW engineering teams with dedicated CSP-facing technical leadership.
You will drive work streams with CSP engineering teams to build shared understanding of the architecture, incorporate their operational feedback, and ensure integration readiness.
What you'll be doing:
- are reflected in system software and firmware development
- Identify cross-CSP patterns in rack-scale SW/FW issues, error handling behavior, and system configuration practices, and drive documentation, tooling, and test strategy improvements as a result
- Collaborate with execution teams on left-shift strategy, ensuring customer-side SW/FW integration work is identified early and completed ahead of hardware availability
- Make critical technical decisions on rack-scale system SW/FW tradeoffs and mitigate execution risks through early engagement with CSP engineering teams
Skills
Communication, Leadership
Degrees
Associate, Bachelor
Industry
Energy, Gaming
Company size
Enterprise, SMB