
Staff ML Software Engineer (L6) — Platform Systems, AIMS Engineering

Net*** · Los Gatos, CA, United States
Los Gatos, CA, United States · 600,000-1,066,000 USD/yearly · Remote
Remuneration
600,000-1,066,000 USD/yearly
Location
Los Gatos, CA, United States
Visa sponsorship
Sponsors visa

Job summary

At Net***, our mission is to entertain the world. Together, we are writing the next episode - pushing the boundaries of storytelling, global fandom and making the unimaginable a reality. We are a dream team obsessed with the uncomfortable excitement of discovering what happens when you merge creativity, intuition and cutting-edge technology.

Benefits

  • ... and Life and Serious Injury
  • We also offer paid leave of absence programs.
  • Full-time hourly employees accrue 35 days annually for paid time off to be used ...
  • Full-time salaried employees are immediately entitled to flexible time off.
  • Netflix is a unique culture and environment.
  • Inclusion is a Netflix value and we strive to host a meaningful interview experi...
  • We are an equal-opportunity employer and celebrate diversity, recognizing that d...
  • We approach diversity and inclusion seriously and thoughtfully.

About the team

  • AI for Member Systems (AIMS) runs the AI systems behind every recommendation, search result, and personalized experience for 300M+ members.
  • Platform Systems is the engineering foundation of AIMS, owning reliability, scalability, cost efficiency, and developer experience across the org.

Qualifications

  • Experience with compute and cost optimization for AI/ML workloads at scale, including capacity management and efficiency tooling
  • Hands-on experience building GenAI-powered tooling for operational automation, root cause analysis, or anomaly detection in AI/ML systems
  • Experience building developer tooling or platform abstractions that improve AI/ML practitioner velocity
  • Applied experience in personalization domains such as recommendation systems, search, or discovery
  • Familiarity with modern AI/ML infrastructure patterns including feature stores, model serving platforms, and experiment frameworks

Compensation

  • Generally, our compensation structure consists solely of an annual salary; we do not have bonuses.
  • You choose each year how much of your compensation you want in salary versus stock options.
  • To determine your personal top of market compensation, we rely on market indicators and consider your specific job family, background,

Responsibilities

  • Drive end-to-end migration of AIMS AI/ML systems onto a modern, Python-native platform, coordinating across multiple AIMS teams and external platform partners, with dozens of production models in flight
  • Build migration tooling and shared abstractions that reduce the cost of adoption for individual teams, so modernization does not require each team to solve the same problems independently
  • Own scalability across training throughput and data pipelines, ensuring AIMS AI/ML systems stay performant as model complexity and member traffic grow
  • Identify and drive cost optimization across AIMS training and serving infrastructure, developing frameworks and tooling that make compute efficiency a first-class concern, not an afterthought
  • Architect reliability improvements across the AIMS AI/ML stack, reducing toil, improving on-call ergonomics, and setting the standard for operational excellence across the org
  • Prototype and productionize GenAI-powered tooling for anomaly detection, root cause analysis, and operational automation, applying LLM-based systems to the problems of AI/ML reliability and cost at scale
  • Surface systemic cost, reliability, and migration gaps by embedding with AI/ML teams across AIMS, and translate their friction into concrete engineering investments with org-wide leverage
  • Set technical standards for the modernized stack and raise the engineering bar across AIMS through design reviews, architectural guidance, and leading by example

What We're Looking For

  • Significant experience designing, building, and operating large-scale production AI/ML systems, including training pipelines and familiarity with model serving and online inference at high-traffic scale
  • Hands-on experience migrating production AI/ML systems across technology generations; you have done this before and understand where it goes wrong
  • Strong software engineering fundamentals with deep Python expertise and working proficiency in at least one JVM language (Scala or Java)

Degrees

Associate

Work schedule

On-call

Industry

Automotive, Defense, Energy, Media, Oil-gas