ML Kernel Performance Engineer, Edge AI and Science
Amazon · Sunnyvale, CA, United States
Sunnyvale, CA, United States · 165,200-223,600 USD/yearly · Onsite
Remuneration
165,200-223,600 USD/yearly
Location
Sunnyvale, CA, United States
Visa sponsorship
Sponsors visa
Job summary
Amazon Devices is an inventive research and development company that designs and engineers high-profile consumer products like the Kindle family, Fire Tablets, Fire TV, Health & Wellness devices, Amazon Echo, and Astro. We are building the next generation of edge AI capabilities through our advanced compression platform and custom neural accelerator silicon.
Benefits
Learn more at https://amazon.jobs/en/
USA, CA, Sunnyvale - 165,200.00 - 223,600.00 USD annually
Qualifications
- 3+ years of non-internship professional software development experience
- 2+ years of non-internship experience designing or architecting new and existing systems (design patterns, reliability, and scaling)
- Knowledge of Python and/or C++ programming
- Experience with CUDA kernels or ML/low-level kernels, or experience in developing and deploying LLMs in production on GPUs, Neuron, TPU or other AI acceleration hardware
Preferred qualifications
- Bachelor's degree in computer science or equivalent
- 3+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Experience with GPU kernel optimization and GPGPU computing (CUDA, Triton, SYCL, or ROCm)
- Proficiency in low-level performance optimization for GPUs
- Understanding of GPU memory hierarchies and optimization strategies (shared memory, L1/L2 cache, register pressure, memory coalescing)
- Experience developing high-performance libraries for ML or HPC applications
- Knowledge of ML frameworks (PyTorch, TensorFlow) and their GPU backends
Responsibilities
- Design and implement high-performance CUDA and Triton kernels for quantization-aware training, sparse matrix operations, and low-bit inference on modern GPU accelerators
- Analyze and optimize kernel-level performance for compression training workloads, conducting detailed performance analysis using profiling
- Criminal history may have a direct, adverse, and negative relationship with some of the material job duties of this position.
- These include the
- Pursuant to the Los Angeles County Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
- Our inclusive culture empowers Amazonians to deliver the best results for our customers.
- If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
- The base salary range for this position is listed below.
- Your Amazon package will include sign-on payments and restricted stock units (RSUs).
- Final compensation will be determined based on factors including experience,
Skills
Customer Service
Degrees
Associate, Bachelor, Degree
Work schedule
Shift
Industry
Automotive, Energy, Insurance, Public-sector
Company size
SMB, Startup