Amazon Web Services (AWS)

Job description:

Senior Applied Scientist – Systems for ML Inference and Training Optimization, Deep Science for Systems and Services

We are seeking an exceptional Senior Applied Scientist specializing in ML systems and in training and inference optimization to join Deep Science for Systems and Services (DS3). This role requires deep expertise in performance engineering, kernel development, distributed systems optimization, and AI workload optimization across heterogeneous compute platforms. You will invent and implement novel optimization techniques that directly impact the performance and cost‑efficiency of ML training and inference for AWS customers worldwide.

Key Job Responsibilities

  • Design and implement novel kernel‑level optimizations for ML inference and training workloads, including custom CUDA kernels, PTX‑level optimizations, and cross‑platform acceleration for CUDA and the AWS Neuron SDK.
  • Drive 2‑10× performance improvements in latency, throughput, and memory efficiency for production ML inference and training systems through systematic profiling, analysis, and optimization.
  • Develop and port high‑performance ML operators across GPUs, AWS Inferentia/Trainium, and emerging AI accelerators, ensuring optimal performance on each platform.
  • Lead the design, implementation, and delivery of scientifically complex optimization solutions that directly improve customer experience and reduce AWS operational costs at scale.
  • Produce technical documentation and internal research reports demonstrating the correctness, efficiency, and scalability of your optimizations; contribute to external publications when aligned with business needs.
  • Influence your team’s technical direction and scientific roadmap; build consensus across engineering and science teams on optimization strategies and architectural decisions.
  • Actively mentor junior scientists and engineers on performance engineering best practices, kernel development, and systems‑level optimization techniques.

Basic Qualifications

  • PhD in Computer Science, Computer Engineering, or related field, OR Master’s degree with 8+ additional years of relevant research/industry experience.
  • 5+ years of hands‑on experience in performance optimization and systems programming for AI/ML workloads.
  • Expert‑level proficiency in CUDA programming and GPU architecture, with demonstrated ability to write high‑performance custom kernels.
  • Proven track record of delivering measurable performance improvements (2× or greater) in production systems.
  • Strong C/C++ programming skills with experience in performance profiling tools such as NVIDIA Nsight, Linux Perf, or similar diagnostic frameworks.

Preferred Qualifications

  • Experience optimizing inference and/or training for large language models (LLMs) and transformer‑based architectures, including MoE models, at scale.
  • Hands‑on experience with AWS Neuron SDK or other non‑NVIDIA AI acceleration platforms.
  • Track record of optimizing ML workloads across diverse hardware: embedded devices (ARM Cortex, DSPs, NPUs) and data center GPUs (NVIDIA Ampere/Hopper).
  • Experience with low‑level optimization techniques including assembly‑level tuning (NVIDIA PTX, x86/ARM assembly) and cross‑platform kernel porting.
  • Experience leading performance optimization initiatives that resulted in significant cost savings or multi‑million‑dollar business impact.
  • Proven ability to mentor and train engineers in performance engineering and low‑level optimization (5+ team members or workshop instruction).
  • Entrepreneurial experience or track record of driving technical vision in startup, co‑founder, or product development environments.

About AWS

AWS values diverse experiences and encourages candidates to apply even if they do not meet every qualification. Amazon is an equal‑opportunity employer. We uphold privacy and data security as a long‑standing priority. For more information, see our Privacy Notice.

Job information

  • Publication date: 02 Jan 2026
  • Location: not available
  • Type: Full‑time
  • Work model: On‑site
  • Experience: 2+ years
  • Employment status: Employee
