Aignostics

Engineering Manager - Machine Learning (d/f/m)

Job description:

Why us?

We believe that AI has the potential to revolutionize how cancer and other complex diseases are diagnosed and treated. We also believe that AI is a tool, not an identity – without access to high quality data and a scientifically rigorous, transparent approach to model development, AI is just a buzzword. That's where we come in.

Aignostics is a spin-off from one of Europe's largest and most prestigious university hospitals (Charité), with employees in Berlin and New York. We have received over $50M in funding from leading investors and are a growing team of over 100 interdisciplinary professionals. We work with academic partners as well as leading global life sciences companies.

As an ML Engineering Team Lead at Aignostics, you will lead a high-performing team focused on building large-scale distributed training infrastructure and workflows using cutting-edge technologies for digital pathology, powering our state-of-the-art Foundational Model development. This is a hands‑on leadership role where you'll spend approximately 50% of your time on technical contributions while guiding your team to push the boundaries of machine learning for cancer research and diagnostics. You'll own the full employee lifecycle for your team, drive technical roadmapping, and ensure operational excellence while fostering a culture of autonomy and innovation.

At Aignostics, we believe that fighting cancer is a job for people of all identities, backgrounds, and cultures. We value and celebrate diversity and inclusion and are committed to offering equal employment and promotion opportunities for all applicants and employees. Applicants will be considered regardless of their age, disability, ethnicity, race, gender identity or expression, sexual orientation, religion, etc. We thrive through collaboration and believe the more inclusive we are, the better our work.

Where your expertise is needed

People & Team Leadership

Build and scale a high-performing team capable of tackling complex distributed ML challenges

Own the full employee lifecycle: recruiting, onboarding, performance management, career development, and retention

Empower your team members and help them grow in autonomy and technical expertise

Mentor engineers at all levels, fostering a culture of continuous learning and psychological safety

Create an inclusive environment where diverse perspectives drive innovation

Strategic & Operational Management

Define and execute technical roadmaps aligned with company objectives and product needs

Lead resource allocation and capacity planning to balance team workload and business priorities

Own FinOps responsibilities: optimize cloud costs, track spending, and ensure efficient resource utilization

Ensure operational readiness through monitoring, incident response protocols, and system reliability practices

Establish and track KPIs for team performance, system efficiency and health

Technical Leadership

Design, develop, and maintain robust large-scale distributed training pipelines and ML infrastructure using cutting‑edge technologies

Lead architecture decisions for distributed systems that enable efficient model development at scale

Hands‑on contribution to critical technical challenges, including optimization of training pipelines and infrastructure

Drive technical excellence through code reviews and architectural guidance

Stay at the forefront of distributed training technologies and bring innovation to the team

Cross‑functional Collaboration

Partner closely with Product teams to translate business requirements into technical solutions

Collaborate with (senior) Research Scientists to enable scalable model development and experimentation

Work with Platform Engineering to ensure robust infrastructure and tooling

Build strong relationships across engineering teams to drive alignment and knowledge sharing

Communicate technical concepts effectively to both technical and non‑technical stakeholders

What we are looking for

Required Skills

Bachelor's or Master's degree in Computer Science, Engineering, Mathematics, or a related field.

6+ years of software engineering or ML engineering experience, with at least 2 years in a technical leadership or team lead role

Proven track record of building and leading high‑performing engineering teams. Experience guiding projects across the whole Software Development Life Cycle, from requirements through design to implementation, deployment and maintenance.

Deep understanding of fundamental Machine Learning concepts and principles, and familiarity with advanced model optimization techniques (such as distillation, graph optimization, and quantization)

Significant experience with large‑scale distributed training systems and frameworks (especially PyTorch and NCCL). Familiarity with GPUs, distributed systems, parallel computing and scaling laws.

Advanced programming skills in Python; experience in performance‑critical languages (C/C++ or CUDA) is a plus

Familiarity with MLOps/DevOps best practices, including CI/CD, Docker, Kubernetes, and observability, as well as cloud platforms (GCP, AWS, or Azure) and infrastructure‑as‑code

Experience with Linux, version control, and container technologies

Demonstrated ability in resource allocation, capacity planning, and FinOps principles

Excellent problem‑solving and data‑driven decision‑making skills in ambiguous situations

Leadership & Soft Skills

Effective communication and stakeholder management skills

Ability to give constructive feedback and navigate difficult conversations

Proven people leadership skills with experience managing the full employee lifecycle

Strategic thinking with ability to balance short‑term execution and long‑term vision

Experience with agile methodologies and iterative development processes

Proven ability to influence without authority and build consensus across teams

Track record of empowering team members and fostering autonomy

Ideally, you also have

Experience with production systems in regulated or healthcare environments, and familiarity with medical device standards (ISO 13485)

Experience working with biomedical or image data

Hands‑on experience with Google Kubernetes Engine, SLURM, and the Ray distributed computing framework

Experience with an advanced ML stack (TorchDynamo, JAX, TensorRT)

Familiarity with Information Security standards (ISO 27001) in software development

Experience with FinOps tools and cloud cost optimization strategies

Demonstrated experience with leveraging LLM/Agentic systems to accelerate development

We are still keen to hear from you if you don't match all the above points! Our needs are diverse and growing, and you are encouraged to apply if you have any combination of these skills. The recruitment process is a comparative exercise, and decisions will be made based on the applications under review at the time.

Our offer

Join a purpose‑driven startup: We are working collectively to fight cancer and improve patient outcomes. Come help us make a difference!

Cutting‑edge AI research and development, with involvement of Charité, TU Berlin and our other partners

Work with a welcoming, diverse, and highly international team of colleagues

Opportunity to shape the technical direction and grow into broader leadership roles

Expand your skills with our annual Learning & Development budget of €1,000 (plus 2 L&D days), language classes, and internal development programs

Access to leadership development programs and executive coaching

Flexible working hours and teleworking policy

Enjoy your well‑deserved time off within our 30 paid vacation days per year

We are family & pet friendly and support flexible parental leave options

Pick a subsidized membership of your choice among public transport, sports, and well‑being

Enjoy our social gatherings, lunches, and off‑site events for a fun and inclusive work environment

Optional company pension scheme

Join us to make a difference!


Job information

  • Publication date: 02 Mar 2026
  • Location: Berlin
  • Type: Full-time
  • Work model: On-site
  • Category:
  • Experience: 2+ years
  • Employment: Salaried employee
