Agile Robots SE

Senior ML Platform Engineer (m/f/d)

Agile Robots SE, Work from home

Job description:

About the role

As a Senior ML Platform Engineer, you will take ownership of designing, building, and maintaining the infrastructure that powers our Generative AI framework. You will architect scalable, secure ML environments across development, testing, and production, and automate the entire ML lifecycle, from data ingestion and model training to deployment and monitoring.

Your Responsibilities

  • Design, build, and maintain scalable and secure ML infrastructure across development, testing, and production environments
  • Automate and optimize the ML lifecycle
  • Architect and manage the continuous integration and deployment pipelines and release processes using tools such as Kubeflow, MLflow, or custom Kubernetes solutions
  • Implement monitoring systems for data drift, model performance, and infrastructure health
  • Build and enhance ML engineering tooling for model development, the model workbench, model training, model monitoring, and model serving
  • Work closely with data scientists and ML engineers to ensure reproducibility, scalability, and production-readiness of models
  • Design and maintain pipelines for feature extraction, transformation, and storage using a feature store or custom solutions
  • Ensure data quality, consistency, and lineage throughout the ML pipeline
  • Ensure responsible use of data, model explainability, and auditability in line with organizational and legal standards

Qualifications

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
  • 5+ years of experience in DevOps, software engineering, or infrastructure, with 2–3 years in MLOps or production ML systems
  • Proficiency in Python; Bash or Go is a plus
  • Experience with model versioning, monitoring, and lifecycle management
  • Strong experience with ML tooling (MLflow, Kubeflow, Airflow, SageMaker, etc.)
  • Experience with Infrastructure as Code (Terraform; Pulumi is a plus) and CI/CD pipelines (GitLab CI, Jenkins, ArgoCD)
  • Cloud platforms (AWS, GCP, or Azure)
  • Proven track record with GPU-accelerated systems at scale
  • Experience with cluster and cloud compute technologies (SLURM, Lustre, Kubernetes)
  • Benchmarking expertise (software & hardware)
  • Strong problem-solving and debugging skills, with the ability to lead infrastructure initiatives independently
  • Excellent communication and cross-functional collaboration skills

What we offer

  • A dynamic and diverse team in which your contributions are reflected directly in our products and used by our international customer base
  • Flat hierarchies and short decision-making processes
  • Exciting and varied tasks across our product portfolio
  • Excellent working environment, modern office space, and flexible working hours with the option of mobile working

We are proud of our diversity and welcome your application regardless of gender and sexual identity, nationality, ethnicity, religion, age, or disability.

NOTE: Please mention Fuchsjobs as the source of your application.

Job information

  • Type: Full-time
  • Work model: On-site
  • Category:
  • Experience: 2+ years
  • Employment: Employee
  • Publication date: 01 Dec 2025
  • Location: Work from home
