Senior MLOps Platform Architect

Job Description:

Remote | B2B Contract | Europe

Role Overview

We are hiring a senior MLOps engineer who can build an entire AI platform infrastructure end-to-end. This is not a research role and not a standard ML Engineer role. If you haven’t designed production‑grade MLOps infrastructure, built CI/CD for ML, or deployed ML workloads on Kubernetes at scale, this role is not a fit.

You will design, build, and own the AWS‑based infrastructure, Kubernetes platform, CI/CD pipelines, and observability stack that support our AI models (Agentic AI, NLU, ASR, Voice Biometrics, TTS). You will be the technical owner of MLOps infrastructure decisions, patterns, and standards.

Key Responsibilities:

MLOps Platform Architecture (from scratch)

  • Design and build AWS-based AI/ML infrastructure using Terraform (required).
  • Define standards for security, automation, cost efficiency, and governance.
  • Architect infrastructure for ML workloads, GPU/accelerators, scaling, and high availability.

Kubernetes & Model Deployment

  • Architect, build, and operate production Kubernetes clusters.
  • Containerize and productize ML models (Docker, Helm).
  • Deploy latency‑sensitive and high‑throughput models (ASR/TTS/NLU/Agentic AI).
  • Ensure GPU and accelerator nodes are properly integrated and optimized.

CI/CD for Machine Learning

  • Build automated training, validation, and deployment pipelines (GitLab/Jenkins).
  • Implement canary, blue‑green, and automated rollback strategies.
  • Integrate MLOps lifecycle tools (MLflow, Kubeflow, SageMaker Model Registry, etc.).

Observability & Reliability

  • Implement full observability (Prometheus + Grafana).
  • Own uptime, performance, and reliability for ML production services.
  • Establish monitoring for latency, drift, model health, and infrastructure health.

Collaboration & Technical Leadership

  • Work closely with ML engineers, researchers, and data scientists.
  • Translate experimental models into production‑ready deployments.
  • Define best practices for MLOps across the company.

Requirements:

We’re looking for a senior engineer with a strong DevOps/SRE background who has worked extensively with ML systems in production. The ideal candidate brings a combination of infrastructure, automation, and hands‑on MLOps experience.

  • 5+ years in a Senior DevOps, SRE, or MLOps Engineering role supporting production environments.
  • Strong experience designing, building, and maintaining Kubernetes clusters in production.
  • Hands‑on expertise with Terraform (or similar IaC tools) to manage cloud infrastructure.
  • Solid programming skills in Python or Go for building automation, tooling, and ML workflows.
  • Proven experience creating and maintaining CI/CD pipelines (GitLab or Jenkins).
  • Practical experience deploying and supporting ML models in production (e.g., ASR, TTS, NLU, LLM/Agentic AI).
  • Familiarity with ML workflow orchestration tools such as Kubeflow, Apache Airflow, or similar.
  • Experience with experiment tracking and model registry tools (e.g., MLflow, SageMaker Model Registry).
  • Exposure to deploying models on GPU or specialized hardware (e.g., Inferentia, Trainium).
  • Solid understanding of cloud infrastructure on AWS, including networking, scaling, storage, and security best practices.
  • Experience with deployment tooling (Docker, Helm) and observability stacks (Prometheus, Grafana).

Ways to Know You’ll Succeed

  • You enjoy building platforms from the ground up and owning technical decisions.
  • You’re comfortable collaborating with ML engineers, researchers, and software teams to turn research into stable production systems.
  • You like solving performance, automation, and reliability challenges in distributed systems.
  • You bring a structured, pragmatic, and scalable approach to infrastructure design.
  • You are energetic and proactive, with a natural drive to take initiative and move things forward.
  • You enjoy working closely with people: researchers, ML engineers, cloud architects, and product teams.
  • You are comfortable sharing ideas openly, challenging assumptions, and contributing to technical discussions.
  • You have a collaborative mindset: you like to build together, not work in isolation.
  • You have a strong ownership mentality and enjoy taking responsibility for systems end‑to‑end.
  • You are curious, hands‑on, and motivated by solving complex technical challenges.
  • You are a clear communicator who can translate technical work into practical recommendations.
  • You thrive in a fast‑paced environment where you can experiment, improve, and shape how things are done.

What’s on Offer

  • Competitive fixed compensation based on experience and expertise.
  • Work on cutting‑edge AI systems used globally.
  • Dynamic, multi‑disciplinary teams engaged in digital transformation.
  • Remote‑first work model.
  • Long‑term B2B contract.
  • 20+ days paid time off.
  • Apple gear.
  • Training & development budget.

Diversity and Inclusion Commitment

We are dedicated to creating and sustaining an inclusive, respectful workplace for all, regardless of gender, ethnicity, or background. We actively encourage applicants of all identities and experience levels to apply and to bring their authentic selves to our fast‑paced, supportive team.


Job Information

  • Publication date: 10 Dec 2025
  • Location: Work from home
  • Type: Full-time
  • Work model: On-site
  • Category:
  • Experience: 2+ years
  • Employment type: Employee
