Senior Site Reliability Engineer - Neocloud Provider

Job description:

Do you want to join a leading next-generation AI cloud provider as a Senior Site Reliability Engineer?

You will be joining a Neocloud that is building one of the most advanced GPU and high-performance computing platforms in Europe.

The role offers the chance to help design and maintain the reliability, scale and performance of a growing cloud platform with real engineering challenges.

You will collaborate with highly skilled teams across software, hardware, networking & AI infrastructure, with the autonomy to influence technical direction and build systems that support large-scale compute workloads.

If you are interested in this opportunity and want to learn more, get in touch today.

Responsibilities

  • Architect and maintain reliable, fault-tolerant, large-scale distributed systems for high-performance GPU and compute workloads.
  • Build and automate deployment, failover, monitoring, capacity planning, and incident-response workflows.
  • Develop, optimise, and maintain CI/CD pipelines to enable safe, rapid, and repeatable software delivery.
  • Drive incident response and root-cause analysis while improving system observability, performance, and long-term stability.
  • Partner with backend, hardware, and networking teams to optimise service performance, support regional expansion, scale compute clusters, and participate in on-call rotations.

Required Skills & Experience

  • Strong Linux debugging expertise, including network and system-call tracing.
  • Proficiency with Terraform and Kubernetes (network policies, scheduling, taints/tolerations).
  • Experience with Slurm job monitoring and core configuration.
  • Solid Python or Go skills, covering async/error handling, environment management, and common system/HTTP tooling.
  • Ability to automate workflows and troubleshoot distributed systems using CLI tools, logs, and scripting.

Salary & Benefits

  • Up to €130,000 gross per year
  • Bonus scheme
  • Company share scheme

NOTE: Please mention Fuchsjobs as the source of your application.

Job information

  • Publication date: 15 Dec 2025
  • Location: WorkFromHome
  • Type: Full-time
  • Work model: On-site
  • Category:
  • Experience: 2+ years
  • Employment status: Employee
