TEKsystems

Operations Expertise: Storage (All Genders)

Job description:

Job Title

Operations Expertise: Storage (All Genders)

Location

Frankfurt am Main, Germany

Job Type

Contract

Description

Local Operations manages the on‑premises production platform, which serves as the primary host for all mission‑critical business applications. The team is responsible for the following core areas:

  • Platform Stability: ensuring high availability and performance of the on‑premises private cloud environment.
  • Application Hosting: consulting on the seamless operation of Germany‑specific production business applications.
  • Incident Management: resolving technical issues within standard business hours to minimize operational downtime.
  • Lifecycle Maintenance: executing routine updates, patches, and system optimizations within the local infrastructure.

Objective

Provide Tier‑3 operational ownership for Storage Products for Local Production (DE).

Scope of Work

  • Ensuring operational readiness for storage changes.
  • Maintaining monitoring and alerting coverage, performance baselines, hardening, patch strategy, rollback and recovery procedures, and runbooks.
  • Executing and improving standard operational procedures through automation (reduce toil, improve MTTR and stability).
  • Automating standard operational tasks (capacity checks, validation procedures, and provisioning workflows where applicable).

Support and Operational Readiness

  • Validate deployment artifacts from an operations perspective.
  • Define and enforce quality assurance measures, such as required documentation of standard operating procedures and successful test reports, to ensure the high quality of delivered products and services.
  • Ensure rollback strategies and operational monitoring (observability) are in place for production deployments.

Monitoring, Incident, Problem and Change Management

  • Monitor system health, performance metrics, and service availability across multi‑tenant environments.
  • Identify, analyze, and resolve incidents to minimize service disruption.
  • Trigger root cause analysis and implement corrective and preventive actions.

Automation of Operations‑Critical Standard Processes

  • Address operational issues by automating standard remediation processes.
  • Validate all automated procedures following the established software development lifecycle, including staging, testing, and validation reviews.

Security and Compliance Enforcement

  • Implement monitoring and logging strategies to support audit and compliance requirements.
  • Perform routine security scans and remediate identified vulnerabilities.

Qualifications

Must-have experience

  • 5+ years in IT storage operations / service delivery / platform operations with demonstrated leadership in mission‑critical environments.
  • Proven experience implementing or leading Incident, Problem, Change, and Release governance in production.
  • Experience supporting platform workloads that rely on shared storage services.
  • Expertise with storage types: File, Block, and Object Storage.
  • Expertise with protocols/services: NFS and object storage operations (S3‑like concepts).
  • Experience with Kubernetes storage integration: CSI driver concepts and troubleshooting (PV/PVC lifecycle understanding).
  • Virtualization (storage): experience operating storage virtualization in enterprise environments.
  • Expertise within ITSM: Jira Service Management, Jira, Confluence.
  • Fundamental understanding of core operations processes (incident, change, problem management, ITSM) as well as SRE concepts.
  • Experience gathering operational insights from monitoring or observability data, including SLI/SLA/SLO management and tracking.
  • Hands‑on experience documenting procedures and enforcing clear runbooks or playbooks.
  • Hands‑on observability experience with monitoring and logging tools (Prometheus, Grafana, Datadog, Mimir, Loki).
  • Familiarity with enterprise DevOps toolchains is a plus (GitLab, JFrog Artifactory, Backstage, Harness).
  • Strong understanding of modern platform operations (Kubernetes/containers, automation, observability), sufficient to govern specialists.
  • Platform delivery concepts: GitOps and IaC awareness (Terraform/OpenTofu, ArgoCD, Helm) to govern deployment/readiness standards.

Preferred experience

  • Experience operating in regulated / high‑availability industries (banking, public sector, healthcare).
  • Experience with SRE practices (SLOs/SLIs, error budgets) and reliability management.
  • Experience operating storage services that integrate with Kubernetes platforms.
  • Familiarity with IaC‑based provisioning and GitOps‑driven operational patterns.

Language skills

  • Proficiency in spoken and written English (at least C1).

Key Skills

  • Object storage
  • NFS
  • Kubernetes
  • Change management
  • ITSM


Job information

  • Publication date: 17 Apr 2026
  • Location: Frankfurt
  • Type: Full‑time
  • Work model: On‑site
  • Category:
  • Experience: 2+ years
  • Employment type: Employee
