Customer Reliability Engineer
We’re looking for a Customer Reliability Engineer to own technical outcomes from POC to production for strategic customers.
TL;DR
Land value fast and keep it running: embed with customers to design, build, and operationalize deployments of our platform, and leave behind the resources they need to feel confident and self‑sufficient.
Why this role exists
This role exists to design, build, and operationalize platform deployments for customers, and to give those customers confidence and self‑sufficiency.
Why deepset
At deepset, we are on a mission to make custom AI solutions accessible to every organization. With Haystack, thousands of developers build advanced LLM applications every day, while our enterprise‑ready AI Platform helps companies turn large language models into business value. We are remote‑first and flexible, with a culture built on trust and ownership.
What you will do
Design & Land
- Own technical outcomes from POC → production: integrations, data connectors, workflows, and infra-as-code (Kubernetes/Terraform/Helm).
- Produce reference architectures and reusable templates; upstream patterns to Product to reduce future “custom” work.
- Unblock enterprise environments: identity (OIDC/SAML), networking, storage, GPU scheduling, observability hooks.
- Define SLOs/Error Budgets with customers; implement end‑to‑end observability (logs/metrics/traces) and dashboards.
- Create runbooks/playbooks; lead L3 incident response and RCAs; drive reliability roadmaps to closure.
- Plan/execute upgrades and security patches in change windows; ensure rollback and post‑upgrade verification.
- Be an active member of the on‑call rotation to make sure we deliver an excellent customer experience.
Partner & Enable
- Train customer teams on operations and emergency procedures; hand off cleanly to Support/CSM.
- Prioritize reliability and “productization” backlog with Product/Engineering based on field signal.
- Document clearly: setup guides, diagrams, SLOs, testing/DR procedures, and “golden path” standards.
Requirements
- 5+ years across SRE/Platform/Solutions/FDE, with evidence of shipping customer‑facing builds and operating production systems.
- Hands‑on experience in Python for improvements, bug fixing, and small feature builds.
- Strong with Kubernetes, containers, Linux, IaC (Terraform/Helm), CI/CD, networking (TLS, DNS, ingress/LB), backup/restore.
- Observability stacks (Prometheus/Grafana/OpenTelemetry/ELK); scripting (Python/Bash).
- Enterprise integration experience (SSO, secrets, compliance); confident communicator with execs and engineers under time pressure.
- Must be a resident of the European Union with an EU passport.
What we offer
- Remote‑first setup with flexible hours & tech of your choice.
- 30 days vacation + extra days for family sick leave.
- Competitive salary & stock options for every team member.
- Monthly sports & mental health support allowance with Oliva.
- Annual learning & development budget.
- Monthly team socials & in‑person meetups.
- Dog‑friendly Berlin HQ.
About us
Founded in 2018, deepset builds open and enterprise‑grade tools that help teams build AI with purpose. From Haystack, our open‑source framework, to the deepset AI Platform, we give developers and organizations the building blocks to solve complex, high‑impact challenges with AI, with full control, transparency, and sovereignty. Backed by GV and Balderton, we’re growing the world’s production AI community and a customer base that solves challenges too critical to get wrong.
Visit us to learn more
deepset Website | Haystack Website | GitHub | LinkedIn | X deepset (Twitter) | X haystack (Twitter)