ilert

AI Product Engineer (LLM Agents & SRE Automation) (f/m/x)

ilert WorkFromHome

Job description:

Team: Product & Engineering • Reports to the CTO

Location: Hybrid - Cologne (Rheinauhafen) - 3 days in office, 2 days remote (Tue and Thu)

Shape the future of autonomous incident response

We’re on a mission to make downtime invisible. Thousands of DevOps and SRE teams rely on ilert to detect, resolve, and communicate incidents faster.

As our first AI Product Engineer, you’ll build the core of ilert’s AI-first strategy: autonomous, tool-using agents that diagnose alerts, run root cause analysis, execute safe mitigations, and keep services healthy.

This is a hands-on role where you’ll turn operational expertise and product insight into real, reliable AI systems used in production.

Tasks

Design & Build AI Agents

  • Design agent reasoning loops, prompts, and safety constraints.
  • Build multi-step tool-using agents (logs, metrics, traces, k8s, Git, CI/CD, cloud APIs).
  • Implement autonomy flows: investigation → analysis → mitigation → validation.
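To make the autonomy flow above concrete, here is a minimal sketch of the four phases as a linear state machine. It is an illustration only, not ilert's implementation: the `Phase` enum, `run_flow`, and the pass-through handlers are hypothetical names, and a real agent would call an LLM and tools inside each handler.

```python
from enum import Enum, auto


class Phase(Enum):
    INVESTIGATION = auto()
    ANALYSIS = auto()
    MITIGATION = auto()
    VALIDATION = auto()
    DONE = auto()


# Allowed forward transitions: investigation -> analysis -> mitigation -> validation.
NEXT_PHASE = {
    Phase.INVESTIGATION: Phase.ANALYSIS,
    Phase.ANALYSIS: Phase.MITIGATION,
    Phase.MITIGATION: Phase.VALIDATION,
    Phase.VALIDATION: Phase.DONE,
}


def run_flow(handlers, alert):
    """Run one handler per phase, in order, passing a shared context dict along."""
    context = {"alert": alert, "history": []}
    phase = Phase.INVESTIGATION
    while phase is not Phase.DONE:
        context = handlers[phase](context)
        context["history"].append(phase.name)
        phase = NEXT_PHASE[phase]
    return context


# Hypothetical pass-through handlers, standing in for real LLM and tool calls.
handlers = {phase: (lambda ctx: ctx) for phase in NEXT_PHASE}
result = run_flow(handlers, "high latency on checkout")
print(result["history"])
```

Keeping the transitions in one table makes the order explicit and easy to test. A production flow would likely add failure branches, for example returning to investigation when validation fails.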

Ship Product Features

  • Work with product and engineering to build AI-backed features that solve real customer problems.
  • Translate complex SRE workflows into intuitive user experiences powered by AI.
  • Own features end-to-end (design → prototype → implementation → rollout).

Integrate with Observability & Ops Tooling

  • Connect LLM agents to Grafana, Prometheus, Kubernetes, GitHub, CI/CD, cloud services, etc.
  • Design safe tool schemas and APIs for autonomous execution.

Ensure Reliability, Safety & Determinism

  • Build guardrails for safe, reversible mitigations.
  • Validate model output with structured schemas (e.g., Zod, JSON schema).
  • Establish evaluation suites, test harnesses, and monitoring for agent performance.
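As a rough illustration of the first two points, the sketch below checks a model's proposed mitigation against a fixed structure and an allowlist of reversible actions before anything runs. It uses only the Python standard library in place of Zod or JSON Schema, and the action names, field names, and `parse_mitigation` function are assumptions made up for this example.

```python
import json

# Hypothetical allowlist: each permitted action maps to the action that undoes it.
REVERSIBLE_ACTIONS = {
    "scale_up": "scale_down",
    "pause_rollout": "resume_rollout",
}

# Hypothetical schema: required field name -> expected type.
REQUIRED_FIELDS = {"action": str, "target": str, "reason": str}


def parse_mitigation(raw: str) -> dict:
    """Parse model output; reject anything malformed or not on the reversible allowlist."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("model output must be a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(
                f"field {field!r} is missing or not a {expected_type.__name__}"
            )
    if data["action"] not in REVERSIBLE_ACTIONS:
        raise ValueError(f"action {data['action']!r} is not on the reversible allowlist")
    # Attach the rollback step so the caller always knows how to undo the action.
    data["rollback"] = REVERSIBLE_ACTIONS[data["action"]]
    return data
```

Rejecting by default and attaching the rollback step at parse time means that an action the agent cannot undo never reaches execution.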

Collaborate Across Teams

  • Work with SREs to encode operational expertise into agents.
  • Partner with Product to shape requirements and roadmap decisions.
  • Influence ilert’s broader AI strategy.

Requirements

Must-Have Skills

  • Experience building AI-powered applications with LLMs (OpenAI, Anthropic, etc.)
  • Strong prompt engineering & agent design skills
  • Experience implementing multi-step tool-use flows
  • Solid software engineering fundamentals (preferably Rust)
  • Experience integrating with APIs, backend services, or automations
  • Ability to reason about reliability, safety, and controlled automation
  • Product mindset: able to turn ambiguous problems into shippable solutions

Nice-to-Have Skills

  • Background in SRE, DevOps, or incident response
  • Experience with observability tools (Grafana, Prometheus, Elastic, Datadog, New Relic)
  • Hands-on Kubernetes knowledge
  • Experience with production agent frameworks (ReAct, LangChain, LangGraph, custom state machines)

Soft Skills

  • You love building real products, not demos
  • Strong communication & critical thinking
  • Comfortable working with high autonomy and ownership
  • Passion for reliability, automation, and removing toil

Benefits

  • Build one of the first real autonomous SRE agents in the industry
  • Product-centric culture: Be part of a team that’s 100% committed to solving a critical issue for businesses that offer round-the-clock services.
  • Hybrid Work Environment: Enjoy the best of both worlds with in-person collaboration and remote work flexibility.
  • No Meetings #hackfwd: Maximize productivity by keeping meetings to a minimum and focusing on your core responsibilities.
  • High impact, high ownership role. Your work ships to customers quickly
  • Small, senior team with fast decision-making
  • Modern tech stack + strong engineering culture
  • Direct involvement in shaping the future of on‑call and incident response
  • Founder‑led startup

Please include one link (GitHub, repo, notebook, or demo) that best showcases your experience building AI-powered or agentic systems.


Job information

  • Publication date: 12 Dec 2025
  • Location: WorkFromHome
  • Type: Full-time
  • Work model: On-site
  • Category:
  • Experience: 2+ years
  • Employment type: Employee
