| Surface | Question it answers | Lifetime |
|---|---|---|
| Issues | What’s broken right now? | Live - recomputed each poll, gone when it clears |
| Alerts (this page) | What tripped my rules - open now, and what resolved? | Durable - persists across the open → resolved lifecycle |
| Notifications | How am I told, and through which channels? | Per-delivery (inbox row, Slack message, webhook POST) |
Notification rules
A rule is a filter over the Issues stream plus delivery settings. Create them at Settings → Organization → Notifications → Issue notification rules (owner-only), or straight from the Issues page with Notify on these - which pre-fills a rule from whatever filters you’ve got applied.
What a rule matches
Every rule narrows the stream by any combination of:
- Clusters - specific clusters, or all of them.
- Namespaces - with an opt-in for cluster-scoped issues (Nodes, PVs, …), which don’t belong to a namespace.
- Severity - `critical`, `warning`, or both.
- Detection source - `problem`, `missing_ref`, `scheduling`, `condition` (see the Issues catalog).
- Category / category group - the symptom axis (e.g. GitOps render failed, Image pull failed).
- Resource kind - Deployment, Application, HelmRelease, …
- CEL - an expression for anything the structured filters don’t cover.

Values within one dimension are OR’d, dimensions AND together, and CEL ANDs on top.
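
The matching semantics above can be sketched in a few lines. This is a minimal illustration, not Radar’s implementation: the `Issue` and `Rule` shapes, the "empty set means no restriction" convention, and the treatment of cluster-scoped issues are all assumptions, and a plain Python predicate stands in for the CEL expression.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, Set


@dataclass(frozen=True)
class Issue:
    cluster: str
    namespace: Optional[str]  # None for cluster-scoped resources (Nodes, PVs, ...)
    severity: str             # "critical" or "warning"
    source: str               # "problem", "missing_ref", "scheduling", "condition"
    category: str
    kind: str


@dataclass
class Rule:
    # Assumption: an empty set means "no restriction" on that dimension.
    clusters: Set[str] = field(default_factory=set)
    namespaces: Set[str] = field(default_factory=set)
    include_cluster_scoped: bool = False
    severities: Set[str] = field(default_factory=set)
    sources: Set[str] = field(default_factory=set)
    categories: Set[str] = field(default_factory=set)
    kinds: Set[str] = field(default_factory=set)
    # Stand-in for the CEL expression: a plain Python predicate, not real CEL.
    expr: Optional[Callable[[Issue], bool]] = None


def _allowed(values: Set[str], value: str) -> bool:
    # OR within a dimension: any one listed value is enough.
    return not values or value in values


def matches(rule: Rule, issue: Issue) -> bool:
    if issue.namespace is None:
        # Assumption: cluster-scoped issues match only when the rule opts in.
        namespace_ok = rule.include_cluster_scoped
    else:
        namespace_ok = _allowed(rule.namespaces, issue.namespace)
    # AND across dimensions: every dimension has to pass.
    structured = (
        _allowed(rule.clusters, issue.cluster)
        and namespace_ok
        and _allowed(rule.severities, issue.severity)
        and _allowed(rule.sources, issue.source)
        and _allowed(rule.categories, issue.category)
        and _allowed(rule.kinds, issue.kind)
    )
    # The expression is AND'd on top of the structured filters.
    return structured and (rule.expr is None or rule.expr(issue))
```

A rule with `clusters={"prod-a", "prod-b"}` and `severities={"critical"}` matches a critical issue in either cluster, but not a warning in `prod-a` or a critical issue anywhere else (the cluster names are made up for the example).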
Delivery and timing
| Setting | What it controls |
|---|---|
| In-app inbox | On by default - every rule drops matched alerts into the bell tray for free. |
| Destinations | Slack channels and webhooks the rule routes to (see Notifications). |
| Minimum duration | Wait this long before alerting, so a blip that self-heals never pages anyone. |
| Resolve grace | Wait this long after an issue clears before calling the alert resolved (default 2 minutes), so a flapping resource doesn’t open/close repeatedly. |
| Re-notify interval | Optionally re-send while an alert stays open, so a long-running problem doesn’t fall silent. Off by default. |
| Notify on resolve | Send a follow-up when the alert clears (on by default), so “it’s fixed” closes the loop. |
Baseline silence - no alert storm on day one
The single most important behavior: alerts only fire for issues that appear after the rule exists. When you create a rule - or connect a brand-new cluster - everything already broken at that moment is recorded as baseline and silently seeded as already-open. None of it notifies. That existing backlog is the Issues view’s job; alerts are for new problems from here forward. Without this, turning on a rule against a cluster with 40 standing issues would fire 40 notifications at once. Baseline alerts still appear on the Alerts page (marked baseline) so the history is complete - they just never sent a notification.
The alert lifecycle
- An issue is detected and matches a rule.
- It persists past the rule’s Minimum duration - the alert opens and sends a notification.
- While it stays open past the Re-notify interval (if set), Radar re-sends so it doesn’t fall silent.
- The issue clears and stays clear past the Resolve grace - the alert resolves and (if Notify on resolve is on) sends a follow-up.
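
The lifecycle, together with the timing settings and baseline seeding described earlier, reads naturally as a small state machine driven once per poll. The sketch below is an illustration under assumptions, not Radar’s code: the state names `pending` and `dropped`, the `step` function, and the 60-second minimum duration are hypothetical, and it assumes baseline alerts stay silent for their whole life. Only the 2-minute resolve grace and the on/off defaults come from the settings table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Timing:
    min_duration: float = 60.0                 # seconds; this default is an assumption
    resolve_grace: float = 120.0               # 2 minutes, the documented default
    renotify_interval: Optional[float] = None  # off by default
    notify_on_resolve: bool = True             # on by default


@dataclass
class Alert:
    first_seen: float
    state: str = "pending"  # pending -> open -> resolving -> resolved
    baseline: bool = False
    cleared_at: Optional[float] = None
    last_notified: Optional[float] = None


def seed_baseline(now: float) -> Alert:
    # Issues already broken when the rule is created start open and silent.
    return Alert(first_seen=now, state="open", baseline=True)


def step(alert: Alert, present: bool, now: float, t: Timing) -> List[str]:
    """Advance one poll and return the notifications to send."""
    out: List[str] = []
    if alert.state == "pending":
        if not present:
            alert.state = "dropped"  # self-healed before Minimum duration
        elif now - alert.first_seen >= t.min_duration:
            alert.state = "open"
            alert.last_notified = now
            out.append("opened")
    elif alert.state == "open":
        if not present:
            alert.state = "resolving"
            alert.cleared_at = now
        elif (
            t.renotify_interval is not None
            and alert.last_notified is not None  # baseline alerts never notified
            and now - alert.last_notified >= t.renotify_interval
        ):
            alert.last_notified = now
            out.append("renotify")
    elif alert.state == "resolving":
        if present:
            alert.state = "open"  # flapped back inside the grace window
            alert.cleared_at = None
        elif now - alert.cleared_at >= t.resolve_grace:
            alert.state = "resolved"
            if t.notify_on_resolve and not alert.baseline:
                out.append("resolved")
    return out
```

Calling `step` with `present=True` on each poll where the issue is still detected, and `present=False` once it clears, walks an alert through the four steps above. A resource that flaps inside the grace window goes back to `open` without producing a second "opened" notification.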
The Alerts page
Open Alerts in the left rail (/fleet/alerts). It reads the durable records, so a resolved alert still has somewhere to live after it’s gone from Issues.
- Window - open and resolving alerts are always shown; resolved alerts are bounded to the last 7 days so the page stays a recent-history view, not an unbounded archive.
- Status filter - `All` / `Open` / `Resolved`.
- Scope - respects the global cluster scope picker, like every fleet view.
- Each row shows the resource, cluster, namespace, category, and the rule that caught it. Open rows carry a View live in Issues link; resolved rows show how long the alert fired (`fired for 3m`). Baseline (seeded) rows are de-emphasized.
- Access - any org member can read the Alerts page and history. Writing rules is owner-only.
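
As a rough model of the window and status filter described above, the following sketch keeps open and resolving rows unconditionally and bounds resolved rows to 7 days. The `AlertRow` shape, the function names, and the choice to count resolving rows under `Open` are assumptions; the duration label only handles whole minutes, which is a guess based on the `fired for 3m` example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

RESOLVED_WINDOW = timedelta(days=7)


@dataclass
class AlertRow:
    state: str  # "open", "resolving", or "resolved"
    opened_at: datetime
    resolved_at: Optional[datetime] = None


def visible_rows(rows: List[AlertRow], now: datetime, status: str = "All") -> List[AlertRow]:
    out = []
    for row in rows:
        if row.state == "resolved":
            if status == "Open":
                continue
            # Resolved rows are bounded to the last 7 days.
            if row.resolved_at is None or now - row.resolved_at > RESOLVED_WINDOW:
                continue
        elif status == "Resolved":
            # Open and resolving rows are hidden only by the Resolved filter.
            continue
        out.append(row)
    return out


def fired_label(row: AlertRow) -> str:
    # Whole minutes only; the real label format for longer spans is unknown.
    minutes = int((row.resolved_at - row.opened_at).total_seconds() // 60)
    return f"fired for {minutes}m"
```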
Moving between surfaces
The surfaces are deliberately cross-linked so a notification never dead-ends:
- An “issue opened” notification deep-links to the live Issues queue with that row focused and highlighted - it’s still broken, so the live view is where you act.
- An “issue resolved” notification deep-links to the Alerts page with that alert focused - it’s gone from Issues, so Alerts is its only home. The link hydrates the specific alert even if it’s outside the 7-day window.
- If you click an “opened” notification after the issue resolved, the Issues page recognizes the gap and bridges you to Alerts rather than showing an empty list.
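
The routing rules above boil down to one decision: send the reader to Issues only while the issue is still live. Here is a minimal sketch of that decision. Only the `/fleet/alerts` path comes from this page; the `/fleet/issues` path, the query parameter names, and the `link_target` function are hypothetical.

```python
def link_target(event: str, alert_id: str, still_open: bool) -> str:
    """Pick the page a notification link should land on."""
    alerts_url = f"/fleet/alerts?alert={alert_id}"  # query parameter is hypothetical
    issues_url = f"/fleet/issues?focus={alert_id}"  # path and parameter are hypothetical
    if event == "resolved":
        # Gone from Issues, so the Alerts page is its only home.
        return alerts_url
    if event == "opened":
        # Bridge to Alerts if the issue resolved before the link was clicked.
        return issues_url if still_open else alerts_url
    raise ValueError(f"unknown notification event: {event!r}")
```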
Suppressing alerts
Alerts honor Issues triage. When someone dismisses an issue in the Issues queue (an org-wide action, with a reason), its alerts are suppressed too - you won’t get re-notified for a problem the team has consciously set aside. Restoring the issue re-arms its alerts.
See also
- Issues - the live detection stream alerts are built on.
- Notifications - channels, the inbox, email preferences, and webhook payloads.
- Audit log - the underlying org event taxonomy.