Cut incident MTTR without adding another dashboard to your on-call rotation.
Built for the 2am page. Multi-cluster search, event replay, shareable incident links — none of which require you to find a kubeconfig first.
An incident starts. Your first five minutes decide everything.
You get paged. You're groggy. You've got a terminal, a laptop balanced on your knees, and a sinking feeling that you don't remember which cluster the payments service runs in.
Your first kubectl command takes 40 seconds because you're on the wrong context. By the time you get the right one, the pod has restarted. The events are rolling off. The logs have rotated. Somebody in Slack is asking for an update.
The real MTTR killer isn't the fix. It's the five minutes of kubectl roulette before you can even see what's wrong. That's the time Radar gives back.
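That roulette, by hand, is a loop over every kubeconfig context — a sketch using stock kubectl, where "payments" is an illustrative service name:

```shell
# The 2am hunt: which of your clusters actually runs the payments service?
# ("payments" is an illustrative name; contexts come from your kubeconfig.)
for ctx in $(kubectl config get-contexts -o name 2>/dev/null); do
  if kubectl --context "$ctx" get pods --all-namespaces --no-headers 2>/dev/null \
      | grep -q payments; then
    echo "payments runs in context: $ctx"
  fi
done
```

Multiply that by your cluster count, at VPN latency, while Slack fills up.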
Your runbook, before and after.
Same incident. Same human. Different tool.
With Radar: Open the Slack incident channel. Click the Radar link in the alert.
Without: Find your VPN. SSH to the jump host. Switch kubeconfig. Remember which cluster.

With Radar: See the affected pods, recent events, and upstream dependencies in one view.
Without: kubectl get pods --all-namespaces | grep ... in four different clusters.

With Radar: Rewind the timeline 10 minutes. See the ArgoCD sync that broke it. Click through to the commit.
Without: Check the ArgoCD UI. Check GitHub. Check kubectl events (which are already gone).

With Radar: One-click Helm rollback from Radar. Confirm the topology goes green.
Without: Find the right values file. Run helm rollback. Watch kubectl and hope.

With Radar: Share the timeline link with the incident channel. Snooze related alerts from Slack.
Without: Screenshot. Paste. Explain. Answer the same 4 questions from 3 people.

With Radar: The timeline is still there. Export it as CSV. Attach it to the post-mortem doc.
Without: Events are gone. Reconstruct from logs. Hope someone took screenshots.
Composite walkthrough based on incident retros from early design partners. Your mileage will vary. The kubectl roulette will not.
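For reference, the manual rollback row above spelled out with stock Helm and kubectl — the payments release, namespace, and revision number are all illustrative:

```shell
# The manual rollback path ("payments" and revision 41 are illustrative).
helm history payments -n payments                    # list revisions; spot the last good one
helm rollback payments 41 -n payments --wait         # roll back to that revision
kubectl -n payments rollout status deploy/payments   # watch it come back... and hope
```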
Six features. All optimized for the way incidents really happen.
Cross-cluster search
One search bar across every cluster. Paste a pod name. Jump to the state, events, and logs in seconds — no kubeconfig hunting.
Event timeline replay
Rewind any cluster to any moment in its retention window — from 30 days up to a year. Correlate OOMKills and image pulls with what users were seeing.
Shareable incident links
Drop a link in #incident-response. Everyone sees the same topology, the same filtered timeline, the same resource state. Handoffs in seconds.
Smart alert correlation
12 pods OOMKilled in 30 seconds? Get one Slack message with 12 entries, not 12 messages. Snooze from Slack. Route to the right team.
Traffic + mTLS visibility
Is the payments service actually talking to the fraud service? See east-west traffic live, with error rates, latency, and TLS cert health.
GitOps incident correlation
ArgoCD synced 4 minutes before the alert. Radar shows you the commit, the diff, and the live state — in the same timeline.
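For comparison, doing that correlation by hand with the stock argocd CLI and git looks roughly like this — the payments app name is illustrative:

```shell
# Manual GitOps correlation ("payments" is an illustrative Argo CD app name).
argocd app history payments      # which sync landed just before the alert?
argocd app diff payments         # what differs between Git and the live state?
git log --oneline -5             # map the sync revision back to a commit
```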
We page you in the tools you already answer in.
Radar doesn't replace your alerting. It adds the Kubernetes-layer truth your existing alerts are missing.
The numbers SRE teams track when they roll out Radar.
These are the metrics early design partners have reported. Not promises — baselines to watch.
Faster time-to-root-cause in incident retros
Reduction in Slack questions during incidents
New on-call engineers productive in days, not months
Questions SREs ask.
Does Radar replace my observability stack?
What about my existing alerting?
Can I trigger rollbacks from Radar?
Does it work with our existing RBAC?
What happens if Radar is down?
Your next incident is going to happen. Be ready for it.
Apache 2.0 OSS, or hosted free for up to 3 clusters. Install the agent in 60 seconds. Have a timeline before your next page fires.
Apache 2.0 OSS · Unlimited clusters self-hosted · Hosted free tier for up to 3 clusters