
<!--tl&dr-->
**TL;DR:** A structured AI agent for coordinating real-time incident response across your entire stack — with built-in rollback triggers, stakeholder dashboards, and audit trails.
<!--/tl&dr-->
This skill implements a tier-ARCHITECT incident management protocol with five distinct phases:
1. **Detection** — Monitors your alerting pipeline and fires on-threshold events from your APM
2. **Triage** — Runs root-cause analysis using your system's existing runbooks
3. **Coordination** — Assigns tasks, updates status pages, pings on-call via webhooks
4. **Resolution** — Executes predefined rollback playbooks
5. **Review** — Generates a postmortem with timeline reconstruction
npm install @mrtech/incident-commander
{
"incidentCommander": {
"sources": [
{ "type": "pagerduty", "apiKey": "$PD_API_KEY" },
{ "type": "prometheus", "url": "http://prometheus:9090" }
]
}
}
npx incident-commander drill --scenario=memory-leak --target=api-service-07
Run an incident drill for a memory leak scenario targeting api-service-07.
Assume full prod access. Generate a postmortem markdown in /tmp/postmortem.md.
| Dimension | Rating | Notes |
|---|---|---|
| **Speed** | 4/5 | Sub-5-minute mean time to respond |
| **Coverage** | 5/5 | Handles 90%+ of standard SRE playbook scenarios |
|---|
| **Integrations** | 4/5 | Supports major APM tools, webhook-flexible |
|---|
| **Learning Curve** | 3/5 | Requires some YAML config |
|---|
| **Cost** | 4/5 | Open source core |
|---|