The full guide to Claude Cowork for DevOps and platform engineers covers the architecture, the ROI case, and the onboarding path. This article is different: it's eight specific automations, each with the exact prompt, the tool integrations required, and the expected time saving. Copy, adapt, deploy.
These automations were built on the Claude Cowork platform using the canvas, skills, and MCP connector architecture. They run on Claude Sonnet inside Claude Enterprise — no model configuration required. The integrations listed (PagerDuty, Datadog, Confluence, GitHub) are available as standard Cowork connectors, with custom connectors available through the MCP server development service.
The 8 Cowork Automations for DevOps Teams
1. On-Call Shift Handoff Briefing

End-of-shift handoffs are one of the most common failure points in SRE operations. The outgoing engineer knows what happened; the incoming engineer is reading Slack threads and hoping for the best. This automation produces a structured briefing in under 3 minutes.
Cowork pulls the last 24 hours of PagerDuty alerts, the current open incidents, Datadog service health snapshots, and any open action items from recent post-mortems — then structures them into a briefing document. The incoming on-call engineer reads one document, not eight Slack threads.
The prompt:

Pull the on-call handoff data for the shift ending now. Include:

- PagerDuty: all alerts in the last 24 hours, current open incidents, any acknowledged but unresolved alerts
- Datadog: current service health for [SERVICE LIST], any monitors in alert or warn state
- Recent post-mortems: any action items due or overdue this week

Structure the output as:

1. Open Incidents (status, impact, current owner)
2. Alerts to Watch (elevated risk, not yet triggered)
3. Pending Action Items (from last 3 post-mortems)
4. Service Health Summary (green/amber/red per service)
5. Handoff Notes (anything the incoming engineer needs to know that data doesn't capture)

Keep it readable in under 5 minutes. Flag anything that requires immediate attention.
Save this as a Cowork skill named "handoff-briefing" and any engineer can invoke it from Slack via Cowork Dispatch with a single message.
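If you want to prototype the triage step outside Cowork, the bucketing behind the briefing's first two sections is simple to sketch. A minimal Python illustration; the alert records and field names here are hypothetical stand-ins for a PagerDuty export, not the real API schema:

```python
# Hypothetical alert records, roughly shaped like a PagerDuty export.
# Field names are illustrative assumptions, not the real API schema.
ALERTS = [
    {"title": "checkout-api latency", "status": "triggered"},
    {"title": "payments-db replica lag", "status": "acknowledged"},
    {"title": "search-index rebuild", "status": "resolved"},
]

def bucket_for_handoff(alerts):
    """Split raw alerts into the briefing's 'Open Incidents' and
    'Alerts to Watch' sections; resolved alerts drop out entirely."""
    briefing = {"open_incidents": [], "alerts_to_watch": []}
    for alert in alerts:
        if alert["status"] == "triggered":
            briefing["open_incidents"].append(alert["title"])
        elif alert["status"] == "acknowledged":
            # Acknowledged but unresolved: elevated risk, worth watching.
            briefing["alerts_to_watch"].append(alert["title"])
    return briefing
```

The prompt above does the equivalent sorting in natural language; the value Cowork adds is pulling the data and writing the narrative around it.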
2. Alert-to-Runbook Triage

When a P1 alert fires at 3am, the first thing your on-call engineer does is search for the relevant runbook — often in Confluence, sometimes in a Slack DM, occasionally in a colleague's memory. This automation links the incoming alert directly to the relevant runbook section and surfaces the diagnostic steps immediately.
Configure Cowork to watch your PagerDuty webhook. When an alert fires matching a defined pattern, Cowork searches your Confluence runbook library, retrieves the relevant section, and posts the diagnostic steps to your incident Slack channel within 90 seconds of the alert.
The prompt:

An alert has fired: [ALERT NAME] on [SERVICE] at [TIME].

Alert details: [PASTE ALERT BODY]

Search the runbook library for [SERVICE] and retrieve:

1. The relevant diagnostic checklist for this alert type
2. Known root causes for this alert pattern
3. The escalation path if initial diagnosis doesn't resolve it
4. Links to the last 2 incidents with this same alert

Post the diagnostic steps formatted for a Slack message — brief, actionable, with commands ready to copy.
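The matching step can be approximated locally before you wire up the connector. This sketch uses naive keyword overlap as a stand-in for Cowork's retrieval; the section headings and scoring rule are illustrative assumptions:

```python
def match_runbook_section(alert_title, sections):
    """Pick the runbook section whose heading shares the most words with
    the alert title. A crude stand-in for Cowork's semantic search."""
    alert_words = set(alert_title.lower().split())
    best, best_score = None, 0
    for heading in sections:
        score = len(alert_words & set(heading.lower().split()))
        if score > best_score:
            best, best_score = heading, score
    return best

# Hypothetical runbook section headings.
SECTIONS = [
    "High latency on checkout API",
    "Replica lag on payments database",
    "Disk pressure on search nodes",
]
```

In production, Cowork's retrieval handles synonyms and context that keyword overlap misses, which is exactly why the automation beats a Confluence text search at 3am.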
3. Post-Mortem Action Item Tracker

Post-mortem action items have the worst completion rate in engineering. They're written with good intentions, assigned in the heat of the post-mortem meeting, and then quietly disappear into Jira's backlog. This automation produces a weekly accountability report that makes the disappearing act visible.
Every Monday, Cowork reads all post-mortems from the last 90 days in Confluence, extracts action items, checks their Jira ticket status, and produces a report showing which action items are on track, overdue, or blocked. The report posts to your team Slack channel and to the relevant engineering manager.
The prompt:

Read all post-mortems published in Confluence in the last 90 days. Extract every action item with its assigned owner and due date.

For each action item:

1. Find the corresponding Jira ticket (search by title or post-mortem reference)
2. Get the current status: Done, In Progress, To Do, or No Ticket Created
3. Calculate if it's on time, due soon, or overdue

Produce a report with:

- Summary: X of Y action items completed, Z overdue
- Overdue items (sorted by how overdue they are)
- Items due this week
- Items with no Jira ticket created (the accountability gap)

Format for a Slack post. Flag any post-mortem that has zero completed action items after 30 days.
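The on time / due soon / overdue calculation in step 3 is plain date arithmetic. A sketch, with the 7-day "due soon" threshold as an assumption you'd tune:

```python
from datetime import date, timedelta

def classify_action_item(due, today, done):
    """Bucket a post-mortem action item the way the weekly report does.
    The 7-day 'due soon' window is an illustrative assumption."""
    if done:
        return "done"
    if due < today:
        return "overdue"
    if due <= today + timedelta(days=7):
        return "due_soon"
    return "on_track"
```

The interesting part of the automation isn't this arithmetic; it's the extraction of owners and due dates from free-text post-mortems, which is where Claude earns its keep.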
4. Pre-Deployment Risk Assessment

Change is the leading cause of incidents. Yet most change review processes are either theatre (rubber-stamping without real analysis) or bottlenecks (waiting for the most experienced engineer to eyeball the PR). This automation produces a structured risk assessment for every deployment candidate.
When a PR is tagged for deployment review, Cowork reads the diff, the service's runbook, and the last 5 incidents for that service. It identifies patterns — "this change modifies the same database migration path that caused the November outage" — and produces a structured risk assessment with specific rollback steps.
The prompt:

Assess the deployment risk for this PR: [LINK OR PASTE DIFF]

Service: [SERVICE NAME]
Deployment window: [DATE/TIME]

Read the service runbook [attached] and the last 5 incident post-mortems [attached]. Produce a risk assessment with:

1. Change summary (what changes, 3 sentences)
2. Risk factors identified (each with: what it is, why it's risky, likelihood, impact)
3. Similar past incidents (from the post-mortems — any pattern matches?)
4. Pre-deployment checklist (service-specific, not generic)
5. Rollback procedure for this specific change
6. Recommended monitoring for 4 hours post-deployment

Risk level: Low / Medium / High / Block (with justification).

Be specific. "Database schema changes with no migration rollback script" is useful. "This looks risky" is not.
Teams using this automation report that it surfaces at least one material risk per 5 deployments that wasn't caught in the standard PR review. At a typical deployment cadence, that's roughly one potential incident caught per sprint.
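You can also pre-screen diffs with cheap heuristics before Claude does the deeper read. The patterns and labels below are illustrative assumptions, not Cowork's actual risk model:

```python
import re

# Illustrative risk heuristics over changed file paths.
# Patterns and labels are assumptions, not Cowork's actual model.
RISK_PATTERNS = [
    (re.compile(r"migrations?/"), "database migration in diff"),
    (re.compile(r"\.tf$"), "infrastructure change in diff"),
    (re.compile(r"Dockerfile"), "runtime image change in diff"),
]

def flag_risks(changed_files):
    """Return the human-readable risk flags raised by a changed-file list."""
    flags = []
    for pattern, label in RISK_PATTERNS:
        if any(pattern.search(path) for path in changed_files):
            flags.append(label)
    return flags
```

Heuristics like these catch the obvious cases; the automation's value is the non-obvious pattern matching against past incidents, which regexes can't do.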
5. Terraform Change Documentation

Terraform plan output is not documentation. It's a machine-readable diff that tells you what will change — not why it's changing, what the risk is, or how to roll it back. This automation converts a Terraform plan into a change document your CAB can actually review.
Connect Cowork to Terraform Cloud. When a plan is approved for apply, Cowork generates a human-readable change document and publishes it to Confluence as part of your change management record. After the apply, it updates the infrastructure documentation automatically.
The prompt:

Convert this Terraform plan output into a change management document: [PASTE TERRAFORM PLAN OUTPUT]

Include:

1. Plain-English summary of what changes (avoid Terraform jargon)
2. Resource inventory: new resources, modified resources, destroyed resources
3. Estimated impact: what systems/services are affected, any downtime expected
4. Rollback plan: terraform state commands and manual steps if needed
5. Dependencies: what other infrastructure does this change depend on or affect
6. Validation steps: how to verify the apply succeeded

Format for Confluence. Add a one-paragraph executive summary at the top for stakeholders who won't read the technical detail.
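The resource inventory in step 2 can be seeded mechanically: Terraform prints a summary line of the form "Plan: 3 to add, 1 to change, 2 to destroy." A minimal parser:

```python
import re

PLAN_RE = re.compile(r"Plan: (\d+) to add, (\d+) to change, (\d+) to destroy")

def parse_plan_summary(plan_output):
    """Extract (add, change, destroy) counts from terraform plan output.
    Returns None when no summary line is present (e.g. 'No changes.')."""
    match = PLAN_RE.search(plan_output)
    if not match:
        return None
    return tuple(int(n) for n in match.groups())
```

The counts anchor the inventory table; Claude fills in the per-resource detail, impact, and rollback narrative that the plan output doesn't carry.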
6. Service On-Call Onboarding Pack

Every time a new engineer joins the on-call rotation for an unfamiliar service, someone spends hours doing a knowledge transfer. This automation assembles the knowledge package automatically, so the new engineer reads it rather than interrogating senior colleagues.
Cowork reads the service's repository, existing runbooks, recent post-mortems, and current Datadog dashboards — then produces a structured onboarding document: what the service does, how it fits the architecture, what breaks and why, and the first-response playbook for the 5 most common incidents.
The prompt:

Produce an on-call onboarding document for [SERVICE NAME] for an engineer who knows our stack but is unfamiliar with this service.

Use these sources:

- Service repository [attached/linked]
- Current runbook [attached]
- Last 10 incident post-mortems [attached]
- Architecture diagram [attached if available]

Structure:

1. What This Service Does (2-3 paragraphs, business context + technical role)
2. Architecture Overview (key components, dependencies, data flow)
3. How to Access and Monitor (dashboard links, log locations, key metrics)
4. The 5 Most Common Incidents (each: symptoms, likely cause, first response steps)
5. Escalation Path (who to call and when)
6. Things That Look Scary But Aren't (common false alarms)
7. Things That Look Fine But Aren't (subtle early warning signs)

Aim for a document a new on-call engineer can read in 30 minutes and feel prepared.
7. Monthly SLO Review

Monthly SLO reviews get skipped, delegated to whoever has time, or turned into meaningless traffic-light slides. This automation produces a substantive reliability report from your monitoring data — including error budget analysis, trend identification, and actionable recommendations.
On the first day of each month, Cowork pulls SLO performance data from Datadog for the previous 30 days, reads the incident log, and produces a full SLO review document. It also compares against the previous two months to identify trends. The document goes directly to Confluence and an executive summary posts to your engineering Slack channel.
The prompt:

Produce the monthly SLO review for [MONTH YEAR].

Data sources:

- Datadog SLO performance: [paste or attach monthly export]
- Incident log for [MONTH]: [attach post-mortem list]
- Previous 2 months SLO data: [attach for trend comparison]

Report sections:

1. Executive Summary (5 bullets — for VP Engineering, 2-minute read)
2. SLO Performance by Service (table: service, SLO target, actual, delta, error budget remaining)
3. Error Budget Analysis (which services are burning budget faster than planned)
4. Incident Impact (total downtime, P1/P2 breakdown, MTTR by severity)
5. Month-over-Month Trends (improving, degrading, stable — per service)
6. Top 3 Reliability Risks for Next Month (specific, not generic)
7. Recommended Actions (prioritised by impact, with suggested owners)

Flag any service approaching error budget exhaustion before the next review period.
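The error budget arithmetic behind sections 2 and 3 is worth pinning down. For an availability SLO, the budget is the allowed error (1 minus the target), and the remaining fraction compares observed error against it:

```python
def error_budget_remaining(slo_target, observed_availability):
    """Fraction of the error budget left in the window.
    With a 99.9% target the budget is 0.1% downtime; burning half of it
    leaves 0.5 remaining. Goes negative when the SLO is breached."""
    allowed_error = 1.0 - slo_target
    observed_error = 1.0 - observed_availability
    return 1.0 - observed_error / allowed_error
```

Datadog reports this figure directly for configured SLOs; having the formula in hand lets the report sanity-check the numbers it's summarising.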
8. Runbook Coverage Gap Analysis

You have 47 services. You have runbooks for 12 of them. Of those 12, six haven't been updated since the last major infrastructure migration. This automation surfaces the gaps — specifically, which services had incidents this quarter that had no runbook coverage.
Cowork reads your Confluence runbook directory and your PagerDuty incident history. It cross-references incidents against runbook coverage and produces a prioritised gap list: services with no runbook, services with outdated runbooks (incidents that the runbook didn't address), and the estimated risk level of each gap.
The prompt:

Analyse our runbook coverage and identify gaps.

Sources:

- Confluence runbook directory [linked]: list all services with runbooks and their last-modified date
- PagerDuty incident history — last 6 months [export attached]: all incidents by service

For each service that had incidents:

1. Does it have a runbook? (Yes / No / Outdated — last modified before the incident)
2. Did the incident type appear in the runbook? (Yes / No / Partial)
3. How many incidents in 6 months?

Produce:

- Coverage summary: X of Y incident-prone services have current runbooks
- Priority gap list (ranked by: incident frequency × lack of runbook coverage)
- Quick-win targets: services that had incidents where a simple runbook would have helped significantly
- Stale runbook list: runbooks not updated in 12+ months on services with recent incidents

Output as a Confluence page with an action table. Include estimated time to write each missing runbook so we can plan the backlog.
This automation runs quarterly and feeds directly into your Cowork runbook generation workflow — the gap analysis generates the backlog, and the runbook generation workflow clears it.
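The "incident frequency × lack of runbook coverage" ranking can be made concrete. The weights below are illustrative assumptions; the point is that frequent incidents with no runbook sort to the top:

```python
def gap_priority(incident_count, coverage):
    """Score a runbook gap: incident frequency weighted by coverage state.
    coverage is one of 'none', 'outdated', 'current'.
    Weights are illustrative assumptions, not a Cowork default."""
    weight = {"none": 1.0, "outdated": 0.5, "current": 0.0}[coverage]
    return incident_count * weight

# Hypothetical (service, incidents in 6 months, coverage) tuples.
SERVICES = [
    ("checkout-api", 6, "none"),
    ("payments-db", 9, "current"),
    ("search", 4, "outdated"),
]

def ranked_gaps(services):
    """Priority gap list: covered services drop out, worst gaps first."""
    scored = [(name, gap_priority(n, cov)) for name, n, cov in services]
    return sorted((s for s in scored if s[1] > 0), key=lambda s: -s[1])
```

Note that payments-db, despite having the most incidents, doesn't appear in the gap list at all: its runbook is current, so the backlog effort goes where coverage is actually missing.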
Building These Automations as Cowork Skills
Each automation above becomes more powerful when saved as a Cowork skill — a reusable, one-command workflow any engineer can invoke. Once a skill is saved, engineers can trigger it from the Cowork interface, from Slack via Cowork Dispatch, or on a schedule.
The recommended skill library for a platform team:
- handoff-briefing — run before every on-call shift change
- deploy-risk — run on every deployment candidate PR
- post-mortem-draft — run within 2 hours of incident resolution
- terraform-docs — run automatically on Terraform plan approval
- service-onboard — run when an engineer joins a new on-call rotation
- slo-review — scheduled for the first of each month
Our Claude Cowork deployment service includes skill library setup as part of the platform team onboarding. We build the skills, configure the connectors, and train your team on invoking and extending them.
Frequently Asked Questions
Do these automations require a developer to set up?
No. The prompts above work in the standard Cowork canvas without any coding. Connecting MCP integrations to PagerDuty, Datadog, and Confluence does require a one-time configuration step that takes 30–60 minutes. The scheduled automations (monthly SLO review, weekly action item tracker) require configuring Cowork's scheduling feature. Our deployment team handles all of this as part of the Cowork deployment service.
Can we customise these automations for our specific stack?
Yes — the prompts are templates designed to be customised. Replace the bracketed placeholders with your service names, adjust the output format to match your templates, and modify the integration sources to match your toolchain. If you use a monitoring platform other than Datadog (e.g., Prometheus/Grafana, New Relic, Dynatrace), the prompt structure stays the same but the data import changes. For non-standard integrations, our MCP development team can build custom connectors.
What happens when Claude gets something wrong in the automation output?
Cowork's outputs are AI-generated first drafts, not ground truth. The risk assessment and post-mortem automations explicitly flag where they've inferred versus directly read data. The on-call handoff and alert triage automations pull from live data sources, so the underlying figures are accurate — but the interpretation layered on top should always be reviewed by the on-call engineer. We recommend a "verify before acting" policy for any automation output used in incident response.
Can Cowork trigger actions in PagerDuty or Jira, or is it read-only?
By default, Cowork's integrations are read-oriented for incident management tools. Writing to PagerDuty (acknowledging alerts, creating incidents) or Jira (creating tickets, updating status) requires explicit configuration of write permissions. This is intentional — you want human review before Cowork takes action in your incident management system. The write-enabled workflows (like creating Jira action items from post-mortems) are available but configured separately with appropriate approval gates.
Your DevOps Team Shouldn't Be Writing Docs While the On-Call Pager Is Going Off
We deploy Claude Cowork for platform engineering teams — skills library, connector configuration, and team training included.