Most organisations have some form of incident response: for security breaches, outages, or data leaks. Advanced AI introduces new incident types: harmful or biased outputs, inappropriate content, agents taking unintended actions, RAG systems exposing the wrong documents. If you apply generic incident playbooks without adaptation, you’ll miss crucial steps.
The DASUD lifecycle (Design, Acquire, Store, Use, Delete) can help structure AI‑specific incident response.
What counts as an AI incident?
Start by defining AI incidents clearly. Examples include:
- Harmful or unsafe outputs: e.g., self‑harm encouragement, hate speech, medical or legal advice in prohibited contexts.
- Biased or unfair decisions: systematic disadvantage to certain groups in recommendations or outcomes.
- Privacy or confidentiality leaks: RAG surfacing sensitive documents; agents including internal data in external messages.
- Mis‑actions by agents: tickets closed incorrectly, wrong data updated, unintended workflows triggered.
- Behavioural drift: model performance or behaviour deviating from expectations, creating new risks.
These should be classified alongside traditional incidents but flagged as AI‑related for analysis and learning.
Design: plan AI incident response scenarios
At the Design stage, define several scenario types and playbooks:
- Content incidents: what happens when GenAI produces harmful content? Who can disable certain prompts or features?
- Disclosure incidents: what if a RAG assistant exposes confidential content to the wrong user?
- Action incidents: what if an agent performs an unintended action in a critical system?
For each, decide:
- How incidents can be reported (by users, staff, monitoring).
- Who triages them first (AI governance, product owners, risk teams).
- What immediate containment looks like (kill switches, feature flags, permission changes).
These decisions need to be made before the first big incident, not during it.
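One way to force those decisions before the first incident is to write them down as data. A hypothetical playbook registry, where every scenario must declare reporting channels, a first triager, and containment steps (all names below are assumptions):

```python
# Hypothetical playbook registry; team names and actions are assumptions.
PLAYBOOKS = {
    "content_incident": {
        "reporting": ["user flag", "staff report", "output monitoring"],
        "first_triage": "AI governance",
        "containment": ["disable affected prompt/feature", "tighten safety filter"],
    },
    "disclosure_incident": {
        "reporting": ["user report", "access-log alert"],
        "first_triage": "risk team",
        "containment": ["remove document from RAG index", "restrict permissions"],
    },
    "action_incident": {
        "reporting": ["system monitoring", "downstream team"],
        "first_triage": "product owner",
        "containment": ["kill switch on agent", "revoke tool credentials"],
    },
}

def containment_steps(scenario: str) -> list[str]:
    """Look up the immediate containment actions for a scenario type."""
    return PLAYBOOKS[scenario]["containment"]
```

If a scenario type can't fill in all three fields, that gap is itself a Design finding.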
Acquire/Store: logging for investigation
Incident response depends on good logs and context:
- Capture inputs: prompts, queries, and context used at the time of the incident.
- Capture outputs and actions: what the system generated or did, including intermediate tool calls for agents.
- Capture environment state: model and configuration versions, relevant feature flags, and access rights at the time.
Store this information in a way that supports:
- Forensic analysis: reconstructing what happened and why.
- Accountability: showing whether oversight mechanisms were followed.
Ensure that access to incident logs is strictly controlled, as they may contain sensitive content.
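A sketch of what one forensic log record might capture; the field names are illustrative assumptions, and in practice the record would be written to an access‑controlled incident store:

```python
import json
from datetime import datetime, timezone

def incident_log_record(prompt, output, tool_calls, model_version, flags, caller_roles):
    """Build a forensic log record; field names are illustrative assumptions."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "input": {"prompt": prompt},
        "output": {"text": output, "tool_calls": tool_calls},
        "environment": {
            "model_version": model_version,   # which model/config was live
            "feature_flags": flags,           # flags in effect at the time
            "caller_roles": caller_roles,     # access rights at the time
        },
    }

record = incident_log_record(
    prompt="Summarise the Q3 HR report",
    output="(redacted)",
    tool_calls=[{"tool": "rag_search", "query": "Q3 HR report"}],
    model_version="assistant-v12",
    flags={"strict_mode": True},
    caller_roles=["contractor"],
)
serialised = json.dumps(record)  # stored in the access-controlled incident store
```

The environment block is what usually gets missed: without the model version, flags, and caller's rights at the time, you can reconstruct what was said but not whether the controls worked.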
Use: escalation and communication
During an incident, the Use stage shifts from normal operation to controlled intervention:
- Escalation Have clear severity levels and pathways: who gets notified, and when? Technical leads, AI governance, legal, communications?
- Temporary controls Activate kill switches or degrade functionality: disable certain features, restrict audiences, or revert to a safer system.
- Communication Prepare templates for internal updates and, when needed, external statements. Be ready to explain what went wrong, its impact, and what you’re doing about it.
Align this with your broader incident and crisis management processes, but with AI‑specific details.
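The escalation pathways can be made explicit in a severity map; the levels and notification lists below are assumptions to adapt to your own org chart:

```python
# Assumed severity levels and notification paths; adapt to your organisation.
ESCALATION = {
    1: ["technical lead", "AI governance", "legal", "communications"],
    2: ["technical lead", "AI governance"],
    3: ["technical lead"],
}

def notify_list(severity: int) -> list[str]:
    """Who gets notified at each severity; unknown levels escalate conservatively."""
    return ESCALATION.get(severity, ESCALATION[1])
```

Defaulting unknown severities to the widest notification list is a deliberate choice: during an AI incident, under‑escalating is usually costlier than over‑escalating.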
Delete: remediation and prevention
Post‑incident, the Delete stage is about removing or altering risky elements:
- Remove or adjust prompts, RAG sources, or tools that contributed to the incident.
- Retire or roll back models or configurations that misbehaved.
- Clear or adjust agent memories if they contain problematic learning.
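The three remediation moves above can be sketched as one operation on a system configuration; this is a toy illustration and the config keys are assumptions:

```python
# Toy remediation sketch; config keys are illustrative assumptions.
def remediate(config: dict) -> dict:
    """Roll back the model, drop implicated RAG sources, clear agent memory."""
    safe = dict(config)
    safe["model_version"] = config["last_known_good"]        # roll back model
    safe["rag_sources"] = [s for s in config["rag_sources"]
                           if s not in config["implicated_sources"]]  # drop sources
    safe["agent_memory"] = []                                # clear problematic learning
    return safe
```

The point is that remediation should be a reviewable change to configuration, not an ad‑hoc manual fix.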
At the same time, integrate learnings into:
- Design: tighten use‑case scoping, red‑lines, and oversight modes.
- Acquire: improve selection and curation of data, RAG content, or tools.
- Store: update logging and access controls.
- Use: refine safety filters, usage policies, and monitoring thresholds.
Incident response is a feedback loop across DASUD.
Make it concrete
For your organisation:
- Define 3–5 AI incident types relevant to your context.
- Draft a one‑page playbook for each (triggers, triage, containment, communication, follow‑up).
- Review logging and monitoring to ensure you’d actually have the data you need.
- Run a tabletop exercise with a realistic incident scenario.
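A one‑page playbook really can fit on one page. A minimal template covering the five elements above, with the filled‑in values as illustrative assumptions:

```python
# Minimal one-page playbook template; filled-in values are assumptions.
PLAYBOOK_TEMPLATE = """\
Incident type: {incident_type}
Triggers:      {triggers}
Triage:        {triage_owner} within {triage_sla}
Containment:   {containment}
Communication: {communication}
Follow-up:     {follow_up}
"""

page = PLAYBOOK_TEMPLATE.format(
    incident_type="Disclosure incident (RAG)",
    triggers="user report, access-log alert",
    triage_owner="risk team",
    triage_sla="1 hour",
    containment="remove document from index, restrict access",
    communication="internal update within 4h; legal review before external",
    follow_up="root-cause review, update RAG curation rules",
)
```

If a field can't be filled in for one of your incident types, that's exactly the gap a tabletop exercise will expose.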
With AI‑specific incident response tied into DASUD, you’re better prepared for the day when something does go wrong, which, at scale, is inevitable.
If you’d like assistance or advice with your Data Governance implementation, or any other topic (Privacy, Cybersecurity, Ethics, AI and Product Management) please feel free to drop me an email here and I will endeavour to get back to you as soon as possible. Alternatively, you can reach out to me on LinkedIn and I will get back to you within the same day!