AI safety evaluation checklist

Safety evaluation asks how an AI system fails, how quickly the team can detect it, and what controls prevent user harm.

Failure modes

List unsafe outputs, over-reliance, automation bias, misuse, security attacks, and edge cases.

Run scenario tests, adversarial prompts or inputs, human review audits, and regression suites.

Define severity, shutdown criteria, user notification, remediation, and post-incident review.