
prompting · foundations

Jailbreak

A technique for manipulating an AI model into ignoring its safety guidelines or producing content it was trained to refuse, typically by crafting prompts designed to circumvent those guidelines.

Last updated 2026-05-12