This paper gives a really nice overview of which types of adversarial prompts work well at getting the model to give bad responses.
https://arxiv.org/pdf/2209.07858.pdf