Paper Presentation: Red Teaming Language Models with Language Models, Presented by Shreyash Kale

In this reading group session, Shreyash Kale presents Red Teaming Language Models with Language Models by Perez et al. In the paper, one LLM generates test inputs intended to elicit undesirable behavior from a second, target LLM.