Dirk Hovy
refusal
No for Some, Yes for Others: Persona Prompts and Other Sources of False Refusal in Language Models
Large language models (LLMs) are increasingly integrated into our daily lives and personalized. However, LLM personalization might also increase unintended side effects. Recent work suggests that persona prompting can lead models to falsely refuse …