Refusal in Language Models Is Mediated by a Single Direction
117 points by fagnerbrack | 5 days ago | 45 comments