“Emergent Misalignment” in LLMs
February 28, 2025
Interesting research: “Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs”:
Abstract: We present a surprising result regarding LLMs and alignment. In our experiment, a model is finetuned to...