Deeper is Not Always Better: Mitigating the Alignment Tax via Confident Layer Decoding
Autoregressive generation in large language models traditionally uses the final layer for token prediction, but a new decoding strategy dynamically selects more reliable intermedia…