CAVEWOMAN: How Large Language Models Behave Under Linguistic Input and Output Compression

arXiv:2606.24083v1 Abstract: "Talk short. Drop grammar. Save token." This caveman style is widely promoted as a way to cut inference cost, but whether it actually saves anything depends on which channel (the user's prompt or the model's response) is being compressed. We present Cavewoman, a two-channel evaluation protocol that scores every generation on task accuracy, realized per-item cost, and reference-text agreement against the model's unconstrained reference. We evaluate ...

arXiv cs.CL · Morayo Danielle Adeyemi, Ryan A. Rossi, Franck Dernoncourt