Conversational Domain Adaptation of IndicTrans2 across 21 Indic Languages via Experience Replay and Model Soups
arXiv:2606.29024v1. Abstract: IndicTrans2 is the strongest open English-to-Indic translation system, but like most systems it is trained on general text and tends to sound stiff on casual, conversational input. We adapt IndicTrans2-1B to the conversational register across all 21 Indic languages using only public data (OpenSubtitles, BPCC-H-Daily, Tatoeba). Plain fine-tuning improves conversational chrF but forgets the general domain (it drops 3.9 chrF on FLORES for Hindi). Mixing g...
arXiv cs.CL · Aditya Pratap Singh