Dados & Embeddings
TallyTrain: Communication-Efficient Federated Distillation
arXiv:2607.00173v1. Abstract: Federated learning is bandwidth-bound on two orthogonal axes: model size, which limits how often parameter-averaging methods can afford to merge, and class count, which makes per-probe soft-label distillation prohibitive at large vocabularies. Both ceilings tighten as modern systems scale. We collapse the class-count axis to $\lceil \log_2 C \rceil$ bits per probe by transmitting only each peer's $\arg\max$ class index, where $C$ is the number of o...
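To make the $\lceil \log_2 C \rceil$ figure concrete, here is a minimal sketch of what sending only an $\arg\max$ class index per probe could look like on the wire. It is not the paper's implementation: the function names, the big-endian bit packing, and the 32-bit-per-value baseline for soft labels are all assumptions made for illustration.

```python
def bits_per_index(num_classes: int) -> int:
    """ceil(log2(C)) bits are enough to name one of C classes."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    # (C - 1).bit_length() equals ceil(log2(C)) for every C >= 1.
    return (num_classes - 1).bit_length()


def bits_per_soft_label(num_classes: int, bits_per_value: int = 32) -> int:
    """Cost of a full soft-label vector (assumed baseline: one float32 per class)."""
    return num_classes * bits_per_value


def argmax(scores: list[float]) -> int:
    """Index of the largest score; ties resolve to the lowest index."""
    return max(range(len(scores)), key=scores.__getitem__)


def pack_indices(indices: list[int], num_classes: int) -> bytes:
    """Pack one class index per probe into a dense big-endian bit string."""
    width = bits_per_index(num_classes)
    acc = 0
    for idx in indices:
        if not 0 <= idx < num_classes:
            raise ValueError(f"class index {idx} out of range")
        acc = (acc << width) | idx
    total_bits = width * len(indices)
    return acc.to_bytes((total_bits + 7) // 8, "big")


def unpack_indices(payload: bytes, count: int, num_classes: int) -> list[int]:
    """Inverse of pack_indices; the receiver must already know count and C."""
    width = bits_per_index(num_classes)
    acc = int.from_bytes(payload, "big")
    mask = (1 << width) - 1
    return [(acc >> (width * (count - 1 - i))) & mask for i in range(count)]


if __name__ == "__main__":
    C = 1000  # hypothetical class count
    peer_scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]  # toy 3-class outputs
    votes = [argmax(s) for s in peer_scores]
    payload = pack_indices(votes, num_classes=3)
    print(votes, unpack_indices(payload, len(votes), num_classes=3))
    print(bits_per_index(C), bits_per_soft_label(C))
```

Under these assumptions, a 1000-class probe costs 10 bits as an index versus 32,000 bits as a float32 soft-label vector. The receiver has to know the probe count and $C$ in advance, since the packed payload carries no framing of its own.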
arXiv cs.LG · Radhakrishna Achanta, Will Reed