Blog Dados & Embeddings

TallyTrain: Communication-Efficient Federated Distillation

arXiv:2607.00173v1. Abstract: Federated learning is bandwidth-bound on two orthogonal axes: model size, which limits how often parameter-averaging methods can afford to merge, and class count, which makes per-probe soft-label distillation prohibitive at large vocabularies. Both ceilings tighten as modern systems scale. We collapse the class-count axis to $\lceil \log_2 C \rceil$ bits per probe by transmitting only each peer's $\arg\max$ class index, where $C$ is the number of o...
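To make the $\lceil \log_2 C \rceil$ figure concrete, here is a minimal Python sketch of the bit arithmetic only. It is not the authors' implementation: the function names, the fixed-width bit-packing scheme, and the 32-bit-per-class baseline for soft labels are all assumptions chosen for illustration.

```python
def bits_per_probe_argmax(num_classes: int) -> int:
    """Bits needed to encode one class index out of num_classes, i.e. ceil(log2 C)."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    # (C - 1).bit_length() equals ceil(log2 C) for C >= 1, with no float rounding.
    return (num_classes - 1).bit_length()


def bits_per_probe_soft(num_classes: int, bits_per_value: int = 32) -> int:
    """Bits for a full soft-label vector (assumption: one float32 per class)."""
    return num_classes * bits_per_value


def argmax_index(scores: list[float]) -> int:
    """Index of the largest score; ties resolve to the lowest index."""
    return max(range(len(scores)), key=lambda i: scores[i])


def pack_indices(indices: list[int], num_classes: int) -> bytes:
    """Pack class indices into a fixed-width bitstream (hypothetical wire format)."""
    width = bits_per_probe_argmax(num_classes)
    packed = 0
    for idx in indices:
        if not 0 <= idx < num_classes:
            raise ValueError(f"index {idx} out of range")
        packed = (packed << width) | idx
    total_bits = width * len(indices)
    return packed.to_bytes((total_bits + 7) // 8, "big")


def unpack_indices(payload: bytes, count: int, num_classes: int) -> list[int]:
    """Inverse of pack_indices: recover `count` class indices."""
    width = bits_per_probe_argmax(num_classes)
    packed = int.from_bytes(payload, "big")
    mask = (1 << width) - 1
    return [(packed >> (width * (count - 1 - i))) & mask for i in range(count)]


if __name__ == "__main__":
    C = 1000
    print(bits_per_probe_argmax(C), "bits per probe (argmax index)")
    print(bits_per_probe_soft(C), "bits per probe (float32 soft labels)")
```

Under these assumptions, $C = 1000$ classes costs 10 bits per probe for the index versus 32,000 bits for a float32 soft-label vector, a 3,200-fold reduction on the class-count axis.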

arXiv cs.LG · Radhakrishna Achanta, Will Reed · January 2, 2026



Read also

A Filtered Mixture-of-Generators for Fully Synthetic Survival Training

PHREEQC-MCQ-200: A Diagnostic Benchmark for Tool-Augmented Scientific Simulator Agents

Device Passport: Enabling Spatio-Temporal Pretrained Models to Generalize Across Input Layouts

Leveraging Phase Information to Boost Unrolled Network Learning for Image Deblurring