A Single Stepsize Suffices for Unprojected Linear TD(0): Simultaneous Robust and Fast Rates via Polyak-Ruppert Averaging
arXiv:2606.24981v1. Abstract: We study linear TD(0) under Markovian sampling, where data are generated along a single trajectory. We provide high-probability guarantees for a plain unprojected TD(0) algorithm with Polyak-Ruppert (PR) averaging, using a single stepsize schedule $\eta_t \propto \frac{1}{\tau_{\mathrm{mix}}\log(t)\sqrt{t}}$ that depends on the mixing time but requires no prior knowledge of the curvature parameter $\omega$. Our first result shows that such a choice...
arXiv cs.LG · Wei-Cheng Lee, Francesco Orabona
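To make the setup in the abstract concrete, here is a minimal sketch of unprojected linear TD(0) with Polyak-Ruppert averaging, run along a single trajectory of a Markov chain. It is an illustration under stated assumptions, not the authors' implementation: the function name `td0_polyak_ruppert`, the constant `c`, the toy two-state chain, and the value `tau_mix=2.0` are all hypothetical, and the schedule uses `log(t + 1)` instead of `log(t)` only so that the first step is well defined.

```python
import math
import random


def td0_polyak_ruppert(P, rewards, features, gamma, tau_mix, num_steps, c=1.0, seed=0):
    """Unprojected linear TD(0) on one trajectory, with a running average of the iterates.

    P        : row-stochastic transition matrix (list of lists)
    rewards  : reward received in each state
    features : feature vector phi(s) for each state
    tau_mix  : assumed mixing-time estimate (hypothetical input)
    c        : assumed proportionality constant for the stepsize
    """
    rng = random.Random(seed)
    states = range(len(P))
    d = len(features[0])
    theta = [0.0] * d      # last iterate
    theta_avg = [0.0] * d  # Polyak-Ruppert average of theta_1..theta_t
    s = 0
    for t in range(1, num_steps + 1):
        s_next = rng.choices(states, weights=P[s])[0]
        phi, phi_next = features[s], features[s_next]
        v = sum(a * b for a, b in zip(theta, phi))
        v_next = sum(a * b for a, b in zip(theta, phi_next))
        td_error = rewards[s] + gamma * v_next - v
        # Stepsize proportional to 1 / (tau_mix * log(t) * sqrt(t));
        # log(t + 1) is an assumption to avoid dividing by log(1) = 0.
        eta = c / (tau_mix * math.log(t + 1) * math.sqrt(t))
        # Plain update: no projection onto a bounded set.
        theta = [th + eta * td_error * p for th, p in zip(theta, phi)]
        # Incremental running mean of the iterates.
        theta_avg = [avg + (th - avg) / t for avg, th in zip(theta_avg, theta)]
        s = s_next
    return theta, theta_avg


# Toy example (hypothetical): two states, tabular features, gamma = 0.5.
# Solving V = r + gamma * P V by hand gives V = [1.5, 0.5] for this chain.
P = [[0.5, 0.5], [0.5, 0.5]]
rewards = [1.0, 0.0]
features = [[1.0, 0.0], [0.0, 1.0]]
theta_last, theta_avg = td0_polyak_ruppert(
    P, rewards, features, gamma=0.5, tau_mix=2.0, num_steps=200_000
)
print("last iterate:", theta_last)
print("PR average:  ", theta_avg)
```

Note that the schedule takes only the mixing-time estimate as input; nothing in it depends on the curvature parameter $\omega$, which is the point the abstract emphasizes.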