Blog
Áudio & Voz
Graph-Based Phonetic Error Correction of Noisy ASR
arXiv:2606.24889v1 Announce Type: new Abstract: Automatic speech recognition (ASR) systems, despite low overall word error rates, produce residual lexical errors that disproportionately affect semantically critical tokens such as named entities, negations, and sentiment-bearing words. These errors are often structured, arising from phonetic similarity rather than random noise, making naive token-level correction insufficient. We propose a structured ASR correction framework, that we call G-SPIN,...
arXiv cs.CL
·Pratik Rakesh Singh, Mohammadi Zaki, Aneesh Mukkamala, Pankaj Wasnik
·
// relacionados
Leia também
Editorial
LTX-2: o primeiro modelo fundacional de vídeo e áudio em conjunto — aberto, com 19B de parâmetros
Blog
How Reliable Is Your Jailbreak Judge? Calibration and Adversarial Robustness of Automated ASR Scoring
Blog
Noise-Aware Boundary-Enhanced Generative Learning for Ultrasound Speckle Reduction
Blog