Building a Multimodal Dataset of Academic Paper for Keyword Extraction
arXiv:2606.31069v1 Announce Type: new Abstract: Up to this point, the keyword extraction task has typically relied solely on textual data. Neglecting visual details and audio features from the image and audio modalities leads to deficiencies in information richness and overlooks potential correlations, thereby constraining both the model's ability to learn representations of the data and the accuracy of its predictions. Furthermore, the currently available multimodal datasets for the keyword extraction task are pa...
arXiv cs.CL
Jingyu Zhang, Xinyi Yan, Yi Xiang, Yingyi Zhang, Chengzhi Zhang