// radar de ia

Dados & Embeddings

Papers, modelos e datasets em alta no Hugging Face, além do blog oficial — com leitura editorial em português.

OpenAI says new GPT-5.5-Cyber outperforms Anthropic's Mythos on cybersecurity benchmark
Blog LLMs & Texto

OpenAI says new GPT-5.5-Cyber outperforms Anthropic's Mythos on cybersecurity benchmark

OpenAI is expanding its Daybreak cybersecurity initiative with an updated Codex Security plugin, the full GPT-5.5-Cyber model, and a partner network with more than 25 security firms and several governments. The focus shifts from finding vulnerabilities to patching them automatically. The article OpenAI says new GPT-5.5-Cyber outperforms Anthropic's Mythos on cybersecurity benchmark appeared first on The Decoder .

23.06.2026
Top spy agencies say AI cyber threats will impact you within months. Here’s why
Blog LLMs & Texto

Top spy agencies say AI cyber threats will impact you within months. Here’s why

The global surge in AI cyber threats is no longer a distant problem for corporate data centres, according to an urgent public warning from the world’s most powerful intelligence alliance. On June 22, 2026, the cybersecurity chiefs of the Five Eyes nations—comprising the US, UK, Canada, Australia, and New Zealand—issued a rare joint intelligence briefing stating that upcoming artificial […] The post Top spy agencies say AI cyber threats will impact you within months. Here’s why appeared...

23.06.2026
Blog LLMs & Texto

GLM-5.2 OpenAI-Compatible API: A Hands-On Guide to Reasoning Effort, Function Calling, and Long-Context Retrieval

We build a practical GLM-5.2 workflow using its hosted, OpenAI-compatible API instead of running the model locally. We set up multiple providers, load the API key securely, and create a reusable chat wrapper. We then test thinking-effort control, streamed reasoning, function calling, a tool-using agent, structured JSON output, and long-context retrieval. We close with token and cost accounting so every demo stays measurable. The post GLM-5.2 OpenAI-Compatible API: A Hands-On Guide to Reasoning E...

23.06.2026
Blog LLMs & Texto

Less is More: Lightweight Prompt Compression for Question Answering Applications on Edge Devices

arXiv:2606.20571v1 Announce Type: new Abstract: In agent-driven question answering (QA) applications, retrieval-augmented generation (RAG) is commonly introduced to enhance the response accuracy of large language models (LLMs) by providing additional context. Due to the inherent noise in retrieval results and the coarse granularity of document-level retrieval, the retrieved context often contains substantial redundant information. In this setting, the agent prompt, consisting of the user query a...

23.06.2026
Blog Dados & Embeddings

XmoPipe: A Pipeline for Large-Scale In-the-Wild Human Motion Dataset Construction

arXiv:2606.20731v1 Announce Type: new Abstract: Large-scale human motion datasets are essential for training robust motion models for analysis, synthesis, and understanding. While marker-based motion capture provides precise data, it is costly and limited in scale and diversity. Recent advances in monocular motion capture and video-language understanding open the way to extract plausible motion from unconstrained online videos. We present a scalable pipeline for constructing in-the-wild human mo...

23.06.2026
Blog Dados & Embeddings

Bridging Multi-Valued Heuristics and Dimensionality Reduction in Multi-Objective Search

arXiv:2606.20644v1 Announce Type: new Abstract: Multi-objective shortest-path (MOSP) algorithms traditionally rely on single-valued heuristics (SVHs), which associate each state with a single admissible cost vector. While SVHs provide safe lower bounds, they fail to capture the trade-off structure of the Pareto frontier and often yield weak search guidance. Multi-valued heuristics (MVHs) address this limitation by mapping states to sets of cost estimates, enabling a richer approximation of possi...

23.06.2026
Blog LLMs & Texto

The Metanym Game: A Self-Contained, Self-Consistent LLM Peer-Community Benchmark for Structural Intelligence

arXiv:2606.21008v1 Announce Type: new Abstract: The metanym game is a competitive word game for LLMs that measures structural intelligence against established cognitive-science constructs. No content is given in advance; the contestants create all of it -- a new kind of analogy test, analogical production falsifiable sentence by sentence, with no fixed test set to leak into training (contamination-resistant by construction). In the council-of-peers benchmark, the contestants also rate each other...

23.06.2026
Blog LLMs & Texto

GRAG: Generic Response-Augmented Generation Framework for Personalized Conversational Systems

arXiv:2606.21097v1 Announce Type: new Abstract: Deploying highly capable personalized conversational agents in resource-constrained or privacy-sensitive environments remains a significant challenge. We identify a fundamental bottleneck in the existing approaches: current training paradigms treat personalization and grounding as a single monolithic learning problem. Under these paradigms, language models are forced to simultaneously address what to say (content grounding) and how to say it in a u...

23.06.2026
Blog Robótica & RL

Real-World Deployment of Massively Parallel Sampling-Based MPC for Contact-Rich Manipulation

arXiv:2606.20712v1 Announce Type: new Abstract: Sampling-based Model Predictive Control (SMPC) is a promising strategy for contact-rich robotic manipulation, combining gradient-free optimization with massively parallel GPU simulation. Yet, most prior work relies on simplified dynamics or remains confined to simulation. We present an MPC framework that leverages JAX for large-scale parallelization and efficient computation, coupled with the high-fidelity MuJoCo MJX simulator, and deploy it on a F...

23.06.2026
Blog LLMs & Texto

Short-Term Electricity Demand Forecasting for New England Using a Hybrid Transformer-XGBoost Framework with Weather, Calendar, and COVID-19 Indicators

arXiv:2606.20918v1 Announce Type: new Abstract: Accurate short-term electricity demand forecasting is critical for reliable power system operation, energy market planning, and infrastructure optimization. This paper presents a hybrid framework combining a Transformer encoder for temporal feature extraction with gradient-boosted decision trees (XGBoost) for daily electricity demand forecasting across New England. The framework integrates meteorological observations from six cities spanning all si...

23.06.2026
Blog Robótica & RL

R2HandoverSim: A Simulation Framework and Benchmark for Robot-to-Human Object Handovers

arXiv:2606.21011v1 Announce Type: new Abstract: We present R2HandoverSim, a simulation benchmark for robot-to-human (R2H) object handovers. Although R2H handover methods have advanced rapidly, the lack of standardized evaluation protocols impedes objective comparison. Our benchmark enables reproducible evaluation by systematically comparing four baselines on their predicted shared grasp poses. We conduct a user study with 30 participants, analyze baseline performance, and show that simulation re...

23.06.2026
1 / 10 próxima →
119 itens no radar