Radar de IA: News, Models, and Papers

OpenBioRQ: Unsolved Biomedical Research Questions for Agents

A new biomedical benchmark evaluates agentic models' ability to verify sources and avoid false citations by testing unsolved research questions with no answer keys, revealing signi…

20.06.2026 ·▲ 3

Paper LLMs & Text

Deeper is Not Always Better: Mitigating the Alignment Tax via Confident Layer Decoding

Autoregressive generation in large language models traditionally uses the final layer for token prediction, but a new decoding strategy dynamically selects more reliable intermedia…

20.06.2026 ·▲ 9

Paper Multimodal

BioMatrix: Towards a Comprehensive Biological Foundation Model Spanning the Modality Matrix of Sequences, Structures, and Language

BioMatrix is a novel multimodal foundation model that integrates molecular sequences, structures, and natural language into a unified decoder-only architecture for diverse biologic…

20.06.2026 ·▲ 17

Paper Computer Vision

Multi4D: High-Fidelity Dynamic Gaussian Splatting via Multi-Level Competitive Allocation

Multi4D addresses the trade-off between motion consistency and visual fidelity in dynamic 3D Gaussian splatting through a multi-level competitive allocation framework that enables…

20.06.2026

Paper Robotics & RL

EBench: Elemental Diagnosis of Generalist Mobile Manipulation Policies

EBench is a comprehensive simulation benchmark for evaluating generalist mobile manipulation policies across diverse tasks and dimensions, revealing distinct capability profiles an…

20.06.2026 ·▲ 13

Paper LLMs & Text

A Verifiable Search Is Not a Learnable Chain-of-Thought

Training models on chain-of-thought demonstrations fails for tasks requiring backtracking search because the forward derivation cannot be faithfully imitated, demonstrating a funda…

20.06.2026 ·▲ 2

Model Audio & Speech

owensong/Inflect-Nano-v1

Speech synthesis model: 0 downloads and 207 likes on Hugging Face.

19.06.2026

Blog Robotics & RL

From PGP to Mythos: a brief history of export controls that didn’t stop anyone

For the last 30 years, stopping the flow of cybersecurity-related software has proven to be ineffective. It's unclear why it would work now with Anthropic’s cybersecurity model Mythos.

19.06.2026

Model LLMs & Text

WeiboAI/VibeThinker-3B

Text generation model · 3B parameters: 67.8K downloads and 756 likes on Hugging Face.

19.06.2026 ·↓ 67777

Dataset LLMs & Text

armand0e/claude-fable-5-claude-code

Featured dataset on Hugging Face: 13.0K downloads. claude-fable-5 Agent Traces. It's worth noting that our team was working with Glint-Research to collect as much fable data as possible.

19.06.2026 ·↓ 12969

Blog LLMs & Text

Is the US government’s Anthropic ban accidentally helping the brand?

Just as last week was ending, the US government forced Anthropic to pull its two newest models, Fable 5 and Mythos 5, citing national security concerns after Amazon researchers allegedly found a way to bypass Fable 5’s guardrails. Cybersecurity researchers have since signed an open letter calling the move dangerous, and Anthropic itself noted the same jailbreaks exist in other models. So is […]

19.06.2026

Blog LLMs & Text

The US banned Anthropic’s Fable 5 release, but the numbers don’t seem to care

Just as last week was ending, the US government forced Anthropic to pull its two newest models, Fable 5 and Mythos 5, citing national security concerns after Amazon researchers allegedly found a way to bypass Fable 5’s guardrails. Cybersecurity researchers have since signed an open letter calling the move dangerous, and Anthropic itself noted the same jailbreaks exist in other models. So is […]

19.06.2026
